Key enzymes involved in NGS library construction

High-throughput sequencing, often referred to as next-generation sequencing (NGS) technology, represents a significant leap forward from the initial DNA sequencing methods, such as Sanger sequencing. NGS allows for the simultaneous profiling of hundreds of thousands, if not millions, of nucleic acid molecule sequences. Its merits include exceptional throughput, cost-effectiveness, scalability, and a broad spectrum of applications, establishing it as the predominant sequencing technology worldwide.

The NGS workflow comprises four main phases: sample preparation, library construction, sequencing, and data analysis. The core of library construction is attaching standardized, platform-specific adapter sequences to both ends of fragmented genomic DNA, then using PCR amplification to generate enough library molecules for sequencing on the NGS instrument. Depending on the sample type, NGS library construction is divided into DNA library construction and RNA library construction. Enzymes play a pivotal role throughout these interlinked steps. So, which key enzymes are involved in library construction?

Figure 1. Next-generation sequencing workflow[2]

 

1. DNA library construction and its key enzymes
2. RNA library construction and its key enzymes
3. Guideline for NGS core enzymes in DNA & RNA library construction

 

1. DNA library construction and its key enzymes

In DNA library construction, adapter ligation by TA cloning is currently the most widely used approach. The main steps are as follows:

Figure 2. DNA library construction process (Illumina)

1.1 DNA Fragmentation

Current sequencers typically have read lengths in the range of 150-500 base pairs (bp). Large genomic DNA must therefore be broken into smaller fragments by mechanical or enzymatic methods. Mechanical fragmentation can cause relatively high sample loss and involves a more complicated procedure. Enzymatic digestion, by contrast, is a commonly used method that is more cost-effective and straightforward: the reaction only requires incubation for a set time after the fragmentation enzyme is added.
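To make the size constraint concrete, the following is a minimal sketch that cuts a random sequence into consecutive pieces of 150-500 bp. It is a toy model only: the function name `fragment`, the uniform length distribution, and the random "genome" are assumptions made for illustration, and real mechanical or enzymatic fragmentation does not cut at uniformly chosen positions.

```python
import random

def fragment(sequence, min_len=150, max_len=500, seed=0):
    """Cut `sequence` into consecutive pieces whose lengths are drawn
    uniformly from [min_len, max_len]; the last piece may be shorter."""
    rng = random.Random(seed)
    fragments = []
    pos = 0
    while pos < len(sequence):
        size = rng.randint(min_len, max_len)  # inclusive on both ends
        fragments.append(sequence[pos:pos + size])
        pos += size
    return fragments

# Hypothetical 10 kb "genome" made of random bases.
base_rng = random.Random(42)
genome = "".join(base_rng.choice("ACGT") for _ in range(10_000))

pieces = fragment(genome)
lengths = [len(p) for p in pieces]
print(len(pieces), "fragments, lengths from", min(lengths), "to", max(lengths))
```

Joining the pieces back together reproduces the input, which is a quick check that no bases are lost or duplicated in this model.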

At present, two main types of fragmentation enzymes are in use. One relies on Tn5 transposase and is based on transposon principles, while the other uses a mixture of endonucleases. However, the performance of these enzymes can be affected by the GC content and base preferences of the DNA. In contrast, the fragmentation enzyme developed by Yeasen (Cat#12917) provides stable digestion and shows significantly lower site preference than Tn5 transposase. It consistently yields excellent sequencing results for various types of DNA samples, including FFPE samples.

1.2 End repair, dA-Tailing

Fragmentation produces a mixture of blunt ends and sticky ends with 5' or 3' overhangs. All sticky ends must be converted to blunt ends: 3' overhangs are removed and 5' overhangs are filled in. For adapter ligation by TA ligation, the DNA fragment must also be phosphorylated at the 5' end and carry an "A" at the 3' end, which pairs with the "T" overhang of the adapter. These steps are carried out jointly by T4 DNA polymerase, T4 polynucleotide kinase, and Taq DNA polymerase.
T4 DNA polymerase (Cat#12901) has 5'→3' DNA polymerase activity, which catalyzes DNA synthesis in the 5'→3' direction and fills in 5' overhangs. The enzyme also has 3'→5' exonuclease activity, which removes 3' overhangs. Together, these activities convert DNA fragments with sticky ends into blunt-end DNA.
Because the 5' ends of synthetic PCR primers and adapters usually carry hydroxyl groups instead of phosphate groups, T4 polynucleotide kinase (Cat#12902) is required. In the presence of ATP, it catalyzes the transfer of the γ-phosphate of ATP to the 5'-hydroxyl end of the oligonucleotide chain, in preparation for adapter ligation.
S-Taq DNA polymerase (Cat#13486) has 5'→3' polymerase activity and synthesizes DNA in the 5'→3' direction. It also has template-independent deoxynucleotidyl transferase activity, which adds a single "A" nucleotide to the 3' end of a DNA fragment.
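The logic of end repair and dA-tailing can be sketched as string operations on a toy duplex. The representation here is an assumption made for illustration: the top strand is written 5'→3', the bottom strand is written 3'→5' directly beneath it, and "-" marks a position where a strand has no base. The function names `end_repair` and `da_tail` are hypothetical, the duplex is assumed to have no internal gaps, and 5' phosphorylation by T4 polynucleotide kinase is not modeled.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def end_repair(top, bottom):
    """Blunt a duplex: fill in 5' overhangs, remove 3' overhangs.

    `top` reads 5'->3' and `bottom` reads 3'->5'; both have equal length
    and use '-' where a strand is missing. Assumes no internal gaps.
    """
    paired = [i for i, (t, b) in enumerate(zip(top, bottom))
              if t != "-" and b != "-"]
    first, last = paired[0], paired[-1]
    new_top, new_bottom = [], []
    for i, (t, b) in enumerate(zip(top, bottom)):
        if t != "-" and b != "-":            # already base-paired
            new_top.append(t)
            new_bottom.append(b)
        elif i < first and b == "-":         # top 5' overhang: fill in bottom
            new_top.append(t)
            new_bottom.append(COMPLEMENT[t])
        elif i > last and t == "-":          # bottom 5' overhang: fill in top
            new_top.append(COMPLEMENT[b])
            new_bottom.append(b)
        # any other unpaired column is a 3' overhang and is dropped
    return "".join(new_top), "".join(new_bottom)

def da_tail(top, bottom):
    """Add one 'A' to the 3' end of each strand of a blunt duplex."""
    return "-" + top + "A", "A" + bottom + "-"

# 5' overhangs on both ends are filled in.
blunt = end_repair("GATTACAGGC--", "--AATGTCCGTA")
print(blunt)           # ('GATTACAGGCAT', 'CTAATGTCCGTA')

# 3' overhangs on both ends are removed.
print(end_repair("--TTACAGGCAA", "GGAATGTCCG--"))  # ('TTACAGGC', 'AATGTCCG')

print(da_tail(*blunt))  # ('-GATTACAGGCATA', 'ACTAATGTCCGTA-')
```

After `da_tail`, each strand carries a single unpaired "A" at its 3' end, which is the overhang that pairs with the "T" overhang of the adapter.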

Figure 3. Multiple enzymes are involved in the end repair process

Figure 4. Capillary electrophoresis shows that S-Taq adds "A" with very high efficiency to gene fragments whose 3' ends terminate in any of the four bases (A, T, C, or G).

1.3 Adapter ligation

Adapters constitute a crucial component of the library. Within the realm of Illumina sequencing, the commonly employed Y-type adapters encompass P5/P7, Index, and Rd1/Rd2 SP sequences. Among these, the P5/P7 sequence serves the purpose of pairing with the sequence present on the sequencing chip, thereby anchoring the fragments to be analyzed onto the flow cell to execute bridge amplification. The Index sequence is utilized for distinguishing between different samples within the mixed library subjected to sequencing, while Rd1/Rd2 SP denote the regions for binding the Read1 and Read2 sequencing primers.
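As a rough illustration of how these adapter elements fit together, the sketch below assembles one library molecule and then sorts index reads by sample. Everything specific in it is an assumption: the sequences are short placeholders rather than real Illumina adapter sequences, the single-index layout (P5, Rd1 SP, insert, Rd2 SP, Index, P7) is a simplification shown on one strand only, and the function names and sample sheet are hypothetical.

```python
from collections import defaultdict

# Placeholder sequences for illustration only. They are NOT the real
# Illumina P5/P7 or sequencing-primer binding sequences.
P5 = "AAAACCCC"
P7 = "GGGGTTTT"
RD1_SP = "ACACACAC"
RD2_SP = "TGTGTGTG"

def build_library_molecule(insert, index):
    """Assumed single-index layout: P5, Rd1 SP, insert, Rd2 SP, Index, P7."""
    return P5 + RD1_SP + insert + RD2_SP + index + P7

def demultiplex(index_reads, sample_sheet):
    """Assign each read ID to a sample by exact match of its index read."""
    bins = defaultdict(list)
    for read_id, index_seq in index_reads:
        bins[sample_sheet.get(index_seq, "undetermined")].append(read_id)
    return dict(bins)

# Hypothetical sample sheet mapping index sequence to sample name.
sample_sheet = {"ATCACG": "sample_A", "CGATGT": "sample_B"}

molecule = build_library_molecule("TTGACCA", "ATCACG")
print(molecule)

index_reads = [("read1", "ATCACG"), ("read2", "CGATGT"), ("read3", "NNNNNN")]
binned = demultiplex(index_reads, sample_sheet)
print(binned)
```

Exact matching is the simplest possible rule; reads whose index matches no sample end up in an "undetermined" bin rather than being assigned by guesswork.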

T4 DNA ligase (Cat#12996) is the standard choice for adapter ligation. It joins adjacent nucleotides and seals single-stranded nicks in double-stranded DNA by catalyzing phosphodiester bond formation between a 5'-phosphate and a 3'-hydroxyl group.

Figure 5. General adapter ligation process (Illumina)

Figure 6. Verification of T4 DNA Ligase Mutants by Ligating 170-bp DNA with 80-bp Adapters.

1.4 PCR amplification

PCR is used to obtain enough adapter-ligated DNA for the sample to be sequenced on the instrument. Hieff Canace™ Pro High-Fidelity DNA Polymerase (Cat#13476), which is commonly used for this PCR, has 5'→3' polymerase activity and synthesizes DNA in the 5'→3' direction. It also has 3'→5' exonuclease activity, which corrects misincorporated bases during amplification, so DNA fragments are amplified rapidly and with high fidelity.
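How many cycles are "enough" can be estimated with the standard idealized model of exponential amplification, in which the amount of product after n cycles is the starting amount multiplied by (1 + E)^n, where E is the per-cycle efficiency (1.0 means perfect doubling). The sketch below applies that model. The function names, the input amounts, and the efficiency values are assumptions chosen for illustration; real reactions plateau, so this is only a planning estimate.

```python
import math

def pcr_yield(start_amount, cycles, efficiency=1.0):
    """Idealized product amount after `cycles` cycles of PCR."""
    return start_amount * (1 + efficiency) ** cycles

def cycles_needed(start_amount, target_amount, efficiency=1.0):
    """Smallest whole number of cycles that reaches `target_amount`."""
    return math.ceil(math.log(target_amount / start_amount)
                     / math.log(1 + efficiency))

# Hypothetical example: 10 ng of adapter-ligated DNA, 500 ng wanted.
print(cycles_needed(10, 500, efficiency=1.0))  # 6
print(cycles_needed(10, 500, efficiency=0.9))  # 7
print(pcr_yield(10, 6))                        # 640.0
```

Keeping the cycle number as low as the target allows is generally preferred in library preparation, since every extra cycle adds duplicates and amplification bias.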

 

2. RNA library construction and its key enzymes

Depending on the type of RNA, RNA libraries can be divided into mRNA libraries, lncRNA libraries, and so on. Conventional RNA library construction includes the following steps:

Figure 7. mRNA library construction process (Illumina)

2.1 RNA enrichment

Whether dealing with eukaryotes or prokaryotes, ribosomal RNA (rRNA) stands out as the most abundant RNA, constituting up to 80% of the total RNA content. When sequencing the total RNA of a sample directly, a substantial portion of the sequencing data will be related to rRNA. To mitigate this interference, the method of RNA enrichment must be employed. There are two primary methods for this: mRNA enrichment based on oligo-dT and rRNA depletion methods.
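A short calculation shows why this matters for sequencing depth. The sketch below assumes that the fraction of reads mapping to rRNA equals the rRNA fraction of the input RNA, which is a simplification; the function names and read counts are hypothetical.

```python
def informative_reads(total_reads, rrna_percent):
    """Reads left for non-rRNA transcripts (integer arithmetic)."""
    return total_reads * (100 - rrna_percent) // 100

def total_reads_required(target_informative, rrna_percent):
    """Total reads needed to obtain the target number of non-rRNA reads."""
    non_rrna_percent = 100 - rrna_percent
    # Ceiling division without floating point.
    return -(-target_informative * 100 // non_rrna_percent)

# Without enrichment, at 80% rRNA, only a fifth of the reads are informative.
print(informative_reads(20_000_000, 80))     # 4000000
print(total_reads_required(20_000_000, 80))  # 100000000

# If depletion brought rRNA down to 5% (a hypothetical figure):
print(total_reads_required(20_000_000, 5))   # 21052632
```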

In eukaryotes, mRNA exhibits a distinct poly(A) structure at the 3' end. Oligo-dT beads can be employed to capture all mRNA transcribed from the sample, making it suitable for transcriptional analysis, especially with high-quality RNA samples. On the other hand, rRNA depletion methods have more lenient requirements on sample quality and can be applied to both low-quality samples (e.g., FFPE samples) and high-quality RNA samples, as well as prokaryotic samples. The commonly used commercial approach involves the use of RNase H digestion to remove rRNA, following these specific steps:

  1. Synthesize specific oligonucleotide probes designed to bind to rRNA.
  2. Employ RNase H (Cat#12906), which is capable of degrading RNA in the RNA-DNA hybrid strand, to selectively remove the rRNA bound to the probes.
  3. Finally, digest the DNA probes with DNase I (Cat#10325), which can degrade both single- and double-stranded DNA, leaving RNA that is depleted of rRNA.
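The probe design in step 1 comes down to base complementarity: each DNA probe must be the reverse complement of the rRNA stretch it targets, so that the two form the RNA-DNA hybrid that RNase H recognizes. The following minimal sketch tiles a sequence with antisense DNA probes. The function names, the toy rRNA sequence, and the probe length of 8 are assumptions for illustration; real probes are considerably longer and are designed against full rRNA sequences.

```python
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}

def antisense_dna_probe(rna_segment):
    """DNA oligo (5'->3') that is the reverse complement of an RNA
    segment (5'->3')."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna_segment))

def tile_probes(rrna, probe_len):
    """Cover `rrna` end to end with non-overlapping antisense probes."""
    return [antisense_dna_probe(rrna[i:i + probe_len])
            for i in range(0, len(rrna), probe_len)]

toy_rrna = "GGUUGAUCCUGCCAGU"  # hypothetical 16-nt stretch
probes = tile_probes(toy_rrna, probe_len=8)
print(probes)  # ['GATCAACC', 'ACTGGCAG']
```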

Figure 8. Schematic diagram of enzyme-based rRNA depletion[5]

2.2 RNA Fragmentation

RNA is usually fragmented by heating in the presence of divalent metal cations, which breaks large RNA molecules into small fragments.

2.3 1st strand cDNA synthesis

In this step, the target RNA is reverse transcribed into the first strand of cDNA. Because RNA is easily degraded by RNases present in the environment, an RNase inhibitor (Cat#14672) is added during reverse transcription to inhibit these enzymes and protect the RNA from degradation. Reverse transcriptase (Cat#11112) then reverse transcribes the template RNA into cDNA. The reverse transcriptase has RNA-dependent DNA polymerase activity and uses RNA as a template to synthesize cDNA in the 5'→3' direction. The resulting single strand of DNA is complementary to the RNA template.

Adding actinomycin D during 1st strand cDNA synthesis improves the construction of strand-specific libraries by significantly enhancing strand specificity, and it has simplified the experimental workflow for researchers.

However, actinomycin D has drawbacks: it is toxic and must be protected from light. With growing demand for premixed and plate-format library construction kits, the need to shield reagents from light limits the development of plate-format kits.

To address this, the Yeasen ZymeEditor platform has introduced an MMLV enzyme mutant that replaces the function of actinomycin D. A new kit (Cat: 12340ES) based on this mutant is odorless and non-toxic and does not need to be protected from light. It offers superior strand specificity while eliminating concerns about health and light sensitivity.

Figure 9. Engineering of MMLV to identify MMLV mutants that could contribute to stranded RNA-seq

 

2.4 2nd strand cDNA synthesis

The single-strand cDNA produced through reverse transcription is highly unstable, necessitating the immediate synthesis of the second strand of cDNA under the influence of DNA polymerase I. During this second strand synthesis, RNase H comes into play by removing the RNA strand from the RNA-DNA hybrid structure. It works in concert with DNA polymerase I (Cat#12903) to facilitate the catalytic synthesis of the complementary second strand of cDNA. DNA polymerase I possesses 5'→3' DNA polymerase activity and, guided by a template and primer, synthesizes a sequence that complements the single-strand cDNA in the 5'→3' direction.
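The sequence relationship between the RNA template and the two cDNA strands can be sketched in a few lines. The first strand is the reverse complement of the RNA, and the second strand is the reverse complement of the first, so it carries the same sequence as the original RNA with T in place of U. The function names and the example RNA are hypothetical, and the sketch ignores priming, RNase H nicking, and enzyme kinetics.

```python
RNA_TO_CDNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def first_strand_cdna(rna):
    """First-strand cDNA (5'->3'): reverse complement of the RNA template."""
    return "".join(RNA_TO_CDNA[base] for base in reversed(rna))

def second_strand_cdna(first_strand):
    """Second-strand cDNA (5'->3'): reverse complement of the first strand."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(first_strand))

rna = "AUGGCCAUUGUAAUGGGCCGC"  # hypothetical RNA fragment, 5'->3'
first = first_strand_cdna(rna)
second = second_strand_cdna(first)

print(first)
print(second)
# The second strand matches the RNA sequence with T instead of U.
print(second == rna.replace("U", "T"))  # True
```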

The subsequent steps in the process include end repair, dA-Tailing, adapter ligation, and PCR amplification, all of which are detailed in the DNA library construction procedure and need not be reiterated here. It's worth noting that once reverse transcription is completed, there's no need for further fragmentation of the nucleic acid fragment.

3. Guideline for NGS core enzymes in DNA & RNA library construction

Yeasen is a biotechnology company engaged in the research, development, production, and sales of three major biological reagents: molecules, proteins, and cells. Yeasen Biotech company produces a variety of enzymes related to NGS library construction. You can choose the most suitable library construction product from the chart below.

Table 1. Guideline for NGS core enzymes in DNA & RNA library construction

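| Type | Product positioning | Product name | Cat# |
|---|---|---|---|
| RNA library construction | rRNA depletion/2nd strand cDNA synthesis | RNase H | 12906ES |
| RNA library construction | rRNA depletion | Recombinant DNase I | 10325ES |
| RNA library construction | 1st strand cDNA synthesis | Murine RNase Inhibitor | 14672ES |
| RNA library construction | 1st strand cDNA synthesis | Hifair™ IV Reverse Transcriptase | 11112ES |
| RNA library construction | 2nd strand cDNA synthesis | DNA polymerase I | 12903ES |
| RNA library construction & DNA library construction | End repair | T4 DNA Polymerase | 12901ES |
| RNA library construction & DNA library construction | End repair | T4 Polynucleotide Kinase | 12902ES |
| RNA library construction & DNA library construction | dA-Tailing | S-Taq DNA Polymerase | 13486ES |
| RNA library construction & DNA library construction | Adapter ligation | Quick T4 DNA Ligase | 10301ES |
| RNA library construction & DNA library construction | PCR amplification | 2×Super Canace® II High-Fidelity Mix for Library Amplification | 12621ES |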

 

Table 2. DNA & RNA Library Prep Kit

| Type | Name | Cat# | Notes |
|---|---|---|---|
| DNA | Hieff NGS DNA Library Prep Kit | 13577ES | Tumor / Mechanical method |
| DNA | Hieff NGS OnePot Pro DNA Library Prep Kit V2 | 12194ES | Tumor / Enzymatic method |
| DNA | Hieff NGS OnePot II DNA Library Prep Kit for Illumina | 13490ES | Pathogen / Enzymatic / regular time (140 min) |
| DNA | Hieff NGS OnePot Flash DNA Library Prep Kit | 12316ES | Pathogen / Enzymatic / Ultrafast (100 min) |
| DNA | Hieff NGS DNA&RNA Library Co-Prep Kit V2 | 12305ES | Pathogen / Enzymatic / DNA & RNA Co-Prep |
| RNA | Hieff NGS Ultima Dual-mode mRNA Library Prep Kit | 12308ES | Without oligo dT magnetic beads, 11 tubes |
| RNA | Hieff NGS Ultima Dual-mode mRNA Library Prep Kit | 12309ES | Oligo dT magnetic beads plus, 14 tubes |
| RNA | Hieff NGS® Ultima Dual-mode RNA Library Prep Kit | 12310ES | Premixed version, 5 tubes |
| RNA | Hieff NGS® EvoMax RNA Library Prep Kit (Premixed version) (Actinomycin D Free) | 12340ES | Premixed version (Actinomycin D Free) |
| RNA | Hieff NGS® MaxUp rRNA Depletion Kit (Plant) | 12254ES | Plant |
| RNA | Hieff NGS® MaxUp Human rRNA Depletion Kit (rRNA & ITS/ETS) | 12257ES | Human |

 

References:

[1] Mardis E R. Next-Generation Sequencing Platforms[J]. Annual Review of Analytical Chemistry, 2013, 6(1):287-303.
[2] Gulilat M, Lamb T, Teft W A, et al. Targeted next generation sequencing as a tool for precision medicine[J]. BMC Medical Genomics, 2019, 12(1):81.
[3] Lundberg K S, Shoemaker D D, Adams M W, et al. High-fidelity amplification using a thermostable DNA polymerase isolated from Pyrococcus furiosus[J]. Gene, 1991, 108(1):1-6.
[4] Miyazaki K. Random DNA fragmentation with endonuclease V: application to DNA shuffling[J]. Nucleic Acids Research, 2002, 30(24):e139.
[5] Baldwin A, Morris A R, Mukherjee N. An Easy, Cost-Effective, and Scalable Method to Deplete Human Ribosomal RNA for RNA-seq[J]. Current Protocols, 2021, 1(6):e176.