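MNase-seq, short for micrococcal nuclease digestion with deep sequencing, is a molecular biology technique that has been used since 2008 to measure nucleosome occupancy in the human genome. The name "MNase-seq" was not coined until 2009. Briefly, the technique relies on micrococcal nuclease, a non-specific endo-exonuclease derived from the bacterium Staphylococcus aureus, to bind and cleave regions of DNA on chromatin that are not bound by protein. The enzyme does not digest DNA that is bound to histones or other chromatin-bound proteins (e.g. transcription factors). The uncut DNA is then purified from the proteins and sequenced through one or more next-generation sequencing methods.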
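MNase-seq is one of four classes of methods used to assess the status of the epigenome through analysis of chromatin accessibility. The other three techniques are DNase-seq, FAIRE-seq, and ATAC-seq. MNase-seq is primarily used to sequence regions of DNA bound by histones or other chromatin-bound proteins, whereas the other three are commonly used, respectively, to map deoxyribonuclease I hypersensitive sites (DHSs), to sequence DNA that is not bound by chromatin proteins, and to sequence regions of loosely packaged chromatin (assayed through the transposition of markers).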
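Micrococcal nuclease was first discovered in Staphylococcus aureus in 1956; the protein was crystallized in 1966 and characterized in 1967. MNase digestion of chromatin was key to early studies of chromatin structure, where it was used to determine that each nucleosomal unit of chromatin is composed of approximately 200bp of DNA. This, alongside Olins and Olins' "beads on a string" model, confirmed Kornberg's ideas regarding the basic structure of chromatin. Further studies found that MNase could not degrade histone-bound DNA shorter than ~140bp, whereas DNase I and II could degrade the bound DNA to as low as 10bp. This ultimately elucidated that ~146bp of DNA wraps around the nucleosome core, that ~50bp of linker DNA connects each nucleosome, and that stretches of 10 continuous base pairs of DNA bind tightly to the nucleosome core at intervals.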
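In addition to its use in studying chromatin structure, micrococcal nuclease digestion has been used in oligonucleotide sequencing experiments since its characterization in 1967. Because MNase preferentially digests adenine- and thymine-rich regions, the technique was also used to analyze chromatin-free sequences, such as yeast (Saccharomyces cerevisiae) mitochondrial DNA and bacteriophage DNA. In the early 1980s, MNase digestion was used to determine nucleosomal phasing and the associated DNA for chromosomes from mature SV40, fruit flies (Drosophila melanogaster), yeast, and monkeys, among others. The first use of this digestion to study the relevance of chromatin accessibility to gene expression in humans came in 1985. In that study, the nuclease was used to detect the association of certain oncogenic sequences with chromatin and nuclear proteins. Studies that used MNase digestion to determine nucleosome positioning without sequencing or array information continued into the early 2000s.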
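With the advent of whole-genome sequencing in the late 1990s and early 2000s, it became possible to compare purified DNA sequences with the eukaryotic genomes of Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Arabidopsis thaliana, mouse, and human. MNase digestion was first applied to genome-wide nucleosome occupancy studies in S. cerevisiae and C. elegans. After MNase digestion, the samples were analyzed with microarrays to determine which DNA regions were enriched in MNase-resistant nucleosomes. MNase-based microarray analyses were commonly used at genome-wide scale in yeast and in limited genomic regions in humans to determine nucleosome positioning, which could be used to infer transcriptional inactivation.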
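In 2008, around the time next-generation sequencing was being developed, MNase digestion was combined with high-throughput sequencing (namely Solexa/Illumina sequencing) to study nucleosome positioning at genome-wide scale in humans. A year later, the terms "MNase-Seq" and "MNase-ChIP" (the latter for micrococcal nuclease digestion with chromatin immunoprecipitation) were finally coined. Since its initial application in 2008, MNase-seq has been used to deep sequence DNA associated with nucleosome occupancy and epigenomics across eukaryotes. As of February 2020, MNase-seq was still being used to assay chromatin accessibility.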
Chromatin is dynamic, and the positioning of nucleosomes on DNA changes through the activity of various transcription factors and remodeling complexes, approximately reflecting transcriptional activity at these sites. DNA wrapped around nucleosomes is generally inaccessible to transcription factors. Hence, MNase-seq can be used to indirectly determine which regions of DNA are transcriptionally inaccessible by directly determining which regions are bound to nucleosomes.
In a typical MNase-seq experiment, eukaryotic cell nuclei are first isolated from a tissue of interest. The chromatin can optionally be crosslinked with formaldehyde. The endo-exonuclease micrococcal nuclease is then used to bind and cleave protein-unbound regions of DNA in the eukaryotic chromatin, first cleaving and resecting one strand, then cleaving the antiparallel strand as well. MNase requires Ca2+ as a cofactor, typically at a final concentration of 1 mM. If a region of DNA is bound by the nucleosome core (i.e. histones) or other chromatin-bound proteins (e.g. transcription factors), MNase is unable to bind and cleave the DNA. Nucleosomes or other DNA-protein complexes can be purified from the sample, and the bound DNA can subsequently be purified via gel electrophoresis and extraction. The purified DNA is typically ~150bp if purified from nucleosomes, or shorter if from another protein (e.g. transcription factors). This makes short-read, high-throughput sequencing ideal for MNase-seq, as reads from these technologies are highly accurate but can only cover a couple hundred continuous base pairs. Once sequenced, the reads can be aligned to a reference genome with tools such as Bowtie to determine which DNA regions are bound by nucleosomes or proteins of interest. The nucleosome positioning elucidated through MNase-seq can then be used to predict genomic expression and regulation at the time of digestion.
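As a rough illustration of the analysis step described above, the following Python sketch turns aligned fragment coordinates into a per-base occupancy profile and a count of fragment midpoints, which are often used as a proxy for nucleosome centers. The function name `nucleosome_profile`, the 140 to 160 bp size window, and the toy coordinates are assumptions made for illustration; they are not taken from any particular MNase-seq pipeline, and a real analysis would read fragments from aligner output rather than from a hand-written list.

```python
from collections import Counter


def nucleosome_profile(fragments, region_length, min_len=140, max_len=160):
    """Return (occupancy, midpoints) for a single genomic region.

    fragments: iterable of (start, end) pairs, 0-based and end-exclusive,
    already aligned to the region. The size window is an illustrative
    assumption for mononucleosome-sized fragments (~150 bp).
    """
    occupancy = [0] * region_length
    midpoints = Counter()
    for start, end in fragments:
        length = end - start
        # Keep only fragments inside the assumed mononucleosome window.
        if not (min_len <= length <= max_len):
            continue
        # Ignore fragments that extend beyond the region.
        if start < 0 or end > region_length:
            continue
        for pos in range(start, end):
            occupancy[pos] += 1
        # The fragment midpoint approximates the nucleosome center.
        midpoints[start + length // 2] += 1
    return occupancy, midpoints


# Toy data (hypothetical): three nucleosome-sized fragments and one short one.
fragments = [(100, 247), (102, 249), (300, 447), (500, 540)]
occupancy, midpoints = nucleosome_profile(fragments, region_length=600)
print(occupancy[150], occupancy[350], occupancy[520])
print(sorted(midpoints.items()))
```

In this sketch the 40 bp fragment is dropped by the size filter, so it contributes to neither the occupancy profile nor the midpoint counts; widening or narrowing the window changes which protected fragments are treated as nucleosomal.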
MNase-seq has also been applied to determining where transcription factors bind on DNA. Classical ChIP-seq has issues with resolution, stringency of the experimental protocol, and DNA fragmentation. It typically uses sonication to fragment chromatin, which introduces bias in heterochromatic regions because condensed chromatin regions bind tightly to each other. Unlike histones, transcription factors bind DNA only transiently, and methods such as sonication in ChIP-seq, which require increased temperatures and detergents, can lead to loss of the factor. CUT&RUN sequencing is a novel form of MNase-based immunoprecipitation. Briefly, it uses an MNase tagged with an antibody to specifically bind DNA-bound proteins that present the epitope recognized by that antibody. Digestion then occurs specifically at regions surrounding that transcription factor, allowing the complex to diffuse out of the nucleus and be collected without significant background or the complications of sonication. The technique does not require high temperatures or high concentrations of detergent. Furthermore, MNase improves chromatin digestion through its combined exonuclease and endonuclease activity. Cells are lysed in an SDS/Triton X-100 solution, and the MNase-antibody complex is then added. Finally, the protein-DNA complex is isolated, and the DNA is purified and sequenced. The resulting soluble extract contains a 25-fold enrichment in fragments under 50bp. This enrichment yields cost-effective, high-resolution data.
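To make the idea of enrichment in short fragments concrete, the sketch below compares the fraction of fragments under 50bp in two lists of fragment lengths. The helper names `short_fraction` and `fold_enrichment`, the choice of a fraction-based ratio, and the example lengths are all illustrative assumptions; the source does not specify how the 25-fold figure was computed, and the toy numbers are not experimental data.

```python
def short_fraction(lengths, cutoff=50):
    """Fraction of fragments strictly shorter than `cutoff` base pairs."""
    if not lengths:
        raise ValueError("no fragment lengths given")
    return sum(1 for n in lengths if n < cutoff) / len(lengths)


def fold_enrichment(sample_lengths, control_lengths, cutoff=50):
    """Ratio of short-fragment fractions (sample over control)."""
    control = short_fraction(control_lengths, cutoff)
    if control == 0:
        raise ValueError("control has no short fragments; ratio undefined")
    return short_fraction(sample_lengths, cutoff) / control


# Hypothetical fragment lengths in bp, made up for this example.
sample = [30, 42, 48, 35, 150, 147, 44, 39]
control = [147, 150, 152, 45, 160, 149, 151, 146]
enrichment = fold_enrichment(sample, control)
print(enrichment)
```

A ratio of fractions is only one possible definition; a ratio of normalized read counts would be an equally plausible reading of "enrichment".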
Single-cell micrococcal nuclease sequencing (scMNase-seq) is a novel technique that is used to analyze nucleosome positioning and to infer chromatin accessibility with the use of only a single-cell input. First, cells are sorted into single aliquots using fluorescence-activated cell sorting (FACS). The cells are then lysed and digested with micrococcal nuclease. The isolated DNA is subjected to PCR amplification and then the desired sequence is isolated and analyzed. The use of MNase in single-cell assays results in increased detection of regions such as DNase I hypersensitive sites as well as transcription factor binding sites.
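One way single-cell data of this kind might be summarized is as the fraction of cells in which each position is covered by a protected fragment. The following Python sketch assumes that fragments have already been aligned and grouped by cell; the function name `fraction_of_cells_covered`, the cell labels, and the coordinates are hypothetical and are not part of any published scMNase-seq workflow.

```python
def fraction_of_cells_covered(cell_fragments, region_length):
    """Per-position fraction of cells with at least one covering fragment.

    cell_fragments: dict mapping a cell label to a list of (start, end)
    pairs, 0-based and end-exclusive, within one region.
    """
    n_cells = len(cell_fragments)
    if n_cells == 0:
        return [0.0] * region_length
    covered_counts = [0] * region_length
    for fragments in cell_fragments.values():
        # Collect covered positions per cell so each cell counts once.
        covered = set()
        for start, end in fragments:
            covered.update(range(max(start, 0), min(end, region_length)))
        for pos in covered:
            covered_counts[pos] += 1
    return [count / n_cells for count in covered_counts]


# Hypothetical input: four cells, one of which yielded no fragments here.
cells = {
    "cell_a": [(10, 157)],
    "cell_b": [(20, 167)],
    "cell_c": [(200, 347)],
    "cell_d": [],
}
profile = fraction_of_cells_covered(cells, region_length=400)
print(profile[15], profile[100], profile[180], profile[250])
```

Counting each cell at most once per position keeps the profile interpretable as a fraction between 0 and 1, regardless of how many fragments a given cell contributes.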
MNase-seq is one of four major methods (DNase-seq, MNase-seq, FAIRE-seq, and ATAC-seq) for more direct determination of chromatin accessibility and its consequences for gene expression. All four techniques are contrasted with ChIP-seq, which relies on the inference that certain marks on histone tails indicate gene activation or repression. ChIP-seq does not directly assess nucleosome positioning, but it is valuable for assessing the enzymatic function of histone modifiers.
As with MNase-seq, DNase-seq was developed by combining an existing DNA endonuclease with next-generation sequencing technology to assay chromatin accessibility. Both techniques have been used across several eukaryotes to ascertain nucleosome positioning in the respective organisms, and both rely on the same principle of digesting open DNA to isolate ~140bp bands of DNA from nucleosomes, or shorter bands when ascertaining transcription factor information. Both techniques have recently been optimized for single-cell sequencing, which addresses one of their major shared disadvantages: the requirement for high cell input.
At sufficient concentrations, DNase I is capable of digesting nucleosome-bound DNA to 10bp, whereas micrococcal nuclease cannot. Additionally, DNase-seq is used to identify DHSs, which are regions of DNA that are hypersensitive to DNase treatment and are often indicative of regulatory regions (e.g. promoters or enhancers). An equivalent effect is not found with MNase. As a result of this distinction, DNase-seq is primarily utilized to directly identify regulatory regions, whereas MNase-seq is used to identify transcription factor and nucleosomal occupancy to indirectly infer effects on gene expression.
FAIRE-seq differs more from MNase-seq than does DNase-seq. FAIRE-seq was developed in 2007 and combined with Next-Generation sequencing three years later to study DHSs. FAIRE-seq relies on the use of formaldehyde to crosslink target proteins with DNA and then subsequent sonication and phenol-chloroform extraction to separate non-crosslinked DNA and crosslinked DNA. The non-crosslinked DNA is sequenced and analyzed, allowing for direct observation of open chromatin.
MNase-seq does not measure chromatin accessibility as directly as FAIRE-seq. However, unlike FAIRE-seq, it does not necessarily require crosslinking, nor does it rely on sonication, although it may require phenol and chloroform extraction. Two major disadvantages of FAIRE-seq, relative to the other three classes, are the minimum required input of 100,000 cells and the reliance on crosslinking. Crosslinking may bind other chromatin-bound proteins that transiently interact with DNA, limiting the amount of non-crosslinked DNA that can be recovered and assayed from the aqueous phase. Thus, the overall resolution obtained from FAIRE-seq can be lower than that of DNase-seq or MNase-seq. Given the 100,000-cell requirement, the single-cell versions of DNase-seq and MNase-seq are far more appealing alternatives.
ATAC-seq is the most recently developed class of chromatin accessibility assays. ATAC-seq uses a hyperactive transposase to insert transposable markers with specific adapters, capable of binding primers for sequencing, into open regions of chromatin. PCR can then be used to amplify sequences adjacent to the inserted transposons, allowing for determination of open chromatin sequences without causing a shift in chromatin structure. ATAC-seq has been proven effective in humans, amongst other eukaryotes, including in frozen samples. As with DNase-seq and MNase-seq, a successful single-cell version of ATAC-seq has also been developed.
ATAC-seq has several advantages over MNase-seq in assessing chromatin accessibility. ATAC-seq does not rely on the variable digestion of micrococcal nuclease, nor on crosslinking or phenol-chloroform extraction. It generally maintains chromatin structure, so results from ATAC-seq can be used to assess chromatin accessibility directly, rather than indirectly as with MNase-seq. ATAC-seq can also be completed within a few hours, whereas the other three techniques typically require overnight incubation periods. The two major disadvantages of ATAC-seq, in comparison to MNase-seq, are the requirement for higher sequencing coverage and the prevalence of mitochondrial contamination due to non-specific insertion of DNA into both mitochondrial DNA and nuclear DNA. Despite these disadvantages, use of ATAC-seq over the alternatives is becoming more prevalent.
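The extent of mitochondrial contamination mentioned above can be gauged by the share of aligned reads that map to the mitochondrial genome. The Python sketch below assumes that the reference sequence name of each aligned read is already available as a string; the function name `mito_fraction`, the contig names "chrM" and "MT", and the example reads are illustrative assumptions, since naming conventions differ between reference assemblies.

```python
def mito_fraction(read_chroms, mito_names=("chrM", "MT")):
    """Fraction of aligned reads whose reference name is mitochondrial.

    read_chroms: iterable of reference sequence names, one per aligned read.
    mito_names: names treated as mitochondrial (assumed; assembly-dependent).
    """
    total = 0
    mito = 0
    for chrom in read_chroms:
        total += 1
        if chrom in mito_names:
            mito += 1
    if total == 0:
        raise ValueError("no aligned reads given")
    return mito / total


# Hypothetical reference names for five aligned reads.
reads = ["chr1", "chrM", "chr2", "chrM", "chr1"]
fraction = mito_fraction(reads)
print(f"{fraction:.0%} of reads map to the mitochondrial genome")
```

Reads counted this way are usually of no use for assessing nuclear chromatin accessibility, which is one reason a high mitochondrial fraction raises the sequencing coverage needed.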