IBM Spectrum Computing brings gene sequencing into the homes of ordinary people

Recently, Nature published an article entitled "Diagnosis: A clear answer" describing how gene sequencing finally gave a correct diagnosis to a woman who had been misdiagnosed for 30 years. Guided by the sequencing results, she took drugs that target her specific mutation, and her condition improved. The story shows that gene sequencing has the potential to change many people's lives.

In the past few years, gene sequencing technology has attracted widespread attention. In 2014, the Forbes website published an article listing eight industries in which China leads in innovation, and gene sequencing was one of them. Since the release of the precision medicine program, interest in gene sequencing has continued to grow, and the technology is widely seen as having enormous value. By analyzing genes, we can estimate a person's risk of cancer and plan targeted prevention and treatment in advance.

Today, gene sequencing products and technologies have moved from laboratory research into clinical use, and gene sequencing is often described as the next technology that will change the world. After a period of intense enthusiasm, however, gene sequencing has run into bottlenecks. Limited computing power makes it difficult for the technology to benefit the general public. A single fully sequenced human genome produces 100-1000 GB of data, so at the upper end of that range the genomic data for one million people would reach 1 EB (1,000,000 TB). Analyzing that much data poses a great challenge to computing infrastructure. Combined with the high cost, this means the technology is still enjoyed mainly by a few “noble” groups. It remains far from meeting general public demand, and most people are still deterred by “high-end” gene sequencing.
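The storage estimate above can be checked with a few lines of arithmetic. The sketch below assumes decimal (SI) units, where 1 GB is 10^9 bytes and 1 EB is 10^18 bytes. The function name `cohort_bytes` is only illustrative.

```python
# Decimal (SI) storage units, expressed in bytes.
GB = 10**9
TB = 10**12
EB = 10**18

def cohort_bytes(people: int, gb_per_genome: int) -> int:
    """Total data volume in bytes for a cohort of fully sequenced genomes."""
    return people * gb_per_genome * GB

# The article's range is 100-1000 GB per genome, for one million people.
low = cohort_bytes(1_000_000, 100)
high = cohort_bytes(1_000_000, 1000)

print(f"low:  {low // TB:,} TB ({low / EB} EB)")    # low:  100,000 TB (0.1 EB)
print(f"high: {high // TB:,} TB ({high / EB} EB)")  # high: 1,000,000 TB (1.0 EB)
```

The 1 EB figure therefore corresponds to the upper bound of 1000 GB per genome. At 100 GB per genome, the same cohort would need about 0.1 EB.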


High-throughput, big-data analysis consumes substantial computing resources

Gene sequencing is the foundation and mainstream technology of genetic testing. From the sequencer's perspective, first-generation technology is mainly Sanger sequencing, which offers read lengths of up to 1000 bp and accuracy of up to 99.999%, but its low throughput and high cost have seriously limited large-scale use. Second-generation sequencing (next-generation sequencing, NGS) is represented mainly by the Roche/454 FLX, the Illumina/Solexa Genome Analyzer and the Applied Biosystems SOLiD system. Its biggest advantages are a much lower cost than the first generation and greatly improved throughput. Its disadvantages are that the PCR step it introduces raises the sequencing error rate to some extent and adds systematic bias, and that the reads are short. The main problem with Illumina is its short read length: the error rate rises sharply beyond 100 bp. The Roche/454 FLX and Applied Biosystems SOLiD systems can produce longer reads, but at a slightly higher cost. Short reads are very cumbersome in genome assembly when they fall in regions with many repeats. Third-generation technology, so-called single-molecule sequencing, can produce very long reads, but it introduces indel (insertion and deletion) errors that are rare in second-generation sequencing.
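A toy example may help show why short reads are troublesome in repeats. The sketch below uses a made-up reference sequence and plain exact string matching, not a real aligner or assembler. All sequences and the `find_all` helper are hypothetical.

```python
def find_all(reference: str, read: str) -> list[int]:
    """Return every 0-based position where `read` matches `reference` exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)
    return positions

# Hypothetical reference: unique flank, three copies of a 6-base repeat, unique flank.
reference = "TTGCA" + "ACGTAC" * 3 + "GGATC"

short_read = "ACGTAC"                       # lies entirely inside the repeat
long_read = "TTGCA" + "ACGTAC" * 3 + "GG"   # spans the repeat and both flanks

print(find_all(reference, short_read))  # [5, 11, 17]
print(find_all(reference, long_read))   # [0]
```

The short read fits three places equally well, so it cannot be positioned on its own. The long read reaches the unique flanking sequence and has a single placement, which is the advantage that longer reads bring to assembly.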

From the data-analysis perspective, high-throughput, big-data analysis places heavy demands on computer storage and computing resources. Getting from DNA extracted in the laboratory to final analysis results requires a series of experimental and data-analysis steps: library construction, sequencing, alignment/assembly, variant detection and annotation. With second-generation sequencing, a single sample can produce data on the terabyte scale, which makes the analysis time-consuming and resource-intensive.
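The computational part of that chain can be pictured as an ordered list of stages, each consuming the previous stage's output. The sketch below is only a schematic: the stage functions are placeholders that build strings, where a real workflow would invoke external tools. The file name `sample01.fastq` and all function names are hypothetical.

```python
from typing import Callable

# A stage takes the running state and returns it with its own output added.
Stage = Callable[[dict], dict]

def align(state: dict) -> dict:
    # Placeholder: a real stage would run an external aligner or assembler.
    return {**state, "alignments": f"aligned({state['reads']})"}

def call_variants(state: dict) -> dict:
    # Placeholder for variant detection on the aligned reads.
    return {**state, "variants": f"variants({state['alignments']})"}

def annotate(state: dict) -> dict:
    # Placeholder for annotating the detected variants.
    return {**state, "annotations": f"annotated({state['variants']})"}

# Stage names follow the steps listed in the text, after sequencing.
PIPELINE: list[tuple[str, Stage]] = [
    ("alignment/assembly", align),
    ("variant detection", call_variants),
    ("annotation", annotate),
]

def run(reads: str) -> dict:
    """Run every stage in order, passing the accumulated state along."""
    state = {"reads": reads}
    for name, stage in PIPELINE:
        state = stage(state)
    return state

result = run("sample01.fastq")
print(result["annotations"])  # annotated(variants(aligned(sample01.fastq)))
```

Because each stage depends on the one before it, the terabyte-scale input has to pass through every step in turn, which is one reason the full analysis takes so long.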

Because the genomics industry is relatively new, standards differ from company to company and are hard to unify. The analysis workflow has many steps, and each step involves many analysis scripts, system commands and external tools. These tools are deployed manually to the computing cluster again and again, which complicates the analysis further. As the cost of genome sequencing falls, the volume of sequencing data keeps growing, and this inefficient approach has held back the development of the industry. Cumbersome command-line operation also makes for a poor user experience.
