Commercial-scale Genome Center

Complete Genomics’ innovative sequencing technology will be uniquely offered as a service through its own commercial-scale genome center. Customers will have access to large-scale, complete human genomic data analysis without making a major in-house investment in instruments or high-performance computing resources.

Complete Genomics, Inc. File Access API

Welcome to the Complete Genomics File Access API. This API is licensed in source form under the terms of the Apache License 2.0. To review the full terms of this license, see the LICENSE.txt file included in this software distribution.

This software distribution is intended to simplify access to genomic sequence data stored in the CGI file formats. This API supports access to several types of data, including:

 - Reads - called bases and quality scores
 - Empirical estimates - gap distribution and score-to-accuracy
 - Alignment - mapping to a reference sequence
 - Variations - called consensus variations

Additional information on these file types can be found in the doc/DataFileFormats-API.pdf file included in this distribution.

This distribution also includes example programs which can be used to test data sets.

Instructions for compiling and using this software can be found in the INSTALL.txt files delivered as part of this software distribution.

Bug reports, source contributions or modifications, questions, or other correspondence should be sent to dev-support@completegenomics.com

While CGI offers this distribution on an as-is basis, we are grateful for suggestions as to how we can improve it.

Version Notes

Version 1.2.1
15-Jul-2009
New Features/Changes
The documentation was updated and a new example program was added. The documentation now includes a tutorial on programming with the API and additional instructions for installing the API successfully. The new example program creates FASTQ output from the raw reads. There are no other changes to the API.
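For readers unfamiliar with the FASTQ layout the new example program targets, the following is a minimal sketch of how one read could be written as a four-line FASTQ record. It is not taken from the distribution: the function name toFastqRecord, the read name, and the choice of Phred+33 quality encoding are assumptions made for illustration, and the sketch does not use any File Access API classes.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper, NOT part of the File Access API.
// Formats one read as a four-line FASTQ record:
//   @name / bases / + / quality string
// Assumption: scores are Phred-scaled integers, encoded as Phred+33.
std::string toFastqRecord(const std::string& name,
                          const std::string& bases,
                          const std::vector<int>& scores)
{
    if (bases.size() != scores.size())
        throw std::invalid_argument("bases and scores differ in length");

    std::string quality;
    quality.reserve(scores.size());
    for (int s : scores) {
        // Phred+33 maps scores 0..93 onto printable ASCII '!'..'~'.
        if (s < 0 || s > 93)
            throw std::out_of_range("score outside the Phred+33 range");
        quality.push_back(static_cast<char>(s + 33));
    }
    return "@" + name + "\n" + bases + "\n+\n" + quality + "\n";
}
```

Calling toFastqRecord("read1", "ACGT", {0, 10, 20, 40}) yields a record whose quality line is "!+5I". The example program shipped with the distribution remains the reference for how the API itself produces this output.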
Version 1.2.0
15-Jun-2009
New Features/Changes
1) Support for sequence-dependent gap distribution. New API in the CompleteGenomics::Library::SmallGapDistribution class:
     double getDensity(const GapTuple& gaps, const SequenceRetriever& sr) const;
   ../src/common/DnbLimits.hpp
   ../src/libdnbcollection/GapsEstimator.cpp (new)
   ../src/libdnbcollection/GapsEstimator.hpp (new)
2) Support for the new score vs. discordance estimates in
   ../src/libdnbcollection/DnbCollection.cpp
   ../src/libdnbcollection/DnbCollection.hpp
   ../src/libmapper/MappingResult.cpp
   ../src/libmapper/MappingResult.hpp
3) Support for LFR configurations
   ../src/libdnbcollection/TagMapper.cpp
   ../src/libdnbcollection/TagMapper.hpp
   ../src/libdnbcollection/Collection.cxx
   ../src/libdnbcollection/Collection.hxx
   ../src/libdnbcollection/Collection.xsd
4) Formatting and convenience changes (extra operators for Location, Range, and others)
   ../src/ReferenceGenome.hpp (Vitali)
   ../src/common/ErrorHandling.cpp
   ../src/common/ErrorHandling.hpp
5) Minor changes to error handling classes
   ../src/libdnbcollection/CollectionUtilities.cpp
   ../src/libdnbcollection/CollectionUtilities.hpp
Known Issues
1) This version of the API is NOT compatible with data produced before version 1.2.0 (versions 1.1.0 and 1.0.1). To analyze data produced before June 15, 2009, use the API provided with the data.
Version 1.1.0
5-Mar-2009
New Features
1) Support for degenerate base pair codes (ambiguity codes) in the reference genome.
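As background on what ambiguity codes mean for code that compares reads against the reference, here is a small sketch using the standard IUPAC nucleotide codes. The function names expandIupac and baseMatchesReference are hypothetical and are not part of the File Access API; how the API represents these codes internally is not described in these notes.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper, NOT part of the File Access API.
// Returns the concrete bases an IUPAC nucleotide code stands for,
// or an empty string if the code is not recognized.
std::string expandIupac(char code)
{
    switch (std::toupper(static_cast<unsigned char>(code))) {
        case 'A': return "A";
        case 'C': return "C";
        case 'G': return "G";
        case 'T': return "T";
        case 'R': return "AG";   // purine
        case 'Y': return "CT";   // pyrimidine
        case 'S': return "CG";
        case 'W': return "AT";
        case 'K': return "GT";
        case 'M': return "AC";
        case 'B': return "CGT";  // not A
        case 'D': return "AGT";  // not C
        case 'H': return "ACT";  // not G
        case 'V': return "ACG";  // not T
        case 'N': return "ACGT"; // any base
        default:  return "";
    }
}

// True if a called base is consistent with a (possibly ambiguous)
// reference code, e.g. 'A' is consistent with 'R' but not with 'Y'.
bool baseMatchesReference(char readBase, char referenceCode)
{
    const char b = static_cast<char>(
        std::toupper(static_cast<unsigned char>(readBase)));
    return expandIupac(referenceCode).find(b) != std::string::npos;
}
```

Under this scheme a reference position coded N matches any called base, while an unrecognized code matches nothing.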
Known Issues
1) This version of the API is NOT compatible with the reference genome data (GBI) files provided before version 1.1.0 (version 1.0.1).