SYNBIOCHEM PIPELINE - DESIGN
The design platform is responsible for the modelling and design of enzymes and regulatory elements, pathway inference and genome-scale modelling. Design implies part and pathway optimisation and includes iterative learning.
Enzyme and regulatory element selection:
For the in silico design of our synthetic parts, we have demonstrated strengths in the prediction of pathway enzymes (Bioparts mining), the encoding of sequence features and reaction rules, 3D modelling and molecular dynamics (MD). DNA part design covers the selection of DNA blocks, simultaneous codon optimisation and bespoke ribosome binding site (RBS) design, directed evolution design for high-throughput (HTP) combinatorial mutagenesis, and parts selection for compatibility with multiple assembly methods. We are expanding our existing GeneGenie software for oligomer design into a comprehensive workbench for the design, directed evolution and optimisation of enzymes, regulatory elements and entire pathways. The diversity of activities targeted (including novel enzymes) requires accelerated strategies for intelligent directed evolution (MD; machine learning; exploration of large areas of the protein sequence/structural search space; BioQSAR). We use machine learning to infer design rules rapidly from large sets of sequences, including the prediction and synthesis of high-performance variants.
Pathway design and selection:
Pathway modelling is applied to identify and characterise metabolic pathways leading to chemical diversity. The challenge is the rapid identification of optimised relative activities in newly engineered pathways. Building on strengths in dynamic modelling (COPASI), pathway modelling with incomplete kinetic data (Monte Carlo) and the use of transcriptomics to predict pathway flux (MCA: metabolic control analysis), we are extending our expertise in the development and validation of protocols for metabolic pathway design (RetroPath) through the application of a retrosynthetic algorithm that considers putative routes in an extended metabolic space, involving pathway enumeration and ranking.
To characterise the metabolic behaviour and capabilities of the proposed host organism, we apply systems modelling, expanding our existing metabolic network reconstructions using a combination of transcriptomics, untargeted metabolomics and in silico methods. The development of genome-scale modelling that includes flux, thermodynamic and transcriptome-based constraints will improve predictive accuracy beyond standard genome-scale models. We are also actively developing approaches for creating genome-scale kinetic models. The challenge is the rapid identification of knock-out and overexpression targets for the generation of optimised chassis strains.
Used to explore the chemical biosynthetic space, RetroPath provides an automated, open-source workflow based on generalised reaction rules that performs a retrosynthesis search from chassis to target through an efficient and well-controlled protocol. Publications: Delépine B, Duigou T, Carbonell P, Faulon JL. (2017). RetroPath2.0: A retrosynthesis workflow for metabolic engineers. Metabolic Engineering, 45, 158-170. Koch M, Duigou T, Carbonell P, Faulon JL. (2017). Molecular structures enumeration and virtual screening in the chemical space with RetroPath2.0. Journal of Cheminformatics, 9(1), 64.
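The idea of a retrosynthesis search from target back to chassis can be illustrated with a minimal Python sketch. This is not the RetroPath2.0 workflow itself: the compound names, the rule table and the function `retrosynthesis` are hypothetical, each rule is reduced to a list of substrates and a single product, and pathways are ranked simply by number of steps as a stand-in for a real ranking scheme.

```python
# Toy retrosynthesis search. All names and rules below are hypothetical
# placeholders; real reaction rules encode chemical transformations.
RULES = {
    "r1": (("B",), "target"),      # B -> target
    "r2": (("A",), "B"),           # A -> B
    "r3": (("C", "A"), "target"),  # C + A -> target
    "r4": (("D",), "C"),           # D -> C
}
CHASSIS = {"A", "D"}  # compounds assumed to be native to the host


def retrosynthesis(target, chassis, rules, max_steps=3):
    """Enumerate rule lists that link chassis compounds to the target."""
    pathways = []

    def search(needed, used, steps_left):
        # Compounds that the chassis does not already supply.
        pending = [c for c in needed if c not in chassis]
        if not pending:
            pathways.append(used)
            return
        if steps_left == 0:
            return
        compound = pending[0]
        for name, (substrates, product) in rules.items():
            if product == compound and name not in used:
                search(pending[1:] + list(substrates),
                       used + [name], steps_left - 1)

    search([target], [], max_steps)
    # Shorter pathways first (a deliberately simple ranking).
    return sorted(pathways, key=len)


for pathway in retrosynthesis("target", CHASSIS, RULES):
    print(" <- ".join(pathway))
```

Limiting the search with `max_steps` keeps the enumeration bounded, which mirrors the need for a controlled search when the rule set is large.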
SensiPath is a web server tool (co-developed with Micalis/INRA) that identifies putative biochemical transformations of target compounds, allowing the identification of easily detectable compounds for screening and of potential biosensors. Publication: Delépine B, Libis V, Carbonell P, Faulon JL. (2016). SensiPath: computer-aided design of sensing-enabling metabolic pathways. Nucleic Acids Research, 44, W226-W231.
Selenzyme is an online enzyme selection tool for metabolic pathway design. It mines candidate enzyme sequences for any desired target reaction or set of reactions in a pathway, and supports the search for alternative routes or pathways leading to non-natural products. Selenzyme uses our biochem4j graph database as its main data source and provides bespoke sequence selection for automated workflows. Publication: Carbonell P, Wong J, Swainston N, Takano E, Turner NJ, Scrutton NS, Kell DB, Breitling R, Faulon JL. (2018). Selenzyme: Enzyme selection tool for pathway design. Bioinformatics, 34(12), 2153-2154.
A freely available graph database of integrated chemical, reaction, enzyme and taxonomic data, based on the Neo4j database. biochem4j has applications in pathway elucidation, enzyme selection and metabolic modelling, acting as a knowledge base that brings together distributed data into an integrated and queryable system. It currently contains information on known relationships between reactions (36765), chemicals (19735), enzymes (245704) and organisms (8431). Publication: Swainston N, Batista-Navarro R, Carbonell P, Dobson PD, Dunstan M, Jervis AJ, Vinaixa M, Williams AR, Ananiadou S, Faulon JL, Mendes P, Kell DB, Scrutton NS, Breitling R. (2017). biochem4j: Integrated and extensible biochemical knowledge through graph databases. PLoS ONE, 12, e0179130.
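To illustrate why a graph representation suits enzyme selection, the following Python sketch walks a tiny in-memory graph from a chemical to the reactions that produce it, then to the enzymes that catalyse them and the organisms they come from. The node names, relationship labels and helper functions are all hypothetical and do not reflect the actual biochem4j schema or the Neo4j query language.

```python
# Hypothetical mini-graph stored as (source, relationship, target) triples.
EDGES = [
    ("rxn1", "has_product", "chemX"),
    ("rxn1", "has_substrate", "chemA"),
    ("enzP", "catalyses", "rxn1"),
    ("enzQ", "catalyses", "rxn1"),
    ("enzP", "expressed_in", "org1"),
    ("enzQ", "expressed_in", "org2"),
    ("rxn2", "has_product", "chemY"),
    ("enzR", "catalyses", "rxn2"),
    ("enzR", "expressed_in", "org1"),
]


def sources(relationship, target):
    """Nodes with an outgoing `relationship` edge pointing at `target`."""
    return sorted(s for s, r, t in EDGES if r == relationship and t == target)


def targets(source, relationship):
    """Nodes reached from `source` along a `relationship` edge."""
    return sorted(t for s, r, t in EDGES if s == source and r == relationship)


def enzymes_producing(chemical):
    """Return (enzyme, organism) pairs for reactions that yield `chemical`."""
    hits = []
    for reaction in sources("has_product", chemical):
        for enzyme in sources("catalyses", reaction):
            for organism in targets(enzyme, "expressed_in"):
                hits.append((enzyme, organism))
    return hits


print(enzymes_producing("chemX"))
```

Each hop in `enzymes_producing` follows one relationship type, so a question spanning chemicals, reactions, enzymes and organisms becomes a single traversal rather than a join across separate data sources.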
A web application for the design of reusable synthetic biology parts that offers simultaneous codon optimisation, RBS design, removal of RBSs within the CDS, and assembly and synthesis compatibility. It is designed to bridge the gap between optimisation tools for the design of novel parts, the representation of such designs in community-developed data standards such as the Synthetic Biology Open Language (SBOL), and the sharing of designs in journal-recommended data repositories (JBEI-ICE). It facilitates the design, optimisation and dissemination of reusable synthetic biology parts through a single, integrated application. PartsGenie has been used for the design of most synthetic DNA used in the SYNBIOCHEM Centre and elsewhere in the MIB, and is now freely available to the wider synthetic biology community. Publication: Swainston N, Dunstan M, Jervis AJ, Robinson CJ, Carbonell P, Williams AR, Faulon JL, Scrutton NS, Kell DB. (2018). PartsGenie: an integrated tool for optimizing and sharing synthetic biology parts. Bioinformatics, 34(13), 2327-2329.
Oligomer design (GeneGenie) and gene synthesis (SpeedyGenes) methods for efficient variant library design and directed evolution projects. Publications: Swainston N, Currin A, Day PJ, Kell DB. (2014). GeneGenie: optimised oligomer design for directed evolution. Nucleic Acids Research, 42(W1), W395-W400. Currin A, Swainston N, Day PJ, Kell DB. (2017). SpeedyGenes: Exploiting an Improved Gene Synthesis Method for the Efficient Production of Synthetic Protein Libraries for Directed Evolution. Methods in Molecular Biology, 1472, 63-78.
CodonGenie is an application for designing ambiguous codons to support protein mutagenesis. Given a user-defined target collection of amino acids and an intended host organism, CodonGenie designs and analyses all ambiguous codons that encode the required amino acids. The codons are ranked according to their efficiency in encoding the required amino acids while minimising the encoding of additional amino acids and stop codons. Publication: Swainston N, Currin A, Green L, Breitling R, Day PJ, Kell DB. (2017). CodonGenie: optimised ambiguous codon design tools. PeerJ Computer Science, 3, e120. DOI: 10.7717/peerj-cs.120.
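The ambiguous codon ranking described above can be sketched in a few lines of Python. This is an illustrative simplification, not CodonGenie's implementation: it uses the standard genetic code and IUPAC ambiguity codes, scores each ambiguous codon by the fraction of its expanded codons that encode a target amino acid, and ignores host codon usage. The function names `expand` and `design_ambiguous_codons` are hypothetical.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, with bases ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def expand(ambiguous_codon):
    """List every concrete codon matched by an ambiguous codon."""
    options = (IUPAC[base] for base in ambiguous_codon)
    return ["".join(codon) for codon in product(*options)]


def design_ambiguous_codons(target_amino_acids):
    """Rank ambiguous codons that encode all of the target amino acids.

    The score is the fraction of expanded codons encoding a target amino
    acid, so off-target amino acids and stop codons ("*") lower it.
    """
    wanted = set(target_amino_acids)
    ranked = []
    for letters in product(IUPAC, repeat=3):
        codon = "".join(letters)
        encoded = [CODON_TABLE[c] for c in expand(codon)]
        if not wanted.issubset(encoded):
            continue  # at least one required amino acid is missing
        score = sum(aa in wanted for aa in encoded) / len(encoded)
        ranked.append((codon, score, "".join(sorted(set(encoded)))))
    ranked.sort(key=lambda row: (-row[1], row[0]))
    return ranked


# Example: codons encoding both aspartate (D) and glutamate (E).
for codon, score, amino_acids in design_ambiguous_codons("DE")[:3]:
    print(codon, round(score, 3), amino_acids)
```

A fuller version would also weight each expanded codon by its usage in the intended host organism, which this sketch leaves out.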