Bionode-seq

A module for manipulating DNA, RNA and protein sequences.

Install and load

This is a very simple module that doesn't have a command line interface yet, as it's mostly used in the browser. You can install and load it like this:

npm install bionode-seq
var seq = require('bionode-seq')

Usage

Check sequence type

Takes a sequence string and checks if it's DNA, RNA or protein (returns 'dna', 'rna', 'protein' or undefined). Other optional arguments include threshold, length and index (see below).

seq.checkType("ATGACCCTGAGAAGAGCACCG");
=> "dna"
seq.checkType("AUGACCCUGAAGGUGAAUGAA");
=> "rna"
seq.checkType("MAYKSGKRPTFFEVFKAHCSDS");
=> "protein"
seq.checkType("1234567891234567ATGACC");
=> undefined

By default, the method uses a threshold of 0.9 (90%). Pass a different value as the second argument to change it.

seq.checkType("1234567891234567ATGACC", 0.8);
=> undefined
seq.checkType("--------MAYKSGKRPTFFEV", 0.7);
=> "protein"

The length value specifies the length of the sequence to be analysed (default 10000). If your sequence is extremely long, you may want to analyse a shorter sub-section to reduce the computational burden.

seq.checkType('A Very Long Sequence', 0.9, 1000);
=> Type based on the first 1000 characters

The index value specifies the point on the sequence from which the analysis starts. This is useful when, for example, you know that there are a lot of gaps at the start of the sequence.

seq.checkType("--------MAYKSGKRPTFFEV", 0.9, 10000, 8);
=> "protein"

Takes a sequence type argument and returns a function to complement bases.

Reverse sequence

Takes a sequence string and returns the reverse sequence.

seq.reverse("ATGACCCTGAAGGTGAA");
=> "AAGTGGAAGTCCCAGTA"

(Reverse) complement sequence

Takes a sequence string and optional boolean for reverse, and returns its complement.

seq.complement("ATGACCCTGAAGGTGAA");
=> "TACTGGGACTTCCACTT"
seq.complement("ATGACCCTGAAGGTGAA", true);
=> "TTCACCTTCAGGGTCAT"

Alias: takes a sequence string and returns the reverse complement (syntax sugar).

seq.reverseComplement("ATGACCCTGAAGGTGAA");
=> "TTCACCTTCAGGGTCAT"

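To make the relationship between complement and reverse complement concrete, here is a minimal standalone sketch. It does not use bionode-seq and is not its implementation. The names DNA_COMPLEMENT, complementDna and reverseComplementDna are hypothetical, and the sketch assumes plain DNA input where only A, C, G and T (either case) are mapped and any other character passes through unchanged.

```javascript
// Hypothetical base pairing table for DNA (not taken from bionode-seq).
const DNA_COMPLEMENT = { A: 'T', T: 'A', C: 'G', G: 'C', a: 't', t: 'a', c: 'g', g: 'c' };

// Complement every base; optionally reverse the result as well.
function complementDna(sequence, reverse = false) {
  const bases = Array.from(sequence, (base) => DNA_COMPLEMENT[base] || base);
  if (reverse) bases.reverse();
  return bases.join('');
}

// Convenience wrapper, mirroring the idea of an alias.
function reverseComplementDna(sequence) {
  return complementDna(sequence, true);
}

console.log(complementDna('ATGACCCTGAAGGTGAA'));        // TACTGGGACTTCCACTT
console.log(reverseComplementDna('ATGACCCTGAAGGTGAA')); // TTCACCTTCAGGGTCAT
```
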
Transcribe base

Takes a base character and returns the transcript base.

seq.getTranscribedBase("A");
=> "U"
seq.getTranscribedBase("T");
=> "A"
seq.getTranscribedBase("t");
=> "a"
seq.getTranscribedBase("C");
=> "G"

Get codon amino acid

Takes an RNA codon and returns the translated amino acid.

seq.getTranslatedAA("AUG");
=> "M"
seq.getTranslatedAA("GCU");
=> "A"
seq.getTranslatedAA("CUU");
=> "L"

Remove introns

Takes a sequence and an array of exon ranges and removes the introns, that is, everything outside those ranges.

seq.removeIntrons("ATGACCCTGAAGGTGAATGACAG", [[1, 8]]);
=> "TGACCCT"
seq.removeIntrons("ATGACCCTGAAGGTGAATGACAG", [[2, 9], [12, 20]]);
=> "GACCCTGGTGAATGA"

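The two results above are consistent with reading each range as a 0-based start and an exclusive end. The following standalone sketch shows that interpretation; keepExons is a hypothetical name, and the range convention is an assumption inferred from the examples rather than a statement about the library's internals.

```javascript
// Hypothetical helper (not bionode-seq code): keep only the exon ranges.
// Assumes each range is [start, end) with a 0-based start and exclusive end.
function keepExons(sequence, exonRanges) {
  return exonRanges
    .map(([start, end]) => sequence.substring(start, end))
    .join('');
}

console.log(keepExons('ATGACCCTGAAGGTGAATGACAG', [[1, 8]]));            // TGACCCT
console.log(keepExons('ATGACCCTGAAGGTGAATGACAG', [[2, 9], [12, 20]])); // GACCCTGGTGAATGA
```
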
Transcribe sequence

Takes a sequence string and returns the transcribed sequence (dna <-> rna). If an array of exons is given, the introns will be removed from the sequence.

seq.transcribe("ATGACCCTGAAGGTGAA");
=> "AUGACCCUGAAGGUGAA"
seq.transcribe("AUGACCCUGAAGGUGAA"); // reverse
=> "ATGACCCTGAAGGTGAA"

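The two examples above amount to swapping T and U in whichever direction applies. A minimal standalone sketch of that idea follows. It assumes the direction can be decided by whether the input contains any U, and swapThymineUracil is a hypothetical name, not a bionode-seq function.

```javascript
// Hypothetical helper (not bionode-seq code): DNA -> RNA or RNA -> DNA.
// Assumption: a sequence containing U is RNA; anything else is treated as DNA.
function swapThymineUracil(sequence) {
  const isRna = /u/i.test(sequence);
  const swaps = isRna ? { U: 'T', u: 't' } : { T: 'U', t: 'u' };
  // Bases without an entry in the table are left as they are.
  return sequence.replace(/[TtUu]/g, (base) => swaps[base] || base);
}

console.log(swapThymineUracil('ATGACCCTGAAGGTGAA')); // AUGACCCUGAAGGUGAA
console.log(swapThymineUracil('AUGACCCUGAAGGUGAA')); // ATGACCCTGAAGGTGAA
```
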
Translate sequence

Takes a DNA or RNA sequence and translates it to protein. If an array of exons is given, the introns will be removed from the sequence.

seq.translate("ATGACCCTGAAGGTGAATGACAGGAAGCCCAAC"); // dna
=> "MTLKVNDRKPN"
seq.translate("AUGACCCUGAAGGUGAAUGACAGGAAGCCCAAC"); // rna
=> "MTLKVNDRKPN"
seq.translate("ATGACCCTGAAGGTGAATGACAGGAAGCC", [[3, 21]]);
=> "LKVND"

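For readers who want to see how codon lookup works, here is a standalone sketch based on the standard genetic code (NCBI translation table 1). It is not bionode-seq's implementation. translateCodon and translateSequence are hypothetical names, and the handling of unrecognised codons (returning 'X') and of a trailing partial codon (ignored) are assumptions made for this example. Exon ranges are not handled.

```javascript
// Standard genetic code, with codons ordered by base T, C, A, G at each of
// the three positions (TTT, TTC, TTA, TTG, TCT, ...).
const BASES = 'TCAG';
const AMINO_ACIDS = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';

// Hypothetical helper: translate one DNA or RNA codon.
// Returns 'X' for anything that is not three unambiguous bases (an assumption).
function translateCodon(codon) {
  const dnaCodon = codon.toUpperCase().replace(/U/g, 'T');
  const indices = Array.from(dnaCodon, (base) => BASES.indexOf(base));
  if (indices.length !== 3 || indices.includes(-1)) return 'X';
  return AMINO_ACIDS[16 * indices[0] + 4 * indices[1] + indices[2]];
}

// Hypothetical helper: translate from the first base, codon by codon,
// ignoring a trailing partial codon.
function translateSequence(sequence) {
  let protein = '';
  for (let i = 0; i + 3 <= sequence.length; i += 3) {
    protein += translateCodon(sequence.slice(i, i + 3));
  }
  return protein;
}

console.log(translateCodon('AUG'));                                  // M
console.log(translateSequence('ATGACCCTGAAGGTGAATGACAGGAAGCCCAAC')); // MTLKVNDRKPN
console.log(translateSequence('AUGACCCUGAAGGUGAAUGACAGGAAGCCCAAC')); // MTLKVNDRKPN
```
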
Reverse exons

Takes an array of exons and the length of the reference and returns inverted coordinates.

seq.reverseExons([[2,8]], 20);
=> [ [ 12, 18 ] ]
seq.reverseExons([[10,45], [65,105]], 180);
=> [ [ 135, 170 ], [ 75, 115 ] ]

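The outputs above match a simple mirroring of each range around the reference length: a range [start, end] becomes [length - end, length - start], and the ranges keep their original order. The sketch below shows that arithmetic on its own; mirrorExons is a hypothetical name, and the formula is inferred from the examples rather than taken from the library source.

```javascript
// Hypothetical helper (not bionode-seq code): map exon coordinates onto the
// opposite strand of a reference with the given length.
function mirrorExons(exonRanges, referenceLength) {
  return exonRanges.map(([start, end]) => [referenceLength - end, referenceLength - start]);
}

console.log(JSON.stringify(mirrorExons([[2, 8]], 20)));                // [[12,18]]
console.log(JSON.stringify(mirrorExons([[10, 45], [65, 105]], 180))); // [[135,170],[75,115]]
```
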
Find non-canonical splice sites

Takes a sequence and exon ranges and returns an array of non-canonical splice sites.

seq.findNonCanonicalSplices("GGCGGCGGCGGTGAGGTGGACCTGCGCGAATACGTGGTCGCCCTGT", [[0, 10], [20, 30]]);
=> [ 20 ]
seq.findNonCanonicalSplices("GGCGGCGGCGGTGAGGTGAGCCTGCGCGAATACGTGGTCGCCCTGT", [[0, 10], [20, 30]]);
=> []

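A canonical intron starts with GT (the donor site) and ends with AG (the acceptor site). The standalone sketch below checks the intron between each pair of consecutive exons under that rule and reproduces the two results above. It is an illustration, not bionode-seq's implementation. nonCanonicalSplices is a hypothetical name, the ranges are assumed to be 0-based with an exclusive end and sorted along the sequence, and reporting the intron start for a bad donor is an assumption (the examples only show a bad acceptor, reported as the start of the following exon).

```javascript
// Hypothetical sketch (not bionode-seq code).
function nonCanonicalSplices(sequence, exonRanges) {
  const sites = [];
  for (let i = 0; i + 1 < exonRanges.length; i++) {
    const intronStart = exonRanges[i][1];   // first base after the upstream exon
    const intronEnd = exonRanges[i + 1][0]; // first base of the downstream exon
    const donor = sequence.slice(intronStart, intronStart + 2).toUpperCase();
    const acceptor = sequence.slice(intronEnd - 2, intronEnd).toUpperCase();
    if (donor !== 'GT') sites.push(intronStart);  // assumed reporting position
    if (acceptor !== 'AG') sites.push(intronEnd);
  }
  return sites;
}

const exons = [[0, 10], [20, 30]];
console.log(JSON.stringify(
  nonCanonicalSplices('GGCGGCGGCGGTGAGGTGGACCTGCGCGAATACGTGGTCGCCCTGT', exons)
)); // [20]
console.log(JSON.stringify(
  nonCanonicalSplices('GGCGGCGGCGGTGAGGTGAGCCTGCGCGAATACGTGGTCGCCCTGT', exons)
)); // []
```
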
Check canonical translation start site

Takes a sequence and returns a boolean indicating whether it has a canonical translation start site.

seq.checkCanonicalTranslationStartSite("ATGACCCTGAAGGT");
=> true
seq.checkCanonicalTranslationStartSite("AATGACCCTGAAGGT");
=> false

Get reading frames

Takes a sequence and returns an array with the six possible Reading Frames (+1, +2, +3, -1, -2, -3).

seq.getReadingFrames("ATGACCCTGAAGGTGAATGACAGGAAGCCCAAC");
=> [ 'ATGACCCTGAAGGTGAATGACAGGAAGCCCAAC',
'TGACCCTGAAGGTGAATGACAGGAAGCCCAAC',
'GACCCTGAAGGTGAATGACAGGAAGCCCAAC',
'GTTGGGCTTCCTGTCATTCACCTTCAGGGTCAT',
'TTGGGCTTCCTGTCATTCACCTTCAGGGTCAT',
'TGGGCTTCCTGTCATTCACCTTCAGGGTCAT' ]
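
The six frames are the sequence itself shifted by zero, one and two bases, followed by the same three shifts of its reverse complement. A minimal standalone sketch of that construction follows. sixReadingFrames is a hypothetical name, and the sketch assumes upper-case output and DNA input containing only A, C, G and T.

```javascript
// Hypothetical helper (not bionode-seq code): frames +1, +2, +3, -1, -2, -3.
function sixReadingFrames(sequence) {
  const pairs = { A: 'T', T: 'A', C: 'G', G: 'C' };
  const forward = sequence.toUpperCase();
  const reverseComplement = Array.from(forward, (base) => pairs[base] || base)
    .reverse()
    .join('');
  return [
    forward, forward.slice(1), forward.slice(2),
    reverseComplement, reverseComplement.slice(1), reverseComplement.slice(2),
  ];
}

const frames = sixReadingFrames('ATGACCCTGAAGGTGAATGACAGGAAGCCCAAC');
console.log(frames[1]); // TGACCCTGAAGGTGAATGACAGGAAGCCCAAC
console.log(frames[3]); // GTTGGGCTTCCTGTCATTCACCTTCAGGGTCAT
```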

Get open reading frames

Takes a Reading Frame sequence and returns an array of Open Reading Frames.

seq.getOpenReadingFrames("ATGACCCTGAAGGTGAATGACAGGAAGCCCAAC");
=> [ 'ATGACCCTGAAGGTGAATGACAGGAAGCCCAAC' ]
seq.getOpenReadingFrames("AUGACCCUGAAGGUGAAUGACAGGAAGCCCAAC");
=> [ 'AUGACCCUGAAGGUGAAUGACAGGAAGCCCAAC' ]
seq.getOpenReadingFrames("ATGAGAAGCCCAACATGAGGACTGA");
=> [ 'ATGAGAAGCCCAACATGA', 'GGACTGA' ]

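In the examples above, a reading frame is cut after every in-frame stop codon, and whatever remains at the end forms the last piece; no start codon is required. The standalone sketch below reproduces that splitting. It is an illustration under those assumptions, not the library's code, and splitAtStopCodons is a hypothetical name.

```javascript
// Stop codons of the standard genetic code, written as DNA.
const STOP_CODONS = new Set(['TAA', 'TAG', 'TGA']);

// Hypothetical helper (not bionode-seq code): split one reading frame into
// pieces that each end at an in-frame stop codon (or at the end of the input).
function splitAtStopCodons(readingFrame) {
  const orfs = [];
  let current = '';
  for (let i = 0; i < readingFrame.length; i += 3) {
    const codon = readingFrame.slice(i, i + 3);
    current += codon;
    // Compare as upper-case DNA so RNA stops (UAA, UAG, UGA) are recognised too.
    if (STOP_CODONS.has(codon.toUpperCase().replace(/U/g, 'T'))) {
      orfs.push(current);
      current = '';
    }
  }
  if (current) orfs.push(current);
  return orfs;
}

console.log(JSON.stringify(splitAtStopCodons('ATGAGAAGCCCAACATGAGGACTGA')));
// ["ATGAGAAGCCCAACATGA","GGACTGA"]
```
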
Get all open reading frames

Takes a sequence and returns all Open Reading Frames in the six Reading Frames.

seq.getAllOpenReadingFrames("ATGACCCTGAAGGTGAATGACA");
=> [ [ 'ATGACCCTGAAGGTGAATGACA' ],
[ 'TGA', 'CCCTGA', 'AGGTGA', 'ATGACA' ],
[ 'GACCCTGAAGGTGAATGA', 'CA' ],
[ 'TGTCATTCACCTTCAGGGTCAT' ],
[ 'GTCATTCACCTTCAGGGTCAT' ],
[ 'TCATTCACCTTCAGGGTCAT' ] ]

Find longest open reading frame

Takes a sequence and returns the longest ORF from all six reading frames, together with the corresponding frame symbol (+1, +2, +3, -1, -2, -3). If a frame symbol is specified, only that frame is searched for the longest ORF. If several ORFs tie on length, the one that starts with the start codon (methionine) is chosen. If there is still a tie, one of them is returned at random.
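
The selection rule can be sketched on its own, given the ORFs of each frame as six arrays. This is a standalone illustration, not bionode-seq's implementation. pickLongestOrf and FRAME_SYMBOLS are hypothetical names, and where the description says a remaining tie is broken randomly, this sketch simply keeps the first candidate it saw so that the result is repeatable.

```javascript
const FRAME_SYMBOLS = ['+1', '+2', '+3', '-1', '-2', '-3'];

// Hypothetical helper (not bionode-seq code).
// orfsByFrame: arrays of ORF strings, one array per frame, in FRAME_SYMBOLS order.
function pickLongestOrf(orfsByFrame) {
  let best = null;
  orfsByFrame.forEach((orfs, frameIndex) => {
    for (const orf of orfs) {
      const startsWithMet = /^A[TU]G/i.test(orf);
      const better =
        best === null ||
        orf.length > best.orf.length ||
        // Tie on length: prefer the ORF that begins with a start codon.
        (orf.length === best.orf.length && startsWithMet && !best.startsWithMet);
      if (better) best = { orf, frame: FRAME_SYMBOLS[frameIndex], startsWithMet };
    }
  });
  return best === null ? undefined : [best.orf, best.frame];
}

// ORFs per frame for "ATGACCCTGAAGGTGAATGACA".
const orfsByFrame = [
  ['ATGACCCTGAAGGTGAATGACA'],
  ['TGA', 'CCCTGA', 'AGGTGA', 'ATGACA'],
  ['GACCCTGAAGGTGAATGA', 'CA'],
  ['TGTCATTCACCTTCAGGGTCAT'],
  ['GTCATTCACCTTCAGGGTCAT'],
  ['TCATTCACCTTCAGGGTCAT'],
];
console.log(JSON.stringify(pickLongestOrf(orfsByFrame))); // ["ATGACCCTGAAGGTGAATGACA","+1"]
```

Frames +1 and -1 both contain a 22-base ORF here, and the +1 candidate wins because it begins with ATG.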

seq.findLongestOpenReadingFrame("ATGACCCTGAAGGTGAATGACA");
=> [ 'ATGACCCTGAAGGTGAATGACA', '+1' ]
seq.findLongestOpenReadingFrame("ATGACCCTGAAGGTGAATGACA", "-1");
=> "TGTCATTCACCTTCAGGGTCAT"