Re: [EMBOSS] Transeq question, frame phases

2011-02-17 Thread Peter
On Wed, Feb 16, 2011 at 8:54 PM, David Mathog mat...@caltech.edu wrote:
> Test case fasta file
> >8Achars
> AAAAAAAA
>
> all 6 frames for transeq, standard mode emits:
> _1
> KKX
> _2
> KKX
> _3
> KK
> _4
> FF
> _5
> FFX
> _6
> FFX


Note you can do that with a single command line:

$ transeq asis:AAAAAAAA -filter -frame 6
>asis_1
KKX
>asis_2
KKX
>asis_3
KK
>asis_4
FF
>asis_5
FFX
>asis_6
FFX

Note that while using 1, 2, 3 for the forward frames is well defined, there
are two conventions for the reverse frame - do you start from the left or
the right?

First let's just do the forward frames,

$ transeq asis:AAAAAAAA -filter -frame 1
>asis_1
KKX
$ transeq asis:AAAAAAAA -filter -frame 2
>asis_2
KKX
$ transeq asis:AAAAAAAA -filter -frame 3
>asis_3
KK

Are you happy with them?

Now let's do that with the reverse complement strand:

$ transeq asis:TTTTTTTT -filter -frame 1
>asis_1
FFX
$ transeq asis:TTTTTTTT -filter -frame 2
>asis_2
FFX
$ transeq asis:TTTTTTTT -filter -frame 3
>asis_3
FF

Now let's do that with the original sequence but the negative frames:

$ transeq asis:AAAAAAAA -filter -frame -3
>asis_6
FFX
$ transeq asis:AAAAAAAA -filter -frame -2
>asis_5
FFX
$ transeq asis:AAAAAAAA -filter -frame -1
>asis_4
FF

Same results - perhaps the naming isn't as you expected?
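
To make the two reverse-frame conventions concrete, here is a minimal Python
sketch. It is not EMBOSS code: the helper names (translate_from,
reverse_from_right, reverse_shared_codons) are made up for illustration, and
the rule "any trailing partial codon becomes X" is only inferred from the
output shown in this thread. The test sequence is assumed to be eight A's, as
the name 8Achars suggests.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate_from(seq, offset):
    """Translate from a 0-based offset; a trailing partial codon gives X."""
    out = []
    for i in range(offset, len(seq), 3):
        codon = seq[i:i + 3]
        out.append(TABLE.get(codon, "X") if len(codon) == 3 else "X")
    return "".join(out)


def reverse_from_right(seq, k):
    """Convention A: reverse-complement first, then take forward frame k."""
    return translate_from(revcomp(seq), k - 1)


def reverse_shared_codons(seq, k):
    """Convention B (assumed to be what -frame -k does): reuse the codon
    boundaries of forward frame k, read on the opposite strand."""
    skip = (len(seq) - (k - 1)) % 3  # partial codon at the right-hand end
    return translate_from(revcomp(seq), skip)


seq = "AAAAAAAA"  # assumed content of the 8Achars test case
for k in (1, 2, 3):
    print(k, translate_from(seq, k - 1),
          reverse_from_right(seq, k), reverse_shared_codons(seq, k))
```

For this 8-base input, convention A gives FFX, FFX, FF for k = 1, 2, 3, while
convention B gives FF, FFX, FFX. The peptides are the same set; only the
frame labels differ, and how they line up depends on the sequence length
modulo 3.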

Peter
___
EMBOSS mailing list
EMBOSS@lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/emboss


Re: [EMBOSS] Transeq question, frame phases

2011-02-16 Thread David Mathog
Here is another worked example with a small but real mRNA fragment.
(Best cut and paste it into a program with a fixed width font).

Test sequence:

for  (AKA gi|1728|emb|V00893.1, this is + direction)
TCGCCGGGCCATGAAGGATGAGGAGAAGATGGAGCTGCA
GGAGATGCAGCTGAAGGAGGCCAAGCACATTGCCGAGGACTCA
GACCGCAAATACGAGGAGGTGGCCAGGAAGCTGGTGATCCTCGA

rev (reverse complement of for)
TCGAGGATCACCAGCTTCCTGGCCACCTCCTCGTATTTGCGGT
CTGAGTCCTCGGCAATGTGCTTGGCCTCCTTCAGCTGCATCTC
CTGCAGCTCCATCTTCTCCTCATCCTTCATGGCCCGGCGA

Transeq output, all 6 frames, for both "for" and "rev":
>for_1
SKTGP*RMRRRWSCRRCS*RRPSTLPRTQTANTRRWPGSW*SSX
>for_2
RKPGHEG*GEDGAAGDAAEGGQAHCRGLRPQIRGGGQEAGDPR
>for_3
ENRAMKDEEKMELQEMQLKEAKHIAEDSDRKYEEVARKLVILX
>for_4
RGSPASWPPPRICGLSPRQCAWPPSAASPAAPSSPHPSWPGFR
>for_5
SRITSFLATSSYLRSESSAMCLASFSCISCSSIFSSSFMARFSX
>for_6
EDHQLPGHLLVFAV*VLGNVLGLLQLHLLQLHLLLILHGPVFX
>rev_1
SRITSFLATSSYLRSESSAMCLASFSCISCSSIFSSSFMARFSX
>rev_2
RGSPASWPPPRICGLSPRQCAWPPSAASPAAPSSPHPSWPGFR
>rev_3
EDHQLPGHLLVFAV*VLGNVLGLLQLHLLQLHLLLILHGPVFX
>rev_4
RKPGHEG*GEDGAAGDAAEGGQAHCRGLRPQIRGGGQEAGDPR
>rev_5
SKTGP*RMRRRWSCRRCS*RRPSTLPRTQTANTRRWPGSW*SSX
>rev_6
ENRAMKDEEKMELQEMQLKEAKHIAEDSDRKYEEVARKLVILX

Output from a different program, showing all 12 frame options. Each frame
is labelled on the FASTA header line as:

  phase(strand)

Positive phases are measured from sequence position 1. Negative phases are
measured from sequence position N, the last base in the sequence. This
program differs from transeq in that any partial codon is emitted as an X.
Note how transeq output never starts with an X, whereas here the X keeps
its position relative to the nucleic acid sequence; compare, for instance,
+1(+) and +1(-).
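
The phase(strand) labelling can be sketched in Python as below. This is only
a guess at the rule, inferred from where the X appears in the listing that
follows; the other program's source is not shown in this thread, and the
function name phased and its arguments are hypothetical. The assumption is
that a positive phase p anchors codons at position p, a negative phase -p
anchors them at position N-p+1, bases outside that region are ignored, and
any leftover bases at the far end of the region become a single X that stays
at that end of the nucleic acid sequence on either strand.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def phased(seq, phase, strand):
    """Translate one of the 12 phase(strand) frames (assumed rule).

    phase  : +1..+3 counted from position 1, -1..-3 counted from position N
    strand : "+" or "-"
    """
    if phase > 0:
        region = seq[phase - 1:]            # codons anchored at the left
        extra = len(region) % 3
        core = region[:len(region) - extra]
        partial_at_left = False             # leftover bases sit at the right
    else:
        region = seq[:len(seq) + phase + 1]  # codons anchored at the right
        extra = len(region) % 3
        core = region[extra:]
        partial_at_left = True              # leftover bases sit at the left
    if strand == "-":
        core = revcomp(core)
        partial_at_left = not partial_at_left  # ends swap on the other strand
    protein = "".join(TABLE.get(core[i:i + 3], "X")
                      for i in range(0, len(core), 3))
    if extra:
        protein = "X" + protein if partial_at_left else protein + "X"
    return protein


demo = "ATGAAATTTG"  # made-up 10-base sequence, not from the thread
for strand in "+-":
    for phase in (1, 2, 3, -1, -2, -3):
        print(f"{phase:+d}({strand})", phased(demo, phase, strand))
```

With a length of 3k+1, as in the made-up demo sequence, +1(+) ends in X while
+1(-) starts with X, and -1(+) starts with X while -1(-) ends with it, which
is the same pattern as in the listing below.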

>gi|1728|emb|V00893.1|[+1(+)]
SKTGP*RMRRRWSCRRCS*RRPSTLPRTQTANTRRWPGSW*SSX
>gi|1728|emb|V00893.1|[+2(+)]
RKPGHEG*GEDGAAGDAAEGGQAHCRGLRPQIRGGGQEAGDPR
>gi|1728|emb|V00893.1|[+3(+)]
ENRAMKDEEKMELQEMQLKEAKHIAEDSDRKYEEVARKLVILX
>gi|1728|emb|V00893.1|[+1(-)]
XRGSPASWPPPRICGLSPRQCAWPPSAASPAAPSSPHPSWPGFR
>gi|1728|emb|V00893.1|[+2(-)]
SRITSFLATSSYLRSESSAMCLASFSCISCSSIFSSSFMARFS
>gi|1728|emb|V00893.1|[+3(-)]
XEDHQLPGHLLVFAV*VLGNVLGLLQLHLLQLHLLLILHGPVF
>gi|1728|emb|V00893.1|[-1(-)]
SRITSFLATSSYLRSESSAMCLASFSCISCSSIFSSSFMARFSX
>gi|1728|emb|V00893.1|[-2(-)]
RGSPASWPPPRICGLSPRQCAWPPSAASPAAPSSPHPSWPGFR
>gi|1728|emb|V00893.1|[-3(-)]
EDHQLPGHLLVFAV*VLGNVLGLLQLHLLQLHLLLILHGPVFX
>gi|1728|emb|V00893.1|[-1(+)]
XRKPGHEG*GEDGAAGDAAEGGQAHCRGLRPQIRGGGQEAGDPR
>gi|1728|emb|V00893.1|[-2(+)]
SKTGP*RMRRRWSCRRCS*RRPSTLPRTQTANTRRWPGSW*SS
>gi|1728|emb|V00893.1|[-3(+)]
XENRAMKDEEKMELQEMQLKEAKHIAEDSDRKYEEVARKLVIL
gi|1728|emb|V00893.1| 

Regards,

David Mathog
mat...@caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech