Re: [ccp4bb] sequence format conversion

2012-05-08 Thread William Kennedy
K-
One bioinformatics tool that converts nucleotide sequences between many 
formats is Biology Workbench, out of UCSD.

Dexter Kennedy, MD
Thumbed from my iPhone

On May 8, 2012, at 12:02 AM, K Singh  wrote:

> Dear All
> I was looking for a script or an informatics tool enabling me to
> change the sequence from FASTA format to something like following:
> 
>> FASTA FORMAT
> abcdefghijklmnopqrstuvwxyz
> 
> to
> 
>  1  abcde fghij
> 11  klmno pqrst
> 21  uvwxy z
> 
> 
> Many thanks in advance
> 
> Regards
> Kris


Re: [ccp4bb] sequence format conversion

2012-05-08 Thread Gerard DVD Kleywegt

> A good tool should leave "b" as is: it is ASX (the standard ambiguity
> code for ASP or ASN). "j", "o" and "u" are a different matter :-)


http://www.uniprot.org/manual/non_std

"Selenocyteine [sic!] and pyrrolysine are represented in the sequence using 
the one-letter codes U for selenocysteine and O for pyrrolysine"
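As a rough illustration of the letter codes discussed here, the following shell sketch sorts the letters of a sequence into four bins: the 20 standard one-letter amino-acid codes, the ambiguity codes B (Asp or Asn), Z (Glu or Gln) and X (unknown), the codes U and O quoted above, and everything else. The function name classify_residues and the output labels are invented for this example and do not come from any package.

```shell
#!/bin/sh
# classify_residues: hypothetical helper for illustration only.
# Sorts the letters of the sequence given as $1 into four bins.
classify_residues() {
    printf '%s\n' "$1" | tr 'a-z' 'A-Z' | awk '
        {
            n = length($0)
            for (i = 1; i <= n; i++) {
                c = substr($0, i, 1)
                if (index("ACDEFGHIKLMNPQRSTVWY", c)) std = std c   # 20 standard residues
                else if (index("BZX", c))             amb = amb c   # B = Asx, Z = Glx, X = unknown
                else if (index("UO", c))              sp  = sp c    # U = Sec, O = Pyl
                else                                  oth = oth c   # anything else
            }
        }
        END {
            print "standard:  " std
            print "ambiguity: " amb
            print "U/O:       " sp
            print "other:     " oth
        }
    '
}
```

Called as `classify_residues abcdefghijklmnopqrstuvwxyz`, this should report ACDEFGHIKLMNPQRSTVWY as standard, BXZ as ambiguity codes, OU in the U/O bin, and J as the only letter left over.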


--Gerard

**
   Gerard J. Kleywegt

  http://xray.bmc.uu.se/gerard   mailto:ger...@xray.bmc.uu.se
**
   The opinions in this message are fictional.  Any similarity
   to actual opinions, living or dead, is purely coincidental.
**
   Little known gastromathematical curiosity: let "z" be the
   radius and "a" the thickness of a pizza. Then the volume
   of that pizza is equal to pi*z*z*a !
**


Re: [ccp4bb] sequence format conversion

2012-05-08 Thread Peter Keller
On Tue, 2012-05-08 at 09:22 +0100, Marko Hyvonen wrote:

> PS. fasta format needs ">" as a first line with (optional) description in 
> the input file. And not sure what amino acids "b" and "j" would get 
> converted to :-)

A good tool should leave "b" as is: it is ASX (the standard ambiguity
code for ASP or ASN). "j", "o" and "u" are a different matter :-)

Regards,
Peter.

-- 
Peter Keller Tel.: +44 (0)1223 353033
Global Phasing Ltd., Fax.: +44 (0)1223 366889
Sheraton House,
Castle Park,
Cambridge CB3 0AX
United Kingdom


Re: [ccp4bb] sequence format conversion

2012-05-08 Thread Manish Chandra Pathak


I guess you are looking for ReadSeq, with GenBank|gb set as the output format.


http://www.ebi.ac.uk/cgi-bin/readseq.cgi 


However, I have no idea how to get only 10 residues per line, if that is a specific requirement.


best
Manish






Re: [ccp4bb] sequence format conversion

2012-05-08 Thread Tim Gruene
Dear Kris,

except for formatting, something like this:
 sed -e "s/\(.\{5\}\)\(.\{5\}\)/ \1 \2\n/g" test.fasta | awk '{count =
(10*LN +1); print count, $0; ++LN}'

(all on one line)

should do the job.
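
A slightly fuller sketch of the same idea in plain awk, under a few assumptions that are not in the original post: the input holds a single record, lines starting with ">" are header lines to be skipped, the sequence may be wrapped over several lines, and the residue number is right-aligned in a three-character field (so the alignment would drift for sequences longer than 999 residues). The function name fasta_to_blocks is made up for illustration.

```shell
#!/bin/sh
# fasta_to_blocks: hypothetical helper, not part of any package.
# Reads FASTA on stdin (or from files given as arguments) and prints
# 10 residues per line as two blocks of 5, prefixed by the residue number.
fasta_to_blocks() {
    awk '
        /^>/ { next }                            # skip FASTA header lines
        { gsub(/[ \t\r]/, ""); seq = seq $0 }    # join wrapped sequence lines
        END {
            for (i = 1; i <= length(seq); i += 10) {
                first  = substr(seq, i, 5)       # residues i .. i+4
                second = substr(seq, i + 5, 5)   # residues i+5 .. i+9, may be empty
                line = sprintf("%3d  %s", i, first)
                if (second != "") line = line " " second
                print line
            }
        }
    ' "$@"
}
```

It can be run as `fasta_to_blocks test.fasta`, or fed from a pipe, e.g. `printf '>test\nabcdefghijklmnopqrstuvwxyz\n' | fasta_to_blocks`. Unlike the one-liner, it also splits a trailing group shorter than ten residues.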


Cheers, Tim


-- 
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A



Re: [ccp4bb] sequence format conversion

2012-05-08 Thread Marko Hyvonen

Surely sequence analysis tools are the easiest way to do it.

I'd recommend EMBOSS (open source and runs nicely on most platforms - the 
"ccp4" of sequence analysis for me at least) 
http://emboss.sourceforge.net/


Seqret (SEQuence RETurn) program:

seqret -out test.seq -osformat gcg test.fasta

Marko

PS. FASTA format needs a first line starting with ">", followed by an 
(optional) description, in the input file. And I am not sure what amino 
acids "b" and "j" would get converted to :-)



 _

 Marko Hyvonen
 Department of Biochemistry, University of Cambridge
 ma...@cryst.bioc.cam.ac.uk
 http://www-cryst.bioc.cam.ac.uk/groups/hyvonen
 tel:+44-(0)1223-766 044
 mobile: +44-(0)7796-174 877
 fax:+44-(0)1223-766 002
 --


Re: [ccp4bb] sequence format conversion

2012-05-08 Thread Francois Berenger

More seriously, there is the babel command from Open Babel
in case the second format you show has a known name.



Re: [ccp4bb] sequence format conversion

2012-05-08 Thread Francois Berenger

Hello,

The tool is called awk.
There is also another tool called Perl, but I won't recommend it.

Regards,
F.
