On 06-08-2009 at 01:54:46, PeroMHC <macma...@gmail.com> wrote:

This snippet represents 3 individual DNA sequences. Each sequence is
identified by the line starting with >
The complete file has about 10 million individual sequences.

A simple enough problem: I want to read in this data, cut out the
last 76 letters (nucleotides) from each individual sequence, and send
them to a new txt file with a similar format.

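If I understand correctly, you want something like this (assuming each
sequence sits on a single line below its header):


with open(path_to_the_input_file) as fasta:
    # the output must be a different file, opened for writing
    with open(path_to_the_output_file, 'w') as nucleotides:
        for line in fasta:
            if line.startswith('>'):
                continue  # header line; a placeholder header is written below
            seq = line.rstrip()  # drop the trailing newline before slicing
            nucleotides.write('> foo bar length=76\n')
            nucleotides.write(seq[-76:] + '\n')  # slice, not a single index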
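
If the sequences in the real file are wrapped over several lines (the
snippet does not say either way), iterating line by line is not enough,
because the last 76 nucleotides may span more than one line. A minimal
sketch of one way to handle that, written for Python 3, follows. The
names read_fasta and write_tails are made up for illustration, and the
original header is copied unchanged, so a length field in it, if there
is one, would not be updated:

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-style file object.

    Lines of one record are joined, so wrapped sequences are handled.
    """
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith('>'):
            if header is not None:
                yield header, ''.join(chunks)  # emit the previous record
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)  # sequence data for the current record
    if header is not None:
        yield header, ''.join(chunks)  # emit the last record


def write_tails(src, dst, n=76):
    """Write each record's last n letters to dst (n assumed positive)."""
    for header, seq in read_fasta(src):
        dst.write('>%s\n%s\n' % (header, seq[-n:]))


if __name__ == '__main__':
    # Tiny in-memory demo; real use would pass two open file objects.
    demo_in = io.StringIO('>seq1\nACGT\nTTGA\n>seq2\nGG\n')
    demo_out = io.StringIO()
    write_tails(demo_in, demo_out, n=3)
    print(demo_out.getvalue(), end='')
```

Since read_fasta is a generator, only one record is held in memory at a
time, which matters with about 10 million sequences.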


Cheers,
*j

--
Jan Kaliszewski (zuo) <z...@chopin.edu.pl>
--
http://mail.python.org/mailman/listinfo/python-list
