On 03/21/2014 10:39 PM, Cameron Simpson wrote:
On 21Mar2014 20:31, Mustafa Musameh <jmm...@yahoo.com> wrote:
Please help. I have been searching the internet to understand how to write a 
simple program/script with Python, but I have not managed to get anything working.
I have a file that looks like this:
ID 1
agtcgtacgt…
ID 2
attttaaaaggggcccttcc
.
.
.
In other words, it contains several IDs, each of which has a sequence of 'acgt' 
letters.
I need to write a script in Python where the output will be, for example, like 
this:
ID 1
a = 10%, c = 40%, g = 40%, t = 10%
ID 2
a = 15%, c = 35%, g = 35%, t = 15%
.
.
.
(I mean the first line is the ID and the second line is the frequency of each 
letter.)
How can I tell Python to print the first line as it is and count characters 
starting from the second line until the beginning of the next '>', and so on?

You want a loop that reads lines in pairs. Example:

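   # fp is a file object already opened for reading
   while True:
     line1 = fp.readline()
     if not line1:
       break  # readline() returns an empty string at end of file
     print(line1, end="")  # the ID line, printed as it is
     line2 = fp.readline()
     ... process the line and report ...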

Then to process the line, iterate over the line. Because a line is a
string, and a string is a sequence of characters, you can write:

   for c in line2:
     ... collect statistics about c ...
   ... print report ...

I would collect the statistics using a dictionary to keep count of
the characters. See the dict.setdefault method; it should be helpful.
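
For what it's worth, here is a minimal sketch of that counting step, assuming
Python 3. The function names base_percentages and format_report are made up
for this example, and the whole-number percentage format is a guess at what
the report should look like.

```python
def base_percentages(sequence):
    """Map each character of one sequence line to its percentage of the line."""
    counts = {}
    for c in sequence.strip():      # strip() drops the trailing newline
        counts.setdefault(c, 0)     # start the count at 0 the first time c is seen
        counts[c] += 1
    total = sum(counts.values())
    return {c: 100.0 * n / total for c, n in counts.items()}


def format_report(percentages):
    """Format the percentages as 'a = 25%, c = 25%, ...' in alphabetical order."""
    return ", ".join("%s = %.0f%%" % (c, percentages[c])
                     for c in sorted(percentages))


print(format_report(base_percentages("attttaaaaggggcccttcc\n")))
# prints: a = 25%, c = 25%, g = 20%, t = 30%
```

setdefault only inserts the key when it is missing, so the increment on the
next line never raises a KeyError. collections.Counter from the standard
library would do the same counting in one call, if you prefer that.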

I think it would be easier to do that in 2 loops:
* first read the file line by line, building a list of pairs (id, base-sequence)
  (and on the fly check the input is correct, if needed)
* then traverse the sequences of bases to get numbers & percentages, and write 
out
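
A rough sketch of those two loops, assuming Python 3. The names read_records
and report, the header_prefix parameter, and the sample data are all invented
for illustration; I am also assuming that a header line starts with "ID" (as
in the sample file) and that a sequence may continue over several lines until
the next header.

```python
import io

BASES = "acgt"


def read_records(lines, header_prefix="ID"):
    """First loop: build a list of (id, base-sequence) pairs, checking input."""
    records = []                          # each entry is [id_line, sequence_so_far]
    for line in lines:
        line = line.strip()
        if not line:
            continue                      # ignore blank lines
        if line.startswith(header_prefix):
            records.append([line, ""])
        elif not records:
            raise ValueError("sequence found before any ID line: %r" % line)
        elif set(line) - set(BASES):
            raise ValueError("unexpected character in %r" % line)
        else:
            records[-1][1] += line        # sequence may span several lines
    return [tuple(r) for r in records]


def report(records):
    """Second loop: turn each pair into an ID line and a percentages line."""
    out = []
    for ident, seq in records:
        out.append(ident)
        if not seq:
            out.append("(no bases)")
            continue
        out.append(", ".join("%s = %.0f%%" % (b, 100.0 * seq.count(b) / len(seq))
                             for b in BASES))
    return out


# Made-up sample input; a real script would pass an open file instead.
sample = io.StringIO("ID 1\naacc\nggtt\nID 2\nattttaaaaggggcccttcc\n")
for text in report(read_records(sample)):
    print(text)
```

Keeping the reading and the counting apart means the input check lives in one
place, and the second loop only ever sees clean (id, sequence) pairs.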

d

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
