Hi,

I have some questions that I have been unable to figure out.

Let's say I have a file named peaks.txt:

Chr1    7       9       4.5     5.5
chr10   6       9       3.5     4.5
chr1    10      6       2.5     4.4

My first question is: how can I sort the file so that it looks like this?

Chr1    7       9       4.5     5.5
chr1    10      6       2.5     4.4
chr10   6       9       3.5     4.5

Next, how do I extract the p-values (highlighted in red in my original
message, i.e. the second and third columns)?

Once all the p-values are extracted, the p-values from chr1 are, for example,
6, 7, 9 and 10, and those from chr10 are 6 and 9.
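
A minimal sketch of that grouping step, assuming the p-values are the second
and third whitespace-separated columns and that "Chr1" and "chr1" mean the
same chromosome; collect_positions is a made-up helper name.

```python
from collections import defaultdict

def collect_positions(lines):
    """Map each chromosome name to a sorted list of its p-values."""
    positions = defaultdict(set)
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        # Assumption: names differ only in letter case, so lowercase them.
        chrom = fields[0].lower()
        positions[chrom].update(int(value) for value in fields[1:3])
    return {chrom: sorted(values) for chrom, values in positions.items()}

sample = [
    "Chr1    7       9          4.5         5.5",
    "chr10   6       9          3.5         4.5",
    "chr1     10     6          2.5         4.4",
]

positions = collect_positions(sample)
print(positions)
```

Because the function only loops over lines, an open file object can be passed
in directly, so the number of lines in peaks.txt does not need to be known.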

So, for example, if the p-value is 7 from chr1, I would open a file called
chr1.fa, which looks like this:

>chr1
ATTGTACT
ATTTGTAT
ATTCGTCA

and I would extract the subsequence TACTA. Basically, the p-value (7 in this
case) is a position counted from the second line of the chr1.fa file, and I
print the subsequence running from position 7-d to position 7+d, where d=2.
Likewise, if the p-value is taken from chr10, then we read from a file named
chr10.fa, which can look like this:

>chr10
TTAGTACT
GTACTAGT
ACGTATTT
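
The extraction could be sketched roughly like this, assuming positions are
1-based and that all the sequence lines after the header are joined into one
long string. The names read_sequence and window are only illustrative.

```python
def read_sequence(fasta_lines):
    """Join all sequence lines, skipping the header line."""
    rows = [line.strip() for line in fasta_lines if line.strip()]
    # Assumption: the first non-empty line is always the header.
    return "".join(rows[1:])

def window(sequence, position, d=2):
    """Return the bases from position-d to position+d (1-based, inclusive)."""
    start = max(position - d - 1, 0)  # convert to a 0-based slice start
    return sequence[start:position + d]

chr1_lines = [">chr1", "ATTGTACT", "ATTTGTAT", "ATTCGTCA"]
sequence = read_sequence(chr1_lines)
print(window(sequence, 7))
```

For a real file the lines would come from open("chr1.fa"), with the file name
built from the chromosome name, e.g. chrom + ".fa".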

So the question is: how do I do this for all the p-values (i.e. all the
p-values from chr1 and all the p-values from chr10) when we don't know how
many lines peaks.txt has?

And how do I write the results to a file in the following format?

Chr1

peak value 6: TTGTA

peak value 7: TACTA

and so on for all the p-values of chr1

chr10

peak value 7: TTACT

and so on...
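
A rough sketch of writing that layout, assuming the p-values and sequences
have already been collected into dictionaries keyed by chromosome name. The
function name write_report is hypothetical. The fragments follow the
position-d to position+d rule with 1-based positions, so they may not match
the illustrative strings above letter for letter.

```python
import io

def write_report(out, positions, sequences, d=2):
    """Write one heading per chromosome, then one line per p-value."""
    # Assumption: names are "chr" + digits, so sort on the numeric part.
    for chrom in sorted(positions, key=lambda name: int(name[3:])):
        out.write(chrom + "\n\n")
        for position in positions[chrom]:
            start = max(position - d - 1, 0)  # 1-based position to slice start
            fragment = sequences[chrom][start:position + d]
            out.write("peak value %d: %s\n\n" % (position, fragment))

positions = {"chr10": [7], "chr1": [6, 7]}
sequences = {
    "chr1": "ATTGTACTATTTGTATATTCGTCA",
    "chr10": "TTAGTACTGTACTAGTACGTATTT",
}

buffer = io.StringIO()  # stands in for a real output file
write_report(buffer, positions, sequences)
report = buffer.getvalue()
print(report)
```

To write to disk, the StringIO buffer can be swapped for a file opened for
writing, e.g. open("output.txt", "w"), where output.txt is just an example
name.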


Thanks for the help,
Angeline

_______________________________________________
Tutor maillist  -  Tutor@python.org