Hi,
I have some questions that I am unable to figure out.
Let's say I have a file named peaks.txt:
Chr1 7 9 4.5 5.5
chr10 6 9 3.5 4.5
chr1 10 6 2.5 4.4
My first question is: how can I sort the file so that it looks like this?
Chr1 7 9 4.5 5.5
chr1 10 6 2.5 4.4
chr10 6 9 3.5 4.5
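One possible way to get this ordering, as a rough Python sketch: compare the chromosome labels case-insensitively with their digits treated as numbers, then break ties on the second column. The function names chrom_key and sort_peak_lines are made up for illustration, and the tie-break on the second column is an assumption read off the example above.

```python
import re


def chrom_key(name):
    """Split a label such as 'chr10' into ['chr', 10, ''] (case-insensitive),
    so that chr2 sorts before chr10."""
    parts = re.split(r"(\d+)", name.lower())
    return [int(p) if p.isdigit() else p for p in parts]


def sort_peak_lines(lines):
    """Return the non-empty lines sorted by chromosome, then by column 2."""
    rows = [line.split() for line in lines if line.strip()]
    # Assumption: ties within a chromosome are ordered by the second column.
    rows.sort(key=lambda r: (chrom_key(r[0]), int(r[1])))
    return [" ".join(r) for r in rows]


sample = ["Chr1 7 9 4.5 5.5", "chr10 6 9 3.5 4.5", "chr1 10 6 2.5 4.4"]
for line in sort_peak_lines(sample):
    print(line)
```

Since sort_peak_lines accepts any iterable of strings, an open file object for peaks.txt could be passed in place of the sample list.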
Next, how do I extract the p-values (the values in the second and third
columns, which I highlighted in red)? For example, the p-values for chr1
are 6, 7, 9 and 10, and those for chr10 are 6 and 9.
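A minimal sketch of that extraction in Python, assuming the p-values are the second and third whitespace-separated fields and that "Chr1" and "chr1" name the same chromosome. The name collect_positions is hypothetical.

```python
from collections import defaultdict


def collect_positions(lines):
    """Map each chromosome (lower-cased) to its sorted, de-duplicated values
    from columns 2 and 3."""
    positions = defaultdict(set)
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or short lines
        # Assumption: chromosome names are compared case-insensitively.
        positions[fields[0].lower()].update(int(v) for v in fields[1:3])
    return {chrom: sorted(vals) for chrom, vals in positions.items()}


sample = ["Chr1 7 9 4.5 5.5", "chr10 6 9 3.5 4.5", "chr1 10 6 2.5 4.4"]
print(collect_positions(sample))
```

Using a set per chromosome means a value that appears on several lines is reported only once.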
Then, for example, if the p-value is 7 from chr1, I would open a file called
chr1.fa, which looks like this:
>chr1
ATTGTACT
ATTTGTAT
ATTCGTCA
and I would extract the subsequence TACTA. Basically, the p-value (7 in this
case) is a position counted from the start of the second line of chr1.fa, and
I want to print the subsequence from position 7-d to position 7+d, where d=2.
Likewise, if the p-value is taken from chr10, we read from a file named
chr10.fa, which can look like this:
>chr10
TTAGTACT
GTACTAGT
ACGTATTT
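The window extraction described above could be sketched in Python as follows, assuming positions are counted from 1 across all sequence lines joined together, and that the first line of each .fa file is a header to skip. The names read_sequence and window are illustrative only.

```python
def read_sequence(lines):
    """Join every line after the first (header) line into one string."""
    return "".join(line.strip() for line in lines[1:])


def window(seq, pos, d=2):
    """Return the bases from pos-d to pos+d inclusive, with pos counted
    from 1. Near the start or end of seq the window is simply shorter."""
    start = max(pos - d - 1, 0)  # convert to a 0-based slice start
    return seq[start:pos + d]


chr1_lines = [">chr1", "ATTGTACT", "ATTTGTAT", "ATTCGTCA"]
seq = read_sequence(chr1_lines)
print(window(seq, 7))  # prints TACTA
```

For a real file, the list of lines could come from open("chr1.fa").read().splitlines().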
So the question is: how do I do this for all the p-values (i.e. all the
p-values from chr1 and all the p-values from chr10) when we don't know in
advance how many lines peaks.txt has?
And how do I write the results to a file in the following format?
Chr1
peak value 6: GTACT
peak value 7: TACTA
and so on for all the p-values of chr1
chr10
peak value 6: GTACT
and so on...
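Putting the steps together, a report in roughly this format might be produced by the Python sketch below. It rests on several assumptions: the p-values are columns 2 and 3, chromosome names are lower-cased (so the heading reads chr1 rather than Chr1), each chromosome has a file named <chromosome>.fa in a single directory, and positions are counted from 1. The function names and the fasta_dir parameter are hypothetical.

```python
import re
from collections import defaultdict
from pathlib import Path


def chrom_order(name):
    """Sort key so that chr1 < chr2 < chr10 (digits compared as numbers)."""
    return [int(p) if p.isdigit() else p for p in re.split(r"(\d+)", name)]


def write_report(peaks_path, fasta_dir, out_path, d=2):
    # Read peaks line by line, so the number of lines does not matter.
    positions = defaultdict(set)
    with open(peaks_path) as peaks:
        for line in peaks:
            fields = line.split()
            if len(fields) >= 3:
                # Assumption: columns 2 and 3 hold the positions.
                positions[fields[0].lower()].update(int(v) for v in fields[1:3])

    with open(out_path, "w") as out:
        for chrom in sorted(positions, key=chrom_order):
            # Assumed naming scheme: chr1 -> <fasta_dir>/chr1.fa
            fasta_lines = (Path(fasta_dir) / f"{chrom}.fa").read_text().splitlines()
            # Skip the header line and join the rest into one sequence.
            seq = "".join(part.strip() for part in fasta_lines[1:])
            out.write(chrom + "\n")
            for pos in sorted(positions[chrom]):
                start = max(pos - d - 1, 0)  # 1-based position to slice start
                out.write(f"peak value {pos}: {seq[start:pos + d]}\n")
```

A call such as write_report("peaks.txt", ".", "output.txt") would then look for chr1.fa and chr10.fa in the current directory. A missing .fa file raises FileNotFoundError, which may or may not be the behaviour wanted.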
Thanks for the help,
Angeline