[Tutor] looping problem
Hi,

The reason could be that I have not quite understood the concept of looping. I have a list of 48 elements, and I want to create two more lists, listA and listB. I want to loop through the 48-element list and put the elements at indexes 0, 3, 6, 9, 12, etc. into listA, and the elements at indexes 2, 5, 8, 11, etc. into listB.

Could anyone help me with how to do that?

Thank you.

__
Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
___
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] looping problem
Keep a counter in your loop. Is this a homework question?

On Sep 23, 2006, at 8:34 AM, kumar s wrote:
> I want to loop through the list with 48 elements and select element
> with index 0,3,6,9,12 ..etc into listA select elements with index
> 2,5,8,11 etc into listB.
Re: [Tutor] looping problem
Hi,

Thank you. This is not a homework question.

I have a very large file of FASTA sequences:

    GeneName \t AATTAAGGAA.. (1000 lines) AATAAGGA
    GeneName \t GGAGAGAGATTAAGAA (15000 lines)

When I read it like this:

    f2 = open('myfile', 'r')
    dat = f2.read().split('\n')

it turned out to be very expensive on my computer. Instead I tried this:

    dat = f2.read()

(Reading a jumbo file of 19,100,442,1342 lines is easy, but getting at what I want is a problem.)

I want to create a dictionary with 'GeneName' as the key and the sequence of ATGC characters as the value:

    biglist = dat.split('\t')
    ['GeneName ','','ATTAAGGCCAA'...]

Now I want to select 'GeneName ' into listA and 'ATTAAGGCCAA' into listB, so I want to put elements 0, 3, 6, 9 into listA and elements 2, 5, 8, 11 and so on into listB. Then I can do:

    dict(zip(listA,listB))

However, the whole concept of loops goes blank in my brain when I try to write this:

    for j in range(len(biglist)):

From here I cannot think of anything. Maybe it is just a mental block; that is the reason I am seeking help on the forum.

Thanks

--- jim stockford [EMAIL PROTECTED] wrote:
> keep a counter in your loop. is this a homework question?
Re: [Tutor] looping problem
kumar s wrote:
> [snip]
> so I want to select 0,3,6,9 elements into listA
> and 2,5,8,11 and so on elements into listB

Here's a hint:

    for j in range(0, len(biglist), 3):  # this will set j = 0, 3, 6, etc.

--
Bob Gailer
510-978-4454
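The hint above can be completed along these lines. This is only a minimal sketch: it assumes `biglist` holds name / empty / sequence triples as described in the question, and the sample values are made up for illustration.

```python
# Made-up sample data in the name / empty / sequence layout from the question.
biglist = ["GeneA", "", "ATTAAGG", "GeneB", "", "GGAGAGA", "GeneC", "", "CCATTGA"]

listA = []  # will hold the elements at indexes 0, 3, 6, ...
listB = []  # will hold the elements at indexes 2, 5, 8, ...

# Stopping at len(biglist) - 2 keeps j + 2 a valid index, so an
# incomplete group at the end of the list is skipped rather than
# raising an IndexError.
for j in range(0, len(biglist) - 2, 3):
    listA.append(biglist[j])
    listB.append(biglist[j + 2])

print(listA)
print(listB)
```

Stepping by 3 means the loop body runs once per group, so both lists can be filled in a single pass.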
Re: [Tutor] looping problem
On Sat, 2006-09-23 at 09:03 -0700, kumar s wrote:
> [snip]
> so I want to select 0,3,6,9 elements into listA and 2,5,8,11 and so on
> elements into listB then I can do dict(zip(listA,listB))
>
> however, the very loops concept is getting blanked out in my brain when
> I want to do this:
>
> for j in range(len(biglist)):
>
> from here .. I cannot think anything..

Slices may be the best way to go:

    listA = biglist[0::3]  # start from index 0, taking every third element
    listB = biglist[2::3]  # start from index 2, taking every third element

--
Lloyd Kvam
Venix Corp
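Put together with the `dict(zip(listA, listB))` step from the question, the slice approach might look like the sketch below. The sample data is invented, and stripping the trailing space from each name is an assumption based on the `'GeneName '` element shown in the question.

```python
# Made-up sample data; note the trailing space after each name,
# as in the 'GeneName ' element shown in the question.
biglist = ["GeneA ", "", "ATTAAGG", "GeneB ", "", "GGAGAGA"]

listA = biglist[0::3]  # indexes 0, 3, 6, ...
listB = biglist[2::3]  # indexes 2, 5, 8, ...

# Pair each name with its sequence, stripping whitespace from the name.
genes = dict((name.strip(), seq) for name, seq in zip(listA, listB))

print(genes)
```

`zip` stops at the shorter of the two lists, so a trailing name without a sequence is silently dropped.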
Re: [Tutor] looping problem
    #!/usr/bin/python
    # or whatever is the absolute path to python on your system

    counter = 0
    for i in "a", "b", "c", "d", "e", "f", "g":
        # counter % 3 cycles through 0, 1, 2 as the loop advances
        if counter % 3 == 0:
            print("%s list one %d %d" % (i, counter, counter % 3))
        if counter % 3 == 1:
            print("%s not used %d %d" % (i, counter, counter % 3))
        if counter % 3 == 2:
            print("%s list two %d %d" % (i, counter, counter % 3))
        counter += 1  # advance the counter on every pass
    print("done")

On Sep 23, 2006, at 9:03 AM, kumar s wrote:
> [snip]
> for j in range(len(biglist)):
>
> from here .. I cannot think anything..
Re: [Tutor] looping problem
kumar s wrote:
> hi, thank you. this is not a homework question.
>
> I have a very huge file of fasta sequence.
>
> I want to create a dictionary where 'GeneName' as key and sequence of
> ATGC characters as value
> [snip]

Lloyd has pointed you to slicing as the answer to your immediate question. However, for the larger question of reading FASTA files, you might want to look at CoreBio. This is a new library of Python modules for computational biology that looks pretty good.

http://code.google.com/p/corebio/

CoreBio has built-in support for reading FASTA files into Seq objects. For example:

    In [1]: import corebio.seq_io

    In [2]: f=open(r'F:\Bio\BIOE48~1\KENTJO~1\SEQUEN~2\fasta\GI5082~1.FAS')

    In [3]: seqs = corebio.seq_io.read(f)

seqs is now a list of Seq objects, one for each sequence in the original file. In this case there is only one sequence, but it will work for your file also.

    In [4]: for seq in seqs:
       ...:     print seq.name
       ...:     print seq
       ...:
       ...:
    gi|50826|emb|CAA28242.1|
    MIRTLLLSALVAGALSCGYPTYEVEDDVSRVVGGQEATPNTWPWQVSLQVLSSGRWRHNCGGSLVANNWVLTAAHCLSNYQTYRVLLGAHSLSNPGAGSAAVQVSKLVVHQRWNSQNVGNGYDIALIKLASPVTLSKNIQTACLPPAGTI
    LPRNYVCYVTGWGLLQTNGNSPDTLRQGRLLVVDYATCSSASWWGSSVKSSMVCAGGDGVTSSCNGDSGGPLNCRASNGQWQVHGIVSFGSSLGCNYPRKPSVFTRVSNYIDWINSVMARN

In your case, you want a dict whose keys are the sequence name up to the first tab, and whose values are the actual sequences. Something like this should work:

    d = dict( (seq.name.split('\t')[0], seq) for seq in seqs)

The Seq class is a string subclass, so putting the seq in the dict is what you want.

There is also an iterator to read sequences one at a time. This might be a little faster and more memory efficient because it doesn't have to create the big list of all sequences. Something like this (untested):

    from corebio.seq_io.fasta_io import iterseq
    f = open(...)
    d = dict( (seq.name.split('\t')[0], seq) for seq in iterseq(f))

Kent
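If installing a library is not an option, the one-record-at-a-time idea can be sketched with the standard library alone. This is not CoreBio's API: `iter_fasta` is a hypothetical helper written for this example, and it assumes the conventional FASTA layout in which each record starts with a `>` header line followed by one or more sequence lines. The layout described in the question may differ, so treat this as a starting point only.

```python
import io


def iter_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines.

    Hypothetical helper: assumes each record starts with a '>' header
    line and that the sequence may span several following lines.
    """
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)  # emit the previous record
            name = line[1:]
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)  # emit the final record


# io.StringIO stands in for an open file; the records are made up.
sample = io.StringIO(">GeneA\tdemo\nATTA\nAGG\n>GeneB\tdemo\nGGAG\n")

# Key on the header text up to the first tab, as suggested above.
genes = dict((name.split("\t")[0], seq) for name, seq in iter_fasta(sample))
print(genes)
```

Because `iter_fasta` is a generator, only one record is held in memory at a time while the dictionary is being built.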
Re: [Tutor] looping problem
On 24/09/06, Python [EMAIL PROTECTED] wrote:
> slices may be the best way to go
> listA = biglist[0::3] # start from index 0 taking every third element
> listB = biglist[2::3] # start from index 2 taking every third element

I'm not certain they would be. If you do that, you will:

1. Create a really big list.
2. Go through the list, taking every third element.
3. Go through the list again, taking every third+2 element.

If the list is really big, step 1 might take some time and/or space, and you would like to avoid it.

If we have:

    f2 = open('myfile', 'r')
    listA = []
    listB = []

then we can iterate through f2 as follows:

    for i, line in enumerate(f2):
        if i % 3 == 0:
            listA.append(line)
        elif i % 3 == 2:
            listB.append(line)

This may be faster (although I should like to see evidence before committing to that statement :-) ).

--
John.
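One way to gather the evidence mentioned above is the standard `timeit` module. The sketch below is an illustration under stated assumptions, not a benchmark of the original file: `with_slices` and `with_enumerate` are names invented here, the data is a made-up list of short strings standing in for lines, and the timings will vary by machine and Python version.

```python
import timeit


def with_slices(lines):
    """Materialise the whole list first, then slice it twice."""
    biglist = list(lines)
    return biglist[0::3], biglist[2::3]


def with_enumerate(lines):
    """Make a single pass, sorting items by index modulo 3."""
    listA, listB = [], []
    for i, line in enumerate(lines):
        if i % 3 == 0:
            listA.append(line)
        elif i % 3 == 2:
            listB.append(line)
    return listA, listB


# Made-up data: 30,000 short strings standing in for lines of a file.
data = ["line %d" % n for n in range(30000)]

# Both approaches should agree before their speed is compared.
assert with_slices(iter(data)) == with_enumerate(iter(data))

for func in (with_slices, with_enumerate):
    seconds = timeit.timeit(lambda: func(iter(data)), number=20)
    print("%-15s %.4f s" % (func.__name__, seconds))
```

Passing `iter(data)` to both functions mimics reading from a file, where the items are only available one at a time.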
Re: [Tutor] looping problem
John Fouhy wrote:
> [snip]
> If the list is really big, step 1. might take some time and/or space,
> and you would like to avoid it.

That's a good point, though the OP didn't seem to have a problem with memory.

> This may be faster.. (although I should like to see evidence before
> committing to that statement :-) )

Since the end goal seems to be to create a dictionary, there is really no need to create the intermediate lists at all. You could do something like this (following your file example):

    d = {}
    while 1:
        try:
            key, _, value = f2.next(), f2.next(), f2.next()
            d[key] = value
        except StopIteration:
            break  # end of input: leave the loop

To do this with a list instead of a file, use f2=iter(reallyBigList).

Kent
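On Python 3, iterators and file objects no longer have a `.next()` method, and the built-in `next()` function is used instead. A minimal sketch of the same idea in that style follows; `triples_to_dict` is a name invented for this example and the sample list is made up.

```python
def triples_to_dict(items):
    """Build a dict from a flat key / ignored / value sequence.

    Hypothetical helper: consumes three items at a time and stops
    cleanly when the input runs out, dropping any incomplete group.
    """
    it = iter(items)  # works for a list, a file, or any other iterable
    d = {}
    while True:
        try:
            key = next(it)
            next(it)  # skip the middle element of the group
            value = next(it)
        except StopIteration:
            break  # input exhausted: leave the loop
        d[key] = value
    return d


# Made-up sample; the trailing "GeneC" has no sequence and is dropped.
really_big_list = ["GeneA", "", "ATTAAGG", "GeneB", "", "GGAGAGA", "GeneC"]
print(triples_to_dict(really_big_list))
```

The explicit `break` matters: without it, the loop would keep catching `StopIteration` forever once the input is exhausted.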