On 7 Dec 2004, [EMAIL PROTECTED] wrote:
> I have two lists names x and seq.
>
> I am trying to find element of x in element of seq. I
> find them. However, I want to print element in seq
> that contains element of x and also the next element
> in seq.
[...]
> 3. TRIAL 3:
> I just asked to print the element in seq that matched
> element 1 in X. It prints only that element, however
> I want to print the next element too and I cannot get
> it.
>>>> for ele1 in x:
> for ele2 in seq:
> if ele1 in ele2:
> print ele2
>
[...]
>>>> len(x)
> 4504
>>>> x[1:10]
> ['454:494', '319:607', '319:608', '322:289',
> '322:290', '183:330', '183:329', '364:95', '364:96']
>>>> len(seq)
> 398169
>>>> seq[0:4]
> ['>probe:HG-U95Av2:1000_at:399:559;
> Interrogation_Position=1367; Antisense;',
> 'TCTCCTTTGCTGAGGCCTCCAGCTT',
> '>probe:HG-U95Av2:1000_at:544:185;
> Interrogation_Position=1379; Antisense;',
> 'AGGCCTCCAGCTTCAGGCAGGCCAA']
[...]
> How Do I WANT:
>
> I want to print get an output like this:
>
>
>>probe:HG-U95Av2:1000_at:399:559;
> Interrogation_Position=1367; Antisense;'
> TCTCCTTTGCTGAGGCCTCCAGCTT
>
>>probe:HG-U95Av2:1000_at:544:185;
> Interrogation_Position=1379; Antisense;
> AGGCCTCCAGCTTCAGGCAGGCCAA
Hi, you already got some replies on how to do it, but IMO there are two
other possibilities:
(a) Turn seq into a dictionary whose keys are the parts of the header
strings that are matched against the entries of list x. Since seq is
long, that may be much faster.
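
    import re

    def list_to_dict(lst):
        # Map the coordinate part of each header (e.g. '399:559')
        # to the header plus its sequence line.
        d = {}
        reg = re.compile(':.+?:.+?:(.+?:.+?);')
        for val1, val2 in lst:
            key = reg.search(val1).group(1)
            # newline so header and sequence print on separate lines
            d[key] = val1 + '\n' + val2
        return d

    # pair each header (even index) with the sequence after it (odd index)
    seq_dic = list_to_dict(zip(seq[::2], seq[1::2]))
    for key in x:
        val = seq_dic.get(key)
        if val:
            print val
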
The above function uses a regular expression to extract the part of
the string you are interested in and uses it as the key in a dictionary.
To find the corresponding list entries, `zip(seq[::2], seq[1::2])'
is used; seq[::2] is the first, third, fifth ... entry of the list
and seq[1::2] is the second, fourth, sixth ... entry. zip() packs
each such pair together in a tuple.
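
To make the slicing concrete, here is a small sketch that uses only the four entries of seq you showed (wrapped lines joined with a single space, which is my assumption). The helper name pair_up is made up for illustration.

```python
import re

# The four entries of seq shown in the question (line wraps joined
# with one space; that joining is an assumption).
seq = [
    '>probe:HG-U95Av2:1000_at:399:559; Interrogation_Position=1367; Antisense;',
    'TCTCCTTTGCTGAGGCCTCCAGCTT',
    '>probe:HG-U95Av2:1000_at:544:185; Interrogation_Position=1379; Antisense;',
    'AGGCCTCCAGCTTCAGGCAGGCCAA',
]

def pair_up(lst):
    """Hypothetical helper: pair every header with the line that follows it."""
    # lst[::2] -> entries 0, 2, 4, ... (headers)
    # lst[1::2] -> entries 1, 3, 5, ... (sequences)
    return list(zip(lst[::2], lst[1::2]))

pairs = pair_up(seq)

# Same regular expression as in the function above.
reg = re.compile(':.+?:.+?:(.+?:.+?);')
keys = [reg.search(header).group(1) for header, _ in pairs]
print(keys)
```

The keys that come out have the same 'number:number' form as the entries of x, which is why a plain dictionary lookup is enough.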
(b) If you care about memory, iterate over seq with izip (from
itertools).

    import re
    from itertools import izip

    reg = re.compile(':.+?:.+?:(.+?:.+?);')
    for val1, val2 in izip(seq[::2], seq[1::2]):
        if reg.search(val1).group(1) in x:
            # header first, then its sequence on the next line
            print val1
            print val2

Here izip() is used instead of zip(); it does not create the whole
list of pairs at once (the two slices are still copied, though). Also,
no dictionary is used. Only testing will show which is better for you.
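
A variation that avoids copying seq at all is to walk it with a single iterator. This is only a rough sketch, assuming seq strictly alternates header and sequence lines. The name matching_pairs is made up, and the set() is my addition to speed up the membership test; the sample x below is also just for illustration.

```python
import re

reg = re.compile(':.+?:.+?:(.+?:.+?);')

def matching_pairs(seq, wanted):
    """Yield (header, sequence) pairs whose coordinate key is in wanted.

    Hypothetical helper; assumes seq alternates header, sequence.
    """
    wanted = set(wanted)   # set lookup instead of scanning a list each time
    it = iter(seq)
    for header in it:
        sequence = next(it, None)   # the line right after the header
        m = reg.search(header)
        if m and sequence is not None and m.group(1) in wanted:
            yield header, sequence

# Sample data: the four entries of seq from the question, and a made-up x.
seq = [
    '>probe:HG-U95Av2:1000_at:399:559; Interrogation_Position=1367; Antisense;',
    'TCTCCTTTGCTGAGGCCTCCAGCTT',
    '>probe:HG-U95Av2:1000_at:544:185; Interrogation_Position=1379; Antisense;',
    'AGGCCTCCAGCTTCAGGCAGGCCAA',
]
x = ['544:185', '454:494']

for header, sequence in matching_pairs(seq, x):
    print(header)
    print(sequence)
```

Headers that do not match the regular expression are skipped here rather than raising an error; whether that is what you want depends on your data.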
Karl
--
Please do *not* send copies of replies to me.
I read the list