On Tue, Dec 11, 2012 at 10:54 AM, Hs Hs <ilhs...@yahoo.com> wrote:

> Dear group:
>
Please send mail as plain text. It is easier to read.

> I have 50 thousand lists. My aim is to search a pattern in the
> alphabetical strings (these are protein sequence strings).
>
>
> MMSASRLAGTLIPAMAFLSCVRPESWEPC VEVVP
> NITYQCMELNFYKIPDNLPFSTKNLDLSFNPLRHLGSYSFFSFPELQVLDLSRCEIQTIED
>
> my aim is to find the list of string that has V*VVP.
>

The "*" matches 0 or more instances of the previous element, so V*VVP
means "zero or more V, then VVP". I am not sure what you want, but I
don't think it is this. Do you want V, then any characters, followed by
VVP? In that case perhaps V.+VVP (or V.VVP if exactly one character
should sit between the first V and VVP).

There are many tutorials about how to create regular expressions.

> myseq = 'MMSASRLAGTLIPAMAFLSCVRPESWEPC VEVVP
> NITYQCMELNFYKIPDNLPFSTKNLDLSFNPLRHLGSYSFFSFPELQVLDLSRCEIQTIED'
>
> if re.search('V*VVP',myseq):
>     print myseq
>
> the problem with this is, I am also finding junk with just VVP or VP etc.
>
> How can I strictly search for V*VVP only.
>
> Thanks for help.
>
> Hs
>
> _______________________________________________
> Tutor maillist - Tutor@python.org
> To unsubscribe or change subscription options:
> http://mail.python.org/mailman/listinfo/tutor
>

--
Joel Goldstick
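
A minimal sketch of how the patterns discussed above differ, written for Python 3 (the quoted code uses the Python 2 print statement). The sequence is the one from the question, joined into a single string. The other sample strings, the compiled pattern names and the helper matching() are made up here purely for illustration, and which pattern is the right one still depends on what "V*VVP" was meant to express.

```python
import re

# Sequence from the original post, joined into one string.
myseq = ('MMSASRLAGTLIPAMAFLSCVRPESWEPC VEVVP'
         'NITYQCMELNFYKIPDNLPFSTKNLDLSFNPLRHLGSYSFFSFPELQVLDLSRCEIQTIED')

# Hypothetical extra strings, invented for illustration only.
samples = [myseq, 'AAVVPAA', 'AAVPAA', 'AAVKLVVPAA']

# "*" repeats the previous element: zero or more V, then VVP.
# Any string containing a bare "VVP" therefore matches.
loose = re.compile(r'V*VVP')

# "." stands for any single character: V, one character, then VVP.
one_char = re.compile(r'V.VVP')

# ".+" stands for one or more characters: V, a gap, then VVP.
any_gap = re.compile(r'V.+VVP')


def matching(pattern, seqs):
    """Return the sequences in which the compiled pattern occurs anywhere."""
    return [s for s in seqs if pattern.search(s)]


for name, pattern in [('V*VVP', loose), ('V.VVP', one_char), ('V.+VVP', any_gap)]:
    print(name, len(matching(pattern, samples)))
```

If the asterisk was meant as a literal character in the sequence rather than a wildcard, it would have to be escaped as V\*VVP instead.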