On Tue, Dec 11, 2012 at 10:54 AM, Hs Hs <[email protected]> wrote:
> Dear group:
>
Please send mail as plain text. It is easier to read.
>
> I have 50 thousand lists. My aim is to search a pattern in the
> alphabetical strings (these are protein sequence strings).
>
>
> MMSASRLAGTLIPAMAFLSCVRPESWEPC VEVVP
> NITYQCMELNFYKIPDNLPFSTKNLDLSFNPLRHLGSYSFFSFPELQVLDLSRCEIQTIED
>
> my aim is to find the list of string that has V*VVP.
>
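The asterisk ("*") matches zero or more instances of the preceding element.
In V*VVP it applies only to the first V, so the pattern means "zero or more
V, then VVP". That is why a bare VVP also matches.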
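A quick way to see this for yourself. This is only a minimal sketch; the sample strings are made up for illustration and are not real sequences:

```python
import re

pattern = re.compile(r'V*VVP')

# Made-up test strings (hypothetical, for illustration only).
samples = ['AAVVPAA', 'AAVVVVPAA', 'AAVEVVPAA', 'AAVPAA']

# Record the text each search actually matched, or None if there was no match.
matched = {}
for s in samples:
    m = pattern.search(s)
    matched[s] = m.group(0) if m else None

for s in samples:
    print(s, matched[s])
```

Note that in 'AAVEVVPAA' the part that matches is only the trailing VVP, not VEVVP, and that a lone VP does not match at all.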
I am not sure what you want, but I don't think it is this. Do you want V,
then any characters, followed by VVP? In that case, perhaps:
V.+VVP
There are many tutorials about how to create regular expressions.
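If the asterisk was meant as a wildcard, something along these lines might do. This is a sketch under that assumption: the sequences and the names one_between and any_between are hypothetical, and you would swap in your own 50 thousand strings.

```python
import re

# Hypothetical short sequences for illustration; not real protein data.
sequences = [
    'MMSASWEPCVEVVPNITY',     # V, one residue (E), then VVP
    'MMSASWEPCVVPNITY',       # VVP only
    'MMSASVRPESWEPCVVPNITY',  # V, several residues, then VVP
    'MKLAVPNITY',             # VP only
]

# '.' matches any single character (except a newline).
one_between = re.compile(r'V.VVP')   # exactly one character between V and VVP
any_between = re.compile(r'V.+VVP')  # one or more characters between V and VVP

hits_one = [s for s in sequences if one_between.search(s)]
hits_any = [s for s in sequences if any_between.search(s)]

print(hits_one)
print(hits_any)
```

Which of the two patterns is right depends on whether exactly one residue or any number of residues may sit between the first V and VVP.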
>
> myseq = 'MMSASRLAGTLIPAMAFLSCVRPESWEPC VEVVP
> NITYQCMELNFYKIPDNLPFSTKNLDLSFNPLRHLGSYSFFSFPELQVLDLSRCEIQTIED'
>
> if re.search('V*VVP',myseq):
>     print myseq
>
> the problem with this is, I am also finding junk with just VVP or VP etc.
>
> How can I strictly search for V*VVP only.
>
> Thanks for help.
>
> Hs
>
> _______________________________________________
> Tutor maillist - [email protected]
> To unsubscribe or change subscription options:
> http://mail.python.org/mailman/listinfo/tutor
>
>
--
Joel Goldstick