-----Original Message-----
From: ANJAN PURKAYASTHA [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, 23 September 2008 11:22 AM
To: beginners@perl.org
Subject: pattern matching question

here is my problem:
i have to check the entries of a column and write them out to a file if they
happen to be DNA sequences ie they are exclusively composed of the letters
A, T, G, C- no spaces or digits.
the column also happens to have other strings that are made of
word/digit/space characters.
i tried
if($x=~ /[ATGC]/ )then .....
....

Hello Anjan,

This is my first email to beginners@perl.org, so if I haven't followed the
correct conventions for posting, please forgive me.

Your regular expression will match any entry that contains at least one A, T,
G or C anywhere in it. So an entry such as ABBBB (which isn't a DNA sequence)
will still return true for your "if" statement.
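To see this for yourself, here is a small sketch. The sub name
has_any_base and the sample values are made up for illustration; only the
pattern comes from your post.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the original, unanchored pattern.
# It is true as soon as ONE of A, T, G or C appears anywhere in the string.
sub has_any_base {
    my ($x) = @_;
    return $x =~ /[ATGC]/ ? 1 : 0;
}

# Made-up sample entries, not taken from the real column.
for my $x ('ATGC', 'ABBBB', 'sample 12 C', 'xyz 99') {
    my $verdict = has_any_base($x) ? 'matches' : 'does not match';
    print "'$x' $verdict /[ATGC]/\n";
}
```

Only the last entry fails, because it is the only one with no A, T, G or C
in it at all.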

What you want is something like this: if ($x =~ /^[ATGC]+$/i){

The ^ anchors the match to the start of the string, the $ anchors it to the
end of the string (it also allows a single trailing newline), and the + means
one or more of the preceding character class. The /i modifier makes the match
ignore character case, so an entry with atgc will also return true (lowercase
is common if the DNA is masked).
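Putting that together, a minimal sketch might look like the following. The
sub name is_dna and the sample entries are my own assumptions, and I have
used \A and \z instead of ^ and $ as a slightly stricter variant: \z does
not allow a trailing newline, so it suits values that have already been
chomped.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if the WHOLE string consists of
# the letters A, T, G, C (either case) and is not empty.
sub is_dna {
    my ($x) = @_;
    return 0 unless defined $x;
    return $x =~ /\A[ATGC]+\z/i ? 1 : 0;
}

# Made-up sample entries standing in for the real column.
my @column = ('ATGC', 'atgc', 'ABBBB', 'ATG C', 'ATGC1', '');

# Print the entries that pass; swap STDOUT for your own output
# filehandle to write them to a file instead.
for my $entry (@column) {
    print {*STDOUT} "$entry\n" if is_dna($entry);
}
```

With these samples only ATGC and atgc are printed; the entries containing
other letters, a space or a digit are skipped, as is the empty string.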

Hope this helps.

Dave



-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/

