-----Original Message-----
From: ANJAN PURKAYASTHA [mailto:[EMAIL PROTECTED]
Sent: Tuesday, 23 September 2008 11:22 AM
To: beginners@perl.org
Subject: pattern matching question

Here is my problem: I have to check the entries of a column and write them out to a file if they happen to be DNA sequences, i.e. they are composed exclusively of the letters A, T, G, C, with no spaces or digits. The column also happens to have other strings that are made of word/digit/space characters. I tried

    if($x=~ /[ATGC]/ )then ..... ....

Hello Anjan,

This will be my first email to beginners@perl.org, so in case I didn't follow the correct standards for posting, please forgive me.

Your regular expression will match any column that contains at least one A, T, G or C anywhere in it. So a column with ABBBB (which isn't a DNA sequence) will return true for your "if" statement. What you want is something like this:

    if ($x =~ /^[ATGC]+$/i){

The ^ anchors the match at the start of the string, the $ anchors it at the end of the string (a single trailing newline is still allowed before it), and the + means the character class must match one or more times. Together they require every character in the column to be A, T, G or C. The trailing /i is optional: it ignores character case, so a column with atgc will also return true (which is common if the DNA is masked).

Hope this helps.

Dave
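To see the difference between the unanchored and the anchored pattern, here is a minimal sketch that runs the anchored check over a few entries. The helper name is_dna and the sample entries in @column are made up for illustration; they are not from the original question, and reading the real column and writing matches to a file is left out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns 1 when the whole
# string is made up only of A, T, G, C (either case), 0 otherwise.
# Note: $ also matches just before a single trailing newline, so an
# un-chomped line such as "ATGC\n" passes as well.
sub is_dna {
    my ($x) = @_;
    return 0 unless defined $x;
    return $x =~ /^[ATGC]+$/i ? 1 : 0;
}

# Made-up sample entries standing in for the real column.
my @column = ('ATGC', 'atgcca', 'ABBBB', 'AT GC', 'ATG1', '');

for my $entry (@column) {
    printf "%-10s %s\n", "'$entry'", is_dna($entry) ? 'DNA' : 'not DNA';
}
```

With these sample entries only ATGC and atgcca are reported as DNA. ABBBB, which the unanchored /[ATGC]/ would have accepted, is rejected, as are the entries containing a space or a digit and the empty string.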