On Oct 4, Cedric said:
>cgttgctagctgctatcgatgtgctagtcgatgctagtgcatgcgtagtgcagtcatatgctaggcat
>
>I want to extract all the substrings beginning with tag and finishing with
>tag including substrings with same start point but different length like :
>
>tagctgctatcgatgtgctag
>tagctgctatcgatgtgctagtcgatgctag
>tagctgctatcgatgtgctagtcgatgctagtgcatgcgtag
One way of matching your sequences would be to use a regex with code
blocks embedded in it.
Here is my solution:
  my $dna = ...;
  my @matches;

  $dna =~ m{
    (?=                # look ahead, so nothing is consumed
      tag
      (?:
        .*? tag
        # the substr(...) is there to avoid using $&
        (?{ push @matches, substr($dna, $-[0], $+[0] - $-[0]) })
      )+
    )
    (?!)               # always fails, so every start position gets tried
  }x;
Now @matches holds all those tag...tag strings. I'll explain the regex if
people would like. ;)
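
If the embedded code blocks look too magical, the result can be cross-checked with plain index() and substr(). This is only a sketch of a different technique, not the regex above: it records every offset where the tag starts, then cuts out a substring for each pair of offsets. The name tag_spans() is made up for illustration, and the sample string is the one from Cedric's message.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original solution): returns every
# substring of $dna that begins with $tag and ends with a later $tag.
sub tag_spans {
    my ($dna, $tag) = @_;
    my $len = length $tag;

    # Record every offset at which $tag begins.
    my @starts;
    my $pos = index($dna, $tag);
    while ($pos >= 0) {
        push @starts, $pos;
        $pos = index($dna, $tag, $pos + 1);
    }

    # Pair each opening tag with every closing tag that begins
    # after the opening tag has ended.
    my @spans;
    for my $s (@starts) {
        for my $e (@starts) {
            next if $e < $s + $len;
            push @spans, substr($dna, $s, $e + $len - $s);
        }
    }
    return @spans;
}

my $dna = 'cgttgctagctgctatcgatgtgctagtcgatgctagtgcatgcgtagtgcagtcatatgctaggcat';
my @spans = tag_spans($dna, 'tag');
print "$_\n" for @spans;
```

The sample string contains "tag" five times, so this prints ten substrings, starting with the three Cedric listed. The nested loop does work proportional to the square of the number of tags, which is unavoidable here because that is how many substrings there are to return.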
--
Jeff "japhy" Pinyan [EMAIL PROTECTED] http://www.pobox.com/~japhy/