Cedric wrote:
>
> First of all, sorry for my poor English.
> I work with DNA sequences and want to extract relevant motifs, of course
> using regular expressions.
> A concrete example will be better than a long description of my problem, so:
>
> Here is a string corresponding to my DNA strand:
>
> cgttgctagctgctatcgatgtgctagtcgatgctagtgcatgcgtagtgcagtcatatgctaggcat
>
> I want to extract all the substrings beginning with tag and finishing with
> tag, including substrings with the same start point but different lengths, like:
>
> tagctgctatcgatgtgctag
> tagctgctatcgatgtgctagtcgatgctag
> tagctgctatcgatgtgctagtcgatgctagtgcatgcgtag
>
> How do I write a regular expression which will not increase the value of pos()
> after finding one match, but keep the same value of pos() and extract all
> sequences with different lengths?
>
> To sum up, if I have this sequence:
>
> taggcgttatcgctagcgcatcgataggctactattcgtagcc
>
> My regular expression must extract:
>
> taggcgttatcgctag
> taggcgttatcgctagcgcatcgatag
> taggcgttatcgctagcgcatcgataggctactattcgtag
> tagcgcatcgatag
> tagcgcatcgataggctactattcgtag
> taggctactattcgtag
>
> Thanks for any help ...

Here is a way to do it without using regular expressions:

$ perl -le'
$DNA = "cgttgctagctgctatcgatgtgctagtcgatgctagtgcatgcgtagtgcagtcatatgctaggcat";

# Record the offset of every "tag" in the string.
$i = -1;
push @tags, $i while ( $i = index $DNA, "tag", $i + 1 ) >= 0;
die "tag not found\n" if @tags < 2;

# Pair each "tag" with every later "tag" and print the span between them.
for $x ( 0 .. $#tags - 1 ) {
    for $y ( $x + 1 .. $#tags ) {
        print substr( $DNA, $tags[$x], $tags[$y] - $tags[$x] ), "tag";
    }
}
'
tagctgctatcgatgtgctag
tagctgctatcgatgtgctagtcgatgctag
tagctgctatcgatgtgctagtcgatgctagtgcatgcgtag
tagctgctatcgatgtgctagtcgatgctagtgcatgcgtagtgcagtcatatgctag
tagtcgatgctag
tagtcgatgctagtgcatgcgtag
tagtcgatgctagtgcatgcgtagtgcagtcatatgctag
tagtgcatgcgtag
tagtgcatgcgtagtgcagtcatatgctag
tagtgcagtcatatgctag


John
-- 
use Perl;
program
fulfillment
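
For completeness, the original question asked for a regular expression, so here is a rough sketch of one way that might be done. It is not from the reply above. It assumes Perl 5.10 or later, and relies on the embedded-code construct (?{ ... }) followed by (*FAIL), which rejects each candidate match after recording it so the engine keeps backtracking into longer matches and later start positions. The helper name all_tag_spans is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original thread): return every
# substring of $dna that begins with "tag" and ends with "tag".
sub all_tag_spans {
    my ($dna) = @_;

    # A package variable is used here because it is the safest kind of
    # variable to touch from inside a (?{ ... }) block.
    our @spans = ();

    # (?{ ... }) runs each time the engine reaches it and saves the text
    # matched so far ($&). (*FAIL) then forces a failure, so the engine
    # backtracks: .*? grows to the next "tag", and when no longer match
    # exists, the engine retries from the next starting position.
    $dna =~ /tag.*?tag(?{ push @spans, $& })(*FAIL)/;

    return @spans;
}

print "$_\n" for all_tag_spans("taggcgttatcgctagcgcatcgataggctactattcgtagcc");
```

The match itself always fails, so only the side effect of the code block matters. On long sequences with many occurrences of "tag" the index/substr approach is likely to be easier to reason about, since the regex version depends on backtracking order.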