Cedric wrote:
>
> First of all, sorry for my poor English.
> I work with DNA sequences and of course want to extract relevant motifs
> using regular expressions.
> A concrete example will be better than a long description of my problem, so:
>
> Here is a string corresponding to my DNA strand :
>
> cgttgctagctgctatcgatgtgctagtcgatgctagtgcatgcgtagtgcagtcatatgctaggcat
>
> I want to extract all the substrings beginning with tag and finishing with
> tag including substrings with same start point but different length like :
>
> tagctgctatcgatgtgctag
> tagctgctatcgatgtgctagtcgatgctag
> tagctgctatcgatgtgctagtcgatgctagtgcatgcgtag
>
> How to write a regular expression which will not increase the value pos()
> after finding one match but keep the same value of pos() and extract all
> sequences with different lengths ?
>
> To sum up, if I have this sequence :
>
> taggcgttatcgctagcgcatcgataggctactattcgtagcc
>
> My regular expression must extract :
>
> taggcgttatcgctag
> taggcgttatcgctagcgcatcgatag
> taggcgttatcgctagcgcatcgataggctactattcgtag
> tagcgcatcgatag
> tagcgcatcgataggctactattcgtag
> taggctactattcgtag
>
> Thanks for any help ...
Here is a way to do it without using regular expressions:
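$ perl -le'
$DNA =
"cgttgctagctgctatcgatgtgctagtcgatgctagtgcatgcgtagtgcagtcatatgctaggcat";
# Record the offset of every "tag" in the sequence.
$i = -1;
push @tags, $i while ( $i = index $DNA, "tag", $i + 1 ) >= 0;
die "fewer than two tags found\n" if @tags < 2;
# Pair each tag with every later tag. The substr stops just before
# the closing tag, so that tag is appended explicitly.
for $x ( 0 .. $#tags - 1 ) {
    for $y ( $x + 1 .. $#tags ) {
        print substr( $DNA, $tags[$x], $tags[$y] - $tags[$x] ), "tag";
    }
}
'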
tagctgctatcgatgtgctag
tagctgctatcgatgtgctagtcgatgctag
tagctgctatcgatgtgctagtcgatgctagtgcatgcgtag
tagctgctatcgatgtgctagtcgatgctagtgcatgcgtagtgcagtcatatgctag
tagtcgatgctag
tagtcgatgctagtgcatgcgtag
tagtcgatgctagtgcatgcgtagtgcagtcatatgctag
tagtgcatgcgtag
tagtgcatgcgtagtgcagtcatatgctag
tagtgcagtcatatgctag
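
If you would still like a regular expression involved, here is a rough sketch of one possibility. It uses a zero-width lookahead so the match never consumes any characters, which is close to the "do not move pos() past the match" behaviour asked about. The sub name motif_spans and its two parameters are made up for this example, and the pairing loop is the same idea as in the one-liner above.

```perl
use strict;
use warnings;

# Hypothetical helper: return every substring of $dna that starts and
# ends with $motif, including substrings that share a start position.
sub motif_spans {
    my ( $dna, $motif ) = @_;

    # The lookahead (?=...) matches the empty string in front of each
    # occurrence of the motif, so nothing is consumed and every
    # occurrence is found. $-[0] holds the offset of the current match.
    my @starts;
    while ( $dna =~ /(?=\Q$motif\E)/g ) {
        push @starts, $-[0];
    }

    # Pair each occurrence with every later one and cut out the span,
    # including the closing motif.
    my @spans;
    for my $x ( 0 .. $#starts - 1 ) {
        for my $y ( $x + 1 .. $#starts ) {
            my $len = $starts[$y] - $starts[$x] + length $motif;
            push @spans, substr( $dna, $starts[$x], $len );
        }
    }
    return @spans;
}

print "$_\n"
    for motif_spans( "taggcgttatcgctagcgcatcgataggctactattcgtagcc", "tag" );
```

Run on the second sequence from the question, this should print the six substrings listed there, in the same order. The \Q...\E quoting only matters if the motif ever contains regex metacharacters.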
John
--
use Perl;
program
fulfillment