Cedric wrote:
> 
> First of all, sorry for my poor English.
> I work with DNA sequences and want to extract relevant motifs, of course
> using regular expressions.
> A concrete example will be better than a long description of my problem, so:
> 
> Here is a string corresponding to my DNA strand :
> 
> cgttgctagctgctatcgatgtgctagtcgatgctagtgcatgcgtagtgcagtcatatgctaggcat
> 
> I want to extract all the substrings beginning with tag and finishing with
> tag including substrings with same start point but different length like :
> 
> tagctgctatcgatgtgctag
> tagctgctatcgatgtgctagtcgatgctag
> tagctgctatcgatgtgctagtcgatgctagtgcatgcgtag
> 
> How can I write a regular expression that does not advance pos() after
> finding one match, but keeps the same value of pos() and extracts all
> sequences of different lengths?
> 
> To sum up, if I have this sequence :
> 
> taggcgttatcgctagcgcatcgataggctactattcgtagcc
> 
> My regular expression must extract :
> 
> taggcgttatcgctag
> taggcgttatcgctagcgcatcgatag
> taggcgttatcgctagcgcatcgataggctactattcgtag
>              tagcgcatcgatag
>              tagcgcatcgataggctactattcgtag
>                         taggctactattcgtag
> 
> Thanks for any help ...


Here is a way to do it without using regular expressions:

$ perl -le'
$DNA =
"cgttgctagctgctatcgatgtgctagtcgatgctagtgcatgcgtagtgcagtcatatgctaggcat";

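# Collect the offset of every "tag" in the strand.
$i = -1;
push @tags, $i while ( $i = index $DNA, "tag", $i + 1 ) >= 0;

die "fewer than two tags found\n" if @tags < 2;

# Pair each tag with every later tag and print the span between them,
# appending the closing "tag".
for $x ( 0 .. $#tags - 1 ) {
    for $y ( $x + 1 .. $#tags ) {
        print substr( $DNA, $tags[$x], $tags[$y] - $tags[$x] ), "tag";
    }
}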
'
tagctgctatcgatgtgctag
tagctgctatcgatgtgctagtcgatgctag
tagctgctatcgatgtgctagtcgatgctagtgcatgcgtag
tagctgctatcgatgtgctagtcgatgctagtgcatgcgtagtgcagtcatatgctag
tagtcgatgctag
tagtcgatgctagtgcatgcgtag
tagtcgatgctagtgcatgcgtagtgcagtcatatgctag
tagtgcatgcgtag
tagtgcatgcgtagtgcagtcatatgctag
tagtgcagtcatatgctag
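
If you would still like a regular expression involved, the same pairing
can be sketched with two nested matches. This is only a rough sketch: the
sub name tag_spans is made up for illustration, and it assumes the
delimiter is the fixed string "tag", which cannot overlap itself. The
outer match uses a zero-width lookahead so that pos() marks each start
without consuming it; the inner match runs on a copy of the remainder, so
the outer pos() is left alone.

```perl
use strict;
use warnings;

# Hypothetical helper: return every substring that starts at one "tag"
# and ends at a later "tag", ordered by start offset, then by length.
sub tag_spans {
    my ($dna) = @_;
    my @spans;

    # Zero-width lookahead: pos($dna) is the offset of each "tag".
    while ( $dna =~ /(?=tag)/g ) {
        my $start = pos($dna);

        # Search for closing tags in a copy of the text after the opening
        # tag, so the outer match position is not disturbed.
        my $tail = substr $dna, $start + 3;
        while ( $tail =~ /tag/g ) {
            # pos($tail) is the end of the closing tag within $tail;
            # add 3 for the opening tag to get the span length.
            push @spans, substr( $dna, $start, 3 + pos($tail) );
        }
    }
    return @spans;
}

print "$_\n" for tag_spans("taggcgttatcgctagcgcatcgataggctactattcgtagcc");
```

Run on your second sequence, this should print the six substrings you
listed, in the same order.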



John
-- 
use Perl;
program
fulfillment
