Nkuipers wrote:

> Hello,
> 
> I am trying to get the positions of every instance of a given substring
> within a given superstring (DNA sequence), and currently have a
> 
> while ( $super_string =~ m/${sub_string}/gi ) { ... }
> 
> construct.
> 
> I was under the impression that the regex transmission would bump along
> every character and try to match, backtracking even after success to try
> the character after the first character from the successful match.
> However, the Camel 3rd says that "used in a scalar context, the /g
> modifier...makes Perl start the next match on the same variable at a
> position just past where the last one stopped" (p151).  This is obviously
> inadequate in cases where one match may commence within another.
> 
> Suggestions?

if i understand your problem correctly, you are saying that given:

my $i = 'aa';
my $j = 'aaaaaaaaaaaa';

how many times does $i occur within $j, right?

you are saying it could be 6 because $j is essentially:

aa aa aa aa aa aa

but it could also be 11 if matches are allowed to overlap, because the
last 'a' of each match is counted again as the first 'a' of the next match.
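
to make the two counts concrete, here is a minimal sketch (the variable
names are mine, not from the original question). it assumes a zero-width
lookahead is an acceptable way to let matches overlap: since the lookahead
consumes nothing, /g moves forward one character at a time.

```perl
use strict;
use warnings;

my $sub   = 'aa';
my $super = 'aaaaaaaaaaaa';    # 12 characters

# plain /g: each match starts just past where the previous one stopped
my $non_overlapping = () = $super =~ /\Q$sub\E/g;

# lookahead: matches are zero-width, so every start position gets tried
my $overlapping = () = $super =~ /(?=\Q$sub\E)/g;

print "$non_overlapping $overlapping\n";    # prints "6 11"
```

the \Q...\E is there so that any regex metacharacters in the substring are
treated literally.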

i suggest you not think of Perl's regular expression engine this way.
backtracking serves a different purpose: it happens within a single match
attempt, for example when a '*' or '+' quantifier has taken too much and
has to give some of it back so the rest of the pattern can match.
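
a small illustration of that kind of backtracking (my own example, not
from the original thread): the a+ first grabs all three a's, then has to
give one back so that the literal 'ab' after it can still match.

```perl
use strict;
use warnings;

my $text = 'aaab';
my $grabbed;

# a+ is greedy and initially takes 'aaa'; the following 'a' then fails
# against 'b', so the engine backtracks and a+ settles for 'aa'
if ($text =~ /^(a+)ab/) {
    $grabbed = $1;
}

print "a+ ended up with '$grabbed'\n";    # prints: a+ ended up with 'aa'
```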

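if you simply want to know how many times (and where) a substring occurs
within a longer string, try the index() or rindex() functions that Perl
supplies. both take an optional starting position, so you can resume the
search one character past the start of the previous hit and catch
overlapping occurrences.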
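
a rough sketch of the index() approach, under a few assumptions:
find_all_positions is a hypothetical helper name, offsets are 0-based,
and both strings are lowercased first to mimic the /i in the original
regex (index() itself is case-sensitive).

```perl
use strict;
use warnings;

# hypothetical helper: returns the 0-based start offset of every
# occurrence of $sub in $super, overlapping occurrences included
sub find_all_positions {
    my ($super, $sub) = @_;
    return () if $sub eq '';    # an empty needle would never stop matching

    # lowercase both sides so the search ignores case, like /i
    my ($hay, $needle) = (lc $super, lc $sub);

    my @positions;
    my $pos = index($hay, $needle);
    while ($pos != -1) {
        push @positions, $pos;
        # restart one character past the START of this hit, not its end
        $pos = index($hay, $needle, $pos + 1);
    }
    return @positions;
}

my @hits = find_all_positions('GATATATGCATATACTT', 'ATAT');
print "@hits\n";    # prints "1 3 9"
```

the hits at 1 and 3 overlap, which is exactly the case a plain /g loop
would miss.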

david