Sharan Basappa wrote:

Hi All,

I have some background working with scanners built from Flex, and I have
used the lookahead capability of Flex many times. But I don't understand the
meaning of ZERO in the zero-width lookahead match rule, i.e. (?=pattern)

For example, to capture overlapping 3-digit patterns from the string $str =
123456
I use the regex @store = $str =~ m/(?=(\d\d\d))/g;
So here the regex engine actually looks ahead by three characters (digits).

As far as lookahead expressions are concerned, Perl functions identically to
Flex. It is called zero-width lookahead because it matches a zero-width
/position/ in the string instead of a sequence of characters. If I write

'123456' =~ /\d\d\d(...)/

then '456' will be captured, as the first three characters were consumed by the
preceding pattern. However, if I write

'123456' =~ /(?=\d\d\d)(...)/

then '123' will be captured instead, because the lookahead pattern has zero
width: it checks what follows but consumes nothing.
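
As a minimal sketch of the difference (the variable names are my own, chosen
purely for illustration), the two patterns can be run side by side, and the
width of a bare lookahead match can be checked with the @- and @+ offset arrays:

```perl
use strict;
use warnings;

my $str = '123456';

# \d\d\d consumes the first three characters, so (...) captures the next three.
my ($consumed) = $str =~ /\d\d\d(...)/;

# (?=\d\d\d) only checks that three digits follow; it consumes nothing,
# so (...) starts capturing at the same position.
my ($lookahead) = $str =~ /(?=\d\d\d)(...)/;

# A lookahead on its own matches a position: start and end offsets coincide.
$str =~ /(?=\d\d\d)/;
my $width = $+[0] - $-[0];

print "consuming: $consumed\n";    # 456
print "lookahead: $lookahead\n";   # 123
print "width:     $width\n";       # 0
```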

The other question I have is: how does the regex engine decide that it has to
move its scanner forward by 1 character every time? I get the output 123 234
345 456
when I run this script.

The engine moves as far through your target string as necessary to find each
new match. If I write

'1B3D5F' =~ /(?=(.\d.))/g;

then the engine will find a match at only every second character, and if I use
a much simpler zero-width match, just

'ABCDEF' =~ //g

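then the regex will match seven times - at the beginning and end and between
every pair of characters - so the more complex zero-width match you have written
will match at all of those places as long as there are three digits
following. (Note that in Perl an empty pattern reuses the last successfully
matched pattern if there is one, so this holds only when no earlier match has
succeeded; writing /(?:)/g avoids the ambiguity.)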
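
To see where the engine actually matches, a small sketch like the following may
help. It assumes a helper called match_positions (a name invented here, not
part of Perl) and relies on the fact that, under /g, Perl will not accept
another zero-length match at the position where the previous one ended, which
effectively pushes the next attempt one character forward.

```perl
use strict;
use warnings;

# Hypothetical helper: return the start offset of every /g match of $re in $str.
sub match_positions {
    my ($str, $re) = @_;
    my @offsets;
    while ($str =~ /$re/g) {
        push @offsets, $-[0];    # $-[0] is the offset where this match began
    }
    return @offsets;
}

my @digits = match_positions('123456', qr/(?=\d\d\d)/);
my @mixed  = match_positions('1B3D5F', qr/(?=.\d.)/);
my @empty  = match_positions('ABCDEF', qr/(?:)/);   # explicit empty pattern

print "three digits follow at: @digits\n";   # 0 1 2 3
print "char-digit-char at:     @mixed\n";    # 1 3
print "empty pattern at:       @empty\n";    # 0 1 2 3 4 5 6
```

The offsets 0 to 3 correspond to the captures 123, 234, 345 and 456 from the
original script.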

HTH,

Rob

