On May 18, James Edward Gray II said:

>On May 18, 2004, at 9:30 AM, Andrew Gaffney wrote:
>
>> Doesn't the 'gc' modifier make the whole thing not as greedy? As a
>> side effect of continuation, doesn't it try to match as many times as
>> possible?
>
>I'm not familiar with this, but my gut reaction is no.  Perhaps one of
>the Regex experts can clear that up for us...

Correct.  No modifier to a regex changes the greediness of the quantifiers
in the regex.  All the /g modifier does is say:

  1. if the regex is in list context, match, and then try to match again
     starting where the previous match ended, and so on, until a match
     fails
  2. if the regex is in scalar context, match and return, but remember
     where we left off -- the next time a regex with the /g modifier is
     matched against the same string, it will pick up where the last one
     stopped.  This position can also be used with the \G anchor.
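
On the greediness point, here is a small sketch (the sample string and
variable names are made up for illustration) suggesting that it is the
quantifier, not the /g modifier, that decides how much each match takes:

```perl
use strict;
use warnings;

my $text = "aaa bbb";   # made-up sample string

# Greedy quantifier: each /g match takes as much as it can.
my @greedy = $text =~ /\w+/g;     # ("aaa", "bbb")

# Non-greedy quantifier: each /g match takes as little as it can.
my @lazy   = $text =~ /\w+?/g;    # ("a", "a", "a", "b", "b", "b")

# Without /g the quantifiers behave the same way; we just get one match.
my ($first_greedy) = $text =~ /(\w+)/;    # "aaa"
my ($first_lazy)   = $text =~ /(\w+?)/;   # "a"

print scalar(@greedy), " greedy matches, ", scalar(@lazy), " lazy matches\n";
```

The /g modifier changes how many matches come back, but \w+ still grabs
a whole word and \w+? still grabs a single character either way.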

Here are examples:

  my $str = "japhy knows regexes";

  my @all_letters = $str =~ /\w/g;
  # @all_letters contains 17 elements: j,a,p,h,y,k,n,o,etc.
  # and before you ask, NO, I DON'T need parens around \w in there

  while ($str =~ /(\w+)/g) {
    print "Got: '$1'\n";  # Got: 'japhy', then 'knows', then 'regexes'
  }

  if ($str =~ /(\w\w)/g) {
    print "Two letters: '$1'\n";  # 'ja'
    if ($str =~ /\G(.{5})/) {
      print "Next five characters: '$1'\n"; # 'phy k'
    }
  }

Once a /g match fails, \G is cleared.  \G is linked to the pos() function:
whatever pos($str) returns is the offset in $str that \G anchors to.  A
failed /g match resets pos($str) to undef, which puts \G back at the start
of the string.
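
A minimal sketch of that reset, assuming a made-up string $s and
illustrative variable names (none of them come from the original post):

```perl
use strict;
use warnings;

my $s = "ab12";   # made-up sample string

my $matched = $s =~ /[a-z]+/g;     # scalar-context /g match succeeds
my $after_match = pos($s);         # offset just past "ab"

my $failed = $s =~ /\G[a-z]/g;     # fails: the next character is "1"
my $after_fail = pos($s);          # undef, the failed /g match reset it

my $restarted = $s =~ /\d/g;       # so this search starts over at offset 0
my $after_restart = pos($s);       # offset just past "1"

print "after match:   $after_match\n";
print "after failure: ", (defined $after_fail ? $after_fail : "undef"), "\n";
print "after restart: $after_restart\n";
```

The point to notice is the middle step: pos($s) goes from a defined offset
back to undef purely because a /g match failed.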

*ALL* that the /c modifier does (and it only matters when used with the /g
modifier) is tell the regex engine NOT to clear \G or pos() when a match
fails.  Here's a method called the inchworm:

  print "Got '$1'\n" while
    $str =~ /\G"([^"]*)"\s*/gc or
    $str =~ /\G'([^']*)'\s*/gc or
    $str =~ /\G(\S+)\s*/gc;  # \s* here too, or we stall on the next space

This allows us to use $1 no matter which regex matches, and because all
three regexes have the /gc modifier, when the first one fails, it'll try
the second one, AT THE SAME LOCATION.
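
Here is a self-contained sketch of the inchworm run against a hypothetical
input line ($line and @tokens are names I made up for the illustration).
It collects the tokens instead of printing them directly, and the bare-word
pattern also skips trailing whitespace so the loop can keep moving:

```perl
use strict;
use warnings;

# Hypothetical input mixing double-quoted, single-quoted, and bare words.
my $line = q{"hello world" 'single quoted' bare};

my @tokens;
push @tokens, $1 while
  $line =~ /\G"([^"]*)"\s*/gc or   # double-quoted token
  $line =~ /\G'([^']*)'\s*/gc or   # single-quoted token
  $line =~ /\G(\S+)\s*/gc;         # anything else up to the next space

print "Got '$_'\n" for @tokens;
```

With this input it prints three lines: hello world, single quoted, and
bare.  Without /c, the first pattern to fail would reset pos($line), and
the next pattern's \G would anchor at the start of the string instead of
at the current token.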

-- 
Jeff "japhy" Pinyan      [EMAIL PROTECTED]      http://www.pobox.com/~japhy/
RPI Acacia brother #734   http://www.perlmonks.org/   http://www.cpan.org/
CPAN ID: PINYAN    [Need a programmer?  If you like my work, let me know.]
<stu> what does y/// stand for?  <tenderpuss> why, yansliterate of course.

