On May 25, Jay Savage said:

On 5/25/05, Jeff 'japhy' Pinyan <[EMAIL PROTECTED]> wrote:
On May 25, Jay Savage said:

 /e(?{push @bar, pos})/g;

should work, but seems to ignore the /g.

Because as you wrote it, the regex is in void context, which means it'll
only match once.  Put it in list context:

   () = /e(?{ push @bar, pos })/g;

But this looks weird to almost anyone.  I'd do:

   /e(?{ push @bar, pos })(?!)/;

Thanks.  This makes sense.  But why does the zero-width lookahead
force list context?  And why does /g by itself force list context for
s/// with the same search string?

First of all, /g does not impose any context. It does behave differently in scalar context than in list context, though.

Notice that I didn't need a /g modifier. There is no list context in my (?!) example; it's still void context. It works because the (?!) forces the regex to fail at that point, so the engine backtracks and then tries matching again at the next starting position in the string.
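Here is a minimal sketch that puts the three forms side by side. The sample string "eleven" and the package variable @bar are only for illustration. How often a (?{ }) block runs can change with regex-engine optimizations between Perl versions, so take the results as what I'd expect, not a guarantee.

```perl
use strict;
use warnings;

our @bar;          # package variable, visible inside the (?{ }) blocks
$_ = "eleven";     # sample string, chosen only for illustration

# Void context: m//g behaves as it does in scalar context and stops
# after the first successful match.
@bar = ();
/e(?{ push @bar, pos })/g;
my @void = @bar;

# List context: m//g keeps going until no further match is found.
pos($_) = undef;   # forget where the previous /g match left off
@bar = ();
() = /e(?{ push @bar, pos })/g;
my @list = @bar;

# No /g, still void context: (?!) makes every attempt fail, so the
# engine retries at each later starting position.
@bar = ();
/e(?{ push @bar, pos })(?!)/;
my @fail = @bar;

print "void: @void\n";
print "list: @list\n";
print "fail: @fail\n";
```

I'd expect the void-context run to record a single position, and the other two to record one position per 'e' in the string.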

s///g behaves the same no matter what context it's in. It always does ALL substitutions.
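A quick way to check this, again assuming the sample string "eleven" (my choice, not from the thread), is to run the same s///g in void, scalar, and list context and compare the results:

```perl
use strict;
use warnings;

# Void context: the return value is thrown away.
my $void = "eleven";
$void =~ s/e/E/g;

# Scalar context: the return value is the number of substitutions made.
my $scalar = "eleven";
my $count = ($scalar =~ s/e/E/g);

# List context: the string is still rewritten in full.
my $list = "eleven";
my @ret = ($list =~ s/e/E/g);

print "$void $scalar $list ($count substitutions)\n";
```

All three strings should come out the same; only what you get back from the operator differs.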

If 'e' is more than one character, you'll need to use

   /(?>pattern)(?{ push @bar, pos })(?!)/;

Assuming that the pattern to match on is "pattern", this works, too:

  /p(?{push @bar, pos})attern(?!)/g

Well, I was being general. I didn't mean 'pattern', that was just a placeholder. If the pattern wasn't under your control, you couldn't split the first character off like that. One example would be:

  /abc|def/

You couldn't do

  /(?:a|d)(?{ push @bar, pos })(?:bc|ef)/

for a few reasons. First, that could match aef or dbc in addition to abc or def. Second, you're doing the (?{ ... }) before you know the entire regex has matched!
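Both problems are easy to see in a small sketch. The word list and the string "abx" are made up for the demonstration, and the ^ and $ anchors are added only so that grep tests whole words.

```perl
use strict;
use warnings;

my @words = qw(abc def aef dbc);   # hypothetical test words

# The original alternation accepts only the two intended strings.
my @whole = grep { /^(?:abc|def)$/ } @words;

# Splitting off the first character also accepts the mixed-up ones.
my @split = grep { /^(?:a|d)(?:bc|ef)$/ } @words;

# The code block fires as soon as 'a' matches, even though the
# regex as a whole goes on to fail against "abx".
our @bar;
my $matched = "abx" =~ /(?:a|d)(?{ push @bar, pos })(?:bc|ef)/;

print "whole: @whole\n";
print "split: @split\n";
print "matched: ", ($matched ? "yes" : "no"), ", recorded: @bar\n";
```

The last line is the important one: a position gets recorded for a string that never matched.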

Thus:

  /(?>abc|def)(?{ push @bar, $-[0] })/

for a zero-based value.

You could also use $-[0] instead of pos().

That depends on the context. $-[0] holds the value of the beginning of
the current match.  pos() returns the position at the end of the

Not the *end*, but the *current* position. Then again, you could call that the end, since that's as far as you've gotten...

current match, i.e. $+[0], and is what the OP used.   $-[0] = pos() -
length($&), more or less. For single-character matches, this has the
effect of making values returned from $-[0] zero indexed, and values

I'd say zero-indexing is better, since that's how the other string functions work.

returned from pos() one indexed. (That's also why any (?{}) expression
with pos needs to be inserted after the first character of a
multi-character match.)
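To compare the three values without a (?{ }) block, here is a small sketch that walks the matches with m//g in scalar context. The sample string "eleven" is an assumption for illustration only.

```perl
use strict;
use warnings;

my $str = "eleven";   # sample string, chosen only for illustration
my (@starts, @ends, @positions);

# In scalar context, each pass through the loop finds the next match.
while ($str =~ /e/g) {
    push @starts,    $-[0];        # offset where the match begins
    push @ends,      $+[0];        # offset just past the end of the match
    push @positions, pos($str);    # where the next /g match will start
}

print "starts:    @starts\n";
print "ends:      @ends\n";
print "positions: @positions\n";
```

After a successful match, pos() and $+[0] agree, and for a one-character pattern each is one more than $-[0].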

--
Jeff "japhy" Pinyan         %  How can we ever be the sold short or
RPI Acacia Brother #734     %  the cheated, we who for every service
http://japhy.perlmonk.org/  %  have long ago been overpaid?
http://www.perlmonks.org/   %    -- Meister Eckhart
