On Sun 24 Sep, Hugo wrote:
> In <[EMAIL PROTECTED]>, Richard Proctor
> writes
> :
> :TomC's perl storm has:
> :
> :> Figure out way to do
> :>
> :> /$e1 $e2/
> :>
> :> safely, where $e1 might have '(foo) \1' in it.
> :> and $e2 might have '(bar) \1' in it. Those won't work.
> :
> :If e1 and e2 are qr// type things the answer might be to localise
> :the backref numbers in each qr// expression.
> :
> :If they are not qr//s it might still be possible to achieve if the
> :expansion of variables in regexes is done by the regex compiler it
> :could recognise this context and localise the backrefs.
> :
> :Any code like this is going to have real problems with $1 etc if used
> :later. Use of assignment in a regex and named backrefs (RFC 112) would
> :make this a lot safer.
>
> I think it is reasonable to ask whether the current handling of qr{}
> subpatterns is correct:
>
> perl -wle '$a=qr/(a)\1/; $b=qr/(b).*\1/; /$a($b)/g and print join ":", $1, pos for "aabbac"'
> a:5
>
> I'm tempted to suggest it isn't; that the paren count should be local
> to each qr{}, so that the above prints 'bb:4'. I think that most people
> currently construct their qr{} patterns as if they are going to be
> handled in isolation, without regard to the context in which they are
> embedded - why else do they override the embedder's flags if not to
> achieve that?
This seems the right way to go.
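
To make the current behaviour concrete, here is Hugo's one-liner unrolled
into a small script. The variable names ($re_a, $re_b, $str) are mine, not
from the original; the patterns and the input string are unchanged. Because
the interpolated qr// objects are recompiled as part of one pattern, the \1
inside the second one refers to the first group of the whole pattern:

```perl
use strict;
use warnings;

# The two sub-patterns from the example, each written as if \1
# referred to its own first group.
my $re_a = qr/(a)\1/;
my $re_b = qr/(b).*\1/;

my $str = "aabbac";
my ($cap, $pos);

# Groups are numbered across the combined pattern, so the \1 inside
# $re_b means the (a) group from $re_a, not the (b) group.
if ($str =~ /$re_a($re_b)/g) {
    ($cap, $pos) = ($1, pos($str));
}

print "$cap:$pos\n";    # prints a:5
```

With paren counts local to each qr{}, $1 would be the outer ($re_b) group
and the same script would print bb:4, as suggested above.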
> The problem then becomes: do we provide a mechanism to access the
> nested backreferences outside of the qr{} in which they were referenced,
> and if so what syntax do we offer to achieve that? I don't have an answer
> to the latter, which tempts me to answer 'no' to the former for all the
> wrong reasons. I suspect (and suggest) that complication is the only
> reason we don't currently have the behaviour I suggest the rest of the
> semantics warrant - that backreferences are localised within a qr().
The suggestions in RFC 112, assignment within the regex and named
backreferences, provide a solution for anyone trying to get at a backref
inside a nested qr(). I think this is a reasonable way forward.
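
A rough sketch of how named groups sidestep the numbering clash. Note the
assumption: this uses the (?<name>...) and \k<name> syntax found in perl
5.10 and later, which is not necessarily the syntax RFC 112 proposes, and
the names a and b are only illustrative:

```perl
use strict;
use warnings;

# Assumption: perl 5.10+ named captures, not RFC 112's own syntax.
# Each sub-pattern refers back to its own group by name, so embedding
# it in a larger pattern cannot change what the backreference means.
my $re_a = qr/(?<a>a)\k<a>/;
my $re_b = qr/(?<b>b).*\k<b>/;

my $str = "aabbac";
my $out = "";

if ($str =~ /$re_a($re_b)/g) {
    # %+ exposes the named captures to the code outside the regex.
    $out = join ":", $+{a}, $+{b}, pos($str);
}

print "$out\n";    # prints a:b:4
```

The nested captures remain reachable from outside through %+, which covers
the "access from outside the qr{}" question without relying on numbers.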
> I lie: the other reason qr{} currently doesn't behave like that is that
> when we interpolate a compiled regexp into a context that requires it be
> recompiled, we currently ignore the compiled form and act only on the
> original string. Perhaps this is also an insufficiently intelligent thing
> to do.
>
> Hugo
>
Yes, this and MJD's comment about the reentrant regex engine. I will stick
this in an RFC in a few minutes.
Richard
--
[EMAIL PROTECTED]