Re: RFC 150 (v1) Extend regex syntax to provide for return of a hash of matched subpatterns

Richard Proctor Fri, 08 Sep 2000 11:19:14 -0700
On Fri 08 Sep, Kevin Walker wrote:
> (This thread has been inactive for a while.  See 
> http://www.mail-archive.com/perl6-language-regex@perl.org/index.html#00015 
> for its short history.)
> 
> Long ago Tom Christiansen wrote:
> 
> >This is useful in that it would stop being number dependent.
> >For example, you can't now safely say
> >
> >    /$var (foo) \1/
> >
> >and guarantee for arbitrary contents of $var that you have
> >the right number backref anymore.
> >
> >If I recall correctly, the Python folks addressed this.  One
> >might check that.
> 
> Python does, indeed, have something similar.  See (?P<name>...) and 
> (?P=name) at http://www.python.org/doc/current/lib/re-syntax.html .
> 
> Tom's comment points out a shortcoming in the original RFC:  There's 
> no way to make, by name, a backref to a named group.  I propose to 
> fix that in a revised version of RFC 150.  I don't have strong 
> feelings about what the syntax should be.  Here is one idea:
> 
>    The substring matched by (?%some_name: ... ) can be referred to as 
> $%{some_name}.
> 
> That's kind of ugly, so other suggestions are welcome.  (The idea was 
> to do something analogous to $1, $2, etc.  Unfortunately ${some_name} 
> is already taken.  Maybe $_{some_name} would also work -- though 
> %_ seems too valuable to use for this limited purpose.)
> 
> 

Kevin,

I have been having similar thoughts about my RFC 112 (assignment within
a regex).  At present it is worded that it does not generate the back
reference, but I now have some reservations.

Comparing the two RFCs, there is some common ground, but there will be
cases where people want your hash and cases where people want explicit
variables.  Using RFC 112 you can do hash assignment, but it would not
clear the hash beforehand, whereas your hash assignment would (I assume)
set the hash to ONLY those elements from the regex.

Your %hash = $string =~ /..(?%foo=..)/
is essentially the same as my %hash = (); $string =~ /..(?$hash{foo}=..)/
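
As a point of comparison only (this is Python, not either proposed Perl
syntax), the difference between a fresh hash and assignment into an
existing one can be sketched with Python's named groups.  The pattern,
the sample string and the variable names are made up for illustration.

```python
import re

# Two named subpatterns; the group names become the keys of the result.
pattern = re.compile(r"(?P<key>\w+)=(?P<value>\w+)")

# RFC 150 style: the match yields a fresh mapping that holds ONLY the
# named groups from this regex.
fresh = pattern.search("colour=red").groupdict()

# RFC 112 style: assignment into an existing mapping, which is not
# cleared first, so earlier entries survive alongside the new ones.
existing = {"stale": "left over"}
existing.update(pattern.search("colour=red").groupdict())

print(fresh)
print(existing)
```

Here `fresh` contains just `key` and `value`, while `existing` still
carries the `stale` entry, which is the distinction drawn above.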

Do we need both?  I think the answer is possibly, but whatever is
decided about back references should apply to both.

My thought on back references is that if a variable is used again later
in the regex, the assignment takes place and the variable is simply
referred to.

Thus $string =~ m#<(?$foo=\w+).*?</$foo>#;

The parser notices the reuse of $foo and performs the actual assignment
as and when foo is matched (or at least acts as if it does).
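
For reference, the same tag-matching idea can be tried today with
Python's (?P<name>...) and (?P=name), which also shows the numbering
problem Tom raised.  This is only a sketch in Python, not the proposed
Perl syntax; the helper `build` and the sample text are hypothetical.

```python
import re

text = "x<b>bold</b>"

def build(prefix):
    """Prepend an interpolated prefix (like $var) to both pattern styles."""
    # Numbered backref: \1 means "first group", wherever that ends up.
    numbered = re.compile(prefix + r"<(\w+)>.*?</\1>")
    # Named backref: (?P=tag) always refers to the group called "tag".
    named = re.compile(prefix + r"<(?P<tag>\w+)>.*?</(?P=tag)>")
    return numbered, named

# Prefix without a capture group: both styles work.
plain_numbered, plain_named = build("x")
# Prefix with its own capture group: \1 now points at the prefix.
grouped_numbered, grouped_named = build("(x)")

print(plain_numbered.search(text) is not None)
print(grouped_numbered.search(text) is not None)
print(grouped_named.search(text).group("tag"))
```

With the grouping prefix, the numbered pattern silently stops matching
because \1 now refers to the prefix, while the named pattern still
finds the tag "b".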

Richard

