Re: is \1 vs $1 a necessary distinction?
On Wed, 27 Sep 2000 10:34:48 -0500, Jonathan Scott Duff wrote:

>If $1 could be made to work properly on the LHS of s///, I'd vote for
>that being The Way.

I disagree, because \1 is different from a variable $foo in at least two
ways:

 * $foo is compiled into /$foo/ before anything is matched. \1 is a
   repetition of what was just matched; this is dynamic interpolation
   instead of static.

 * If $foo contains metacharacters, they are treated as metacharacters.
   For example, if $foo is "a.b", then /$foo/ can match "axb". /\1/,
   OTOH, can only match the LITERAL string that $1 captured.

With $foo='a.b', /($foo)!$foo/ and /($foo)!\1/ will not match the same
set of things.

"\1" is more nearly equivalent to "\Q$1\E".

Therefore, I don't want $1 on the LHS to be the standard syntax.

--
        Bart.
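Bart's two points can be checked in Perl 5. What follows is only a sketch: $foo and the two patterns come from the message, while the test strings and the other variable names are made up for illustration.

```perl
use strict;
use warnings;

my $foo = 'a.b';

# $foo is interpolated when the pattern is built, so its "." is a live
# metacharacter in both places it appears.
my $interp = qr/($foo)!$foo/;

# \1 must repeat exactly the text that group 1 captured in this match.
my $backref = qr/($foo)!\1/;

# Illustrative inputs (not from the thread).
my %result;
for my $s ('a.b!a.b', 'axb!axb', 'axb!ayb') {
    $result{$s} = {
        interp  => ($s =~ $interp  ? 1 : 0),
        backref => ($s =~ $backref ? 1 : 0),
    };
    printf "%-8s interpolated=%d backref=%d\n",
        $s, $result{$s}{interp}, $result{$s}{backref};
}

# The "\Q$1\E" comparison: quoting an earlier capture makes it match
# literally, which is how \1 treats the current capture.
'a.b' =~ /(.+)/ or die "setup match failed";
my $quoted = qr/\Q$1\E/;
my $quoted_hits_axb = 'axb' =~ $quoted ? 1 : 0;
```

Only the last input separates the two patterns: "axb!ayb" satisfies the interpolated form but not the backreference form.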
Re: is \1 vs $1 a necessary distinction?
Dave Storrs <[EMAIL PROTECTED]> writes:

> On 27 Sep 2000, Piers Cawley wrote:
>
> > > Do we *want* to maintain \1? Why have two notations to do the
> >
> > I'm kind of curious about what happens when you want to do, say:
> >
> >     if (m/(\S+)/) {
> >         $reg = qr{<(em|i|b)>($1)};
> >     }
> >
> > where the $1 in the regex quote is referring to $1 from the previous
> > regex match.
>
> Well, how about this:
>
>     $reg = qr{<(em|i|b)>(${P1})};
> NOTE:                      ^
>
> If you assume that $1 and ${1} are equivalent (which makes it
> possible to have as many backrefs as you want), then you could say that,
> if the first character after the { is a P, it means "in the previous regex
> match."

Oh good ghod. That is *vile*.

--
Piers
Re: is \1 vs $1 a necessary distinction?
On Wed 27 Sep, Dave Storrs wrote:
>
> On Wed, 27 Sep 2000, Richard Proctor wrote:
>
> > > Both \1 and $1 refer to what is matched by the first set of parens in a
> > > regex. AFAIK, the only difference between these two notations is that
> > > \1 is used within the regex itself and $1 is used outside of the
> > > regex. Is there any reason not to standardize these down to one
> > > notation (i.e., eliminate one or the other)?
> >
> > I think this is fixable.
>
> The way you phrase that makes it sound that other people perceive
> this as a problem as well, which gives me all sorts of warm fuzzies. :>
>
> > The only real need for this at the moment is to overcome limitations in
> > the order of expansion of regexes. RFCs 112, 166, 276... all depend on
> > fixing this.
>
> Ok, here's another question. How the _HELL_ does everyone else on
> this bloody list keep track of every detail in every frigging RFC? Some
> random comment comes up, and someone will go, "Oh, the third paragraph of
> the second section in RFC 0x97A already mentioned this as a parenthetical
> aside, despite the fact that its title and primary topic had no relation
> to the issue." I still have (mumble-mumble) RFCs that I haven't even had
> time to *read*, let alone memorize every detail of!

In this context I was the author of, guess what, 112, 166 and 276 (though
I admit to having to look up the number of the last one).

> Grr*grumble, grumble, moan, whinge*
>
> Ok, back to rationality now.
>
> > If the regex compiler gets in before the expansion of the variables to
> > make these work, it could handle $1 in all cases. \1 can be retained for
> > compatibility.
>
> Do we *want* to maintain \1? Why have two notations to do the
> same thing when one is clearly superior? (\1 can only go up to \9 while
> the other could theoretically go to ${...}.) Perl6 is breaking
> backwards compatibility and eliminating all deprecated features...let's
> get rid of \n as backreference notation.

The principal issue would be what to do about uses of $1 on the LHS that
rely on its current meaning, which is rather good for obfuscated code but
not terribly kind to normal programming.

Note that RFC 112 covers assignment within a regex, naming rather than
numbering the brackets one wishes to capture; it also covers named back
references.

Currently $1 is expanded by the quoting before the regex compiler gets to
play; the regex compiler sees the \1 and knows what to do. I am reasonably
happy with \ meaning "refer back"; with the numbers I am not.

Richard

--
[EMAIL PROTECTED]
Re: is \1 vs $1 a necessary distinction?
On 27 Sep 2000, Piers Cawley wrote:

> > Do we *want* to maintain \1? Why have two notations to do the
>
> I'm kind of curious about what happens when you want to do, say:
>
>     if (m/(\S+)/) {
>         $reg = qr{<(em|i|b)>($1)};
>     }
>
> where the $1 in the regex quote is referring to $1 from the previous
> regex match.

Well, how about this:

    $reg = qr{<(em|i|b)>(${P1})};
NOTE:                      ^

If you assume that $1 and ${1} are equivalent (which makes it
possible to have as many backrefs as you want), then you could say that,
if the first character after the { is a P, it means "in the previous regex
match."

Dave
Re: is \1 vs $1 a necessary distinction?
> "Jonathan" == Jonathan Scott Duff <[EMAIL PROTECTED]> writes:

Jonathan> On Wed, Sep 27, 2000 at 08:15:53AM -0700, Dave Storrs wrote:
>> Both \1 and $1 refer to what is matched by the first set of parens in a
>> regex. AFAIK, the only difference between these two notations is that \1
>> is used within the regex itself and $1 is used outside of the regex. Is
>> there any reason not to standardize these down to one notation (i.e.,
>> eliminate one or the other)?

Jonathan> \1 can be used on the LHS of a s/// whereas $1 there probably won't do
Jonathan> what you expect. Also, \1, \2, \3 only takes you as far as \9 ;-)

Wrong. If you have at least 10 parens visible so far, \10 works just fine.

Jonathan> If $1 could be made to work properly on the LHS of s///, I'd vote for
Jonathan> that being The Way.

It can't ever. It means $1 from the previous match.

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<[EMAIL PROTECTED]> http://www.stonehenge.com/merlyn/
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
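The \10 point can be tried out as follows. This is a sketch with made-up patterns and inputs; it relies on the rule documented in perlre that \10 is a backreference only when at least ten groups have been opened before it, and an octal escape otherwise.

```perl
use strict;
use warnings;

# Ten groups precede \10, so it is a backreference to group 10 ("j").
my $ten = qr/(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)\10/;
my $ten_ok  = 'abcdefghijj' =~ $ten ? 1 : 0;   # trailing "j" repeats group 10
my $ten_bad = 'abcdefghija' =~ $ten ? 1 : 0;   # trailing "a" does not

# Only one group precedes \10 here, so Perl reads it as octal 010,
# i.e. the backspace character, not as a backreference.
my $one = qr/(a)\10/;
my $one_ok = "a\x08" =~ $one ? 1 : 0;

print "ten_ok=$ten_ok ten_bad=$ten_bad one_ok=$one_ok\n";
```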
Re: is \1 vs $1 a necessary distinction?
Dave Storrs <[EMAIL PROTECTED]> writes:

> On Wed, 27 Sep 2000, Richard Proctor wrote:
>
> > > Both \1 and $1 refer to what is matched by the first set of parens in a
> > > regex. AFAIK, the only difference between these two notations is that \1
> > > is used within the regex itself and $1 is used outside of the regex. Is
> > > there any reason not to standardize these down to one notation (i.e.,
> > > eliminate one or the other)?
> >
> > I think this is fixable.
>
> The way you phrase that makes it sound that other people perceive
> this as a problem as well, which gives me all sorts of warm fuzzies. :>
>
> > The only real need for this at the moment is to
> > overcome limitations in the order of expansion of regexes. RFCs 112, 166,
> > 276... all depend on fixing this.
>
> [...]
>
> > If the regex compiler gets in before the
> > expansion of the variables to make these work, it could handle $1 in all
> > cases. \1 can be retained for compatibility.
>
> Do we *want* to maintain \1? Why have two notations to do the
> same thing when one is clearly superior? (\1 can only go up to \9 while
> the other could theoretically go to ${...}.) Perl6 is breaking
> backwards compatibility and eliminating all deprecated features...let's
> get rid of \n as backreference notation.

I'm kind of curious about what happens when you want to do, say:

    if (m/(\S+)/) {
        $reg = qr{<(em|i|b)>($1)};
    }

    while (<>) {
        next unless m{$reg};
        ...
    }

where the $1 in the regex quote is referring to $1 from the previous
regex match.

--
Piers
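For reference, here is roughly how Piers's case behaves in Perl 5 as it stands. It is a sketch under assumptions: the input strings are invented, a fixed list stands in for the while (<>) loop, and the \Q...\E around $1 is an addition so that the earlier capture is matched literally.

```perl
use strict;
use warnings;

my $reg;
if ('hello world' =~ m/(\S+)/) {
    # $1 ("hello") is interpolated now, when qr{} builds the pattern.
    # The parens around it become group 2 of the *new* regex.
    $reg = qr{<(em|i|b)>(\Q$1\E)};
}

# Illustrative input lines standing in for the while (<>) loop.
my @lines = ('<em>hello</em>', '<b>world</b>', '<i>hello</i>');
my @hits  = grep { /$reg/ } @lines;

print "$_\n" for @hits;
```

Because the interpolation happens once, at qr{} time, later matches that reset $1 do not change what $reg looks for.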
Re: is \1 vs $1 a necessary distinction?
> "DS" == Dave Storrs <[EMAIL PROTECTED]> writes:

  DS> Both \1 and $1 refer to what is matched by the first set of parens
  DS> in a regex. AFAIK, the only difference between these two notations
  DS> is that \1 is used within the regex itself and $1 is used outside
  DS> of the regex. Is there any reason not to standardize these down
  DS> to one notation (i.e., eliminate one or the other)?

because $1, having been set previously, will be interpolated INTO the new
regex. so you have to have another notation to refer to grabbed stuff
from the current regex.

uri

--
Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com
SYStems ARCHitecture, Software Engineering, Perl, Internet, UNIX Consulting
The Perl Books Page --- http://www.sysarch.com/cgi-bin/perl_books
The Best Search Engine on the Net -- http://www.northernlight.com
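A small sketch of the interpolation Uri describes, using an invented input string: the same substitution is written once with $1 and once with \1 on the LHS.

```perl
use strict;
use warnings;

# An earlier, unrelated match leaves "foo" in $1.
'foo' =~ /(\w+)/ or die "setup match failed";

# $1 is interpolated before matching starts, so this pattern is
# effectively /(\w+)-foo/ and it skips over "bar-bar".
(my $with_dollar = 'bar-bar foo-foo') =~ s/(\w+)-$1/X/;

# \1 refers to whatever group 1 captured in the current match,
# so the first doubled word is replaced.
(my $with_backslash = 'bar-bar foo-foo') =~ s/(\w+)-\1/X/;

print "$with_dollar\n";
print "$with_backslash\n";
```

The two results differ, which is the reason a second notation is needed inside the pattern.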
Re: is \1 vs $1 a necessary distinction?
On Wed, 27 Sep 2000, Richard Proctor wrote:

> > Both \1 and $1 refer to what is matched by the first set of parens in a
> > regex. AFAIK, the only difference between these two notations is that \1
> > is used within the regex itself and $1 is used outside of the regex. Is
> > there any reason not to standardize these down to one notation (i.e.,
> > eliminate one or the other)?
>
> I think this is fixable.

The way you phrase that makes it sound that other people perceive
this as a problem as well, which gives me all sorts of warm fuzzies. :>

> The only real need for this at the moment is to
> overcome limitations in the order of expansion of regexes. RFCs 112, 166,
> 276... all depend on fixing this.

Ok, here's another question. How the _HELL_ does everyone else on
this bloody list keep track of every detail in every frigging RFC? Some
random comment comes up, and someone will go, "Oh, the third paragraph of
the second section in RFC 0x97A already mentioned this as a parenthetical
aside, despite the fact that its title and primary topic had no relation
to the issue." I still have (mumble-mumble) RFCs that I haven't even had
time to *read*, let alone memorize every detail of!

Grr*grumble, grumble, moan, whinge*

Ok, back to rationality now.

> If the regex compiler gets in before the
> expansion of the variables to make these work, it could handle $1 in all
> cases. \1 can be retained for compatibility.

Do we *want* to maintain \1? Why have two notations to do the
same thing when one is clearly superior? (\1 can only go up to \9 while
the other could theoretically go to ${...}.) Perl6 is breaking
backwards compatibility and eliminating all deprecated features...let's
get rid of \n as backreference notation.

Dave
Re: is \1 vs $1 a necessary distinction?
On Wed, 27 Sep 2000, Jonathan Scott Duff wrote:

> If $1 could be made to work properly on the LHS of s///, I'd vote for
> that being The Way.

That was pretty much my thought?
Re: is \1 vs $1 a necessary distinction?
From: "Dave Storrs" <[EMAIL PROTECTED]>

> Both \1 and $1 refer to what is matched by the first set of parens in a
> regex. AFAIK, the only difference between these two notations is that \1
> is used within the regex itself and $1 is used outside of the regex. Is
> there any reason not to standardize these down to one notation (i.e.,
> eliminate one or the other)?

\1 came from sed and friends. I think an early driving force was
maintaining familiarity with things like awk and sed. Even today there are
still people who switch to and from other regex languages. Emacs is the
most common for me (though I still dabble with awk). I don't see a real
advantage in taking out \1; doing so is very likely to break legacy code
needlessly, and to confuse the various developers who have a habit of
using \1.

On the other hand, the use of $1 with substitutions is important for
consistency. When you write s/../.../e, you're going to need to use a
substitution variable; "\1" just doesn't fit.

    s/(...)/pre\1post/;    works fine
    s/(...)/pre$1post/;    is the question.

I tend to use it only because I sometimes switch to:

    s/(...)/func() . "$1post"/e;

for various reasons. I just try to standardize on $1, but that's just me.

Additionally, the use of $1 in the matching regex is ambiguous, as in:

    m/(...).*?$1/;

Does it refer to the internal set of (..), or does it mean the previous
value of $1 before this match? This becomes non-obvious to the observer in
the following case:

    m/($keyword).*?$1/;

Here, our mindset is substitution of external variables; the casual
(non-seasoned) observer might not understand that it really means:

    m/($keyword).*?\1/;

My argument is that both \1 and $1 have their places, and limiting to one
type can be troublesome. Plus, TMTOWTDI. :)

-Michael
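Michael's switch from a plain replacement to /e can be sketched like this. The body of func() is a hypothetical stand-in, since the message does not say what it does, and the input string is invented.

```perl
use strict;
use warnings;

# Hypothetical helper standing in for func() in the message.
sub func { return 'pre' }

# Plain replacement: the right-hand side behaves like a double-quoted
# string, so $1 is interpolated after the match.
(my $plain = 'abcdef') =~ s/(...)/pre$1post/;

# With /e the right-hand side is evaluated as Perl code, so the capture
# has to be spelled $1.
(my $evaled = 'abcdef') =~ s/(...)/func() . "$1post"/e;

print "$plain\n$evaled\n";
```

This fits the caution in perlre's section on using \1 instead of $1: the sed-style \1 is tolerated in a plain replacement but causes trouble once /e is added, because there \1 is ordinary Perl code rather than a backreference.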
Re: is \1 vs $1 a necessary distinction?
Dave,

> Both \1 and $1 refer to what is matched by the first set of parens in a
> regex. AFAIK, the only difference between these two notations is that \1
> is used within the regex itself and $1 is used outside of the regex. Is
> there any reason not to standardize these down to one notation (i.e.,
> eliminate one or the other)?

I think this is fixable. The only real need for this at the moment is to
overcome limitations in the order of expansion of regexes. RFCs 112, 166,
276... all depend on fixing this. If the regex compiler gets in before the
expansion of the variables to make these work, it could handle $1 in all
cases. \1 can be retained for compatibility.

Richard
Re: is \1 vs $1 a necessary distinction?
On Wed, Sep 27, 2000 at 08:15:53AM -0700, Dave Storrs wrote:
> Both \1 and $1 refer to what is matched by the first set of parens in a
> regex. AFAIK, the only difference between these two notations is that \1
> is used within the regex itself and $1 is used outside of the regex. Is
> there any reason not to standardize these down to one notation (i.e.,
> eliminate one or the other)?

\1 can be used on the LHS of a s/// whereas $1 there probably won't do
what you expect. Also, \1, \2, \3 only takes you as far as \9 ;-)

If $1 could be made to work properly on the LHS of s///, I'd vote for
that being The Way.

-Scott
--
Jonathan Scott Duff
[EMAIL PROTECTED]
is \1 vs $1 a necessary distinction?
Both \1 and $1 refer to what is matched by the first set of parens in a
regex. AFAIK, the only difference between these two notations is that \1
is used within the regex itself and $1 is used outside of the regex. Is
there any reason not to standardize these down to one notation (i.e.,
eliminate one or the other)?

Dave