Mailing list indexing project
I'm working on an annotated version of the mailing list so that old postings can be more easily researched. My very primitive implementation is: http://www.ajs.com/~ajs/cgi-bin/p6l-index.cgi The input datafile is: http://www.ajs.com/~ajs/p6l.dat I'm using Google Groups as a back-end of sorts (all of the links go to them for content). Does anyone have any thoughts on how best to proceed? Anyone want to host this thing once it's ready? Indexing is, as always, a massive undertaking, but I figure someone needs to do it. Well, I have to get up at 5, so goodnight all!
character classes in p6 rules
I now have a basic implementation for enumerated character classes in the grammar engine (i.e., <[xyz]>, <-[xyz]>, <[x..z]>, and <-[x..z]>). I didn't see it specified anywhere, but are the \d, \D, \s, \S, etc. metacharacters still supposed to work inside of a enumerated character class, as they do in Perl 5? Or in p6 do we always use <++[xyz]>, <->, <+>, <->, etc.? (Yes, I know that normally the absence of any spec to the contrary indicates that we're still using p5 semantics, but this one is worth verification for me.) While I'm on the subject, let me just ramble a bit -- there are times when , , , etc. give me a bad feeling -- they look a little too much like subrules to me, especially when looking at <+> and the like. I keep wondering about things like <+> and <->. And something like C<< rx / * / >> may generate a lot of not-very-useful one-character captures into $/ , so that we'll typically want to get in the habit of writing rx / * / rx / <+>* / and then have the engine recognize when this occurs so it can optimize to a much faster character class op rather than a lot of calls to a separate subrule. Plus, <+> just looks plain ugly and unbalanced to me. Somehow I'd like to get rid of those inner angles, so that we always use <+alpha>, <+digit>, <-sp>, <-punct> to indicate named character classes, and specify combinations with constructions like <+alpha+punct-[aeiou]> and <+word-[_]>. We'd still allow <[abc]> as a shortcut to <+[abc]>. To me this looks cleaner overall, makes it clear we're doing a one-character non-capturing match, and may enable a few optimization possibilities. (I'm sure that with enough effort we can get equivalent optimizations out of the existing syntax, and we may need them anyway in the long run, but this might simplify that a fair bit.) I haven't thought far ahead to the question of whether character classes would continue to occupy the same namespace as rules (as they do now) or if they become specialized kinds of rules or what. 
I'll just leave it at this for now and see what the rest of p6l thinks. Pm
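The combined form proposed above, e.g. <+alpha+punct-[aeiou]>, reads naturally as set arithmetic on characters. The following is only a rough Python model of that idea, not PGE's implementation: the ASCII-only class table, the `char_class` and `enumerated` helpers, and the rule that a leading `-` starts from the full ASCII set are all assumptions made for illustration.

```python
import re
import string

# Hypothetical named classes, ASCII-only to keep the sketch short.
NAMED = {
    "alpha": set(string.ascii_letters),
    "digit": set(string.digits),
    "punct": set(string.punctuation),
    "sp": set(string.whitespace),
    "word": set(string.ascii_letters + string.digits + "_"),
}
UNIVERSE = {chr(code) for code in range(128)}

# One term: a sign followed by either [enumerated chars] or a class name.
TERM = re.compile(r"([+-])(?:\[([^\]]*)\]|(\w+))")


def enumerated(body):
    """Expand an enumerated body such as 'xyz' or 'x..z' into a set."""
    chars = set()
    i = 0
    while i < len(body):
        if body[i + 1:i + 3] == ".." and i + 3 < len(body):
            # Range form: first character, '..', last character.
            chars.update(chr(c) for c in range(ord(body[i]), ord(body[i + 3]) + 1))
            i += 4
        else:
            chars.add(body[i])
            i += 1
    return chars


def char_class(spec):
    """Evaluate the inside of <...>, e.g. '+alpha+punct-[aeiou]'."""
    if spec.startswith("["):
        spec = "+" + spec  # <[abc]> as a shortcut for <+[abc]>
    # Assumption: a class that starts with '-' subtracts from everything.
    chars = set(UNIVERSE) if spec.startswith("-") else set()
    pos = 0
    while pos < len(spec):
        term = TERM.match(spec, pos)
        if term is None:
            raise ValueError(f"cannot parse character class at offset {pos}")
        sign, body, name = term.groups()
        members = enumerated(body) if body is not None else NAMED[name]
        chars = chars | members if sign == "+" else chars - members
        pos = term.end()
    return chars
```

Because the whole construct reduces to one set, a membership test such as `"b" in char_class("+alpha+punct-[aeiou]")` is a single lookup, which is the kind of one-character, non-capturing match the proposal is after.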
Re: (1,(2,3),4)[2]
On Wed, May 11, 2005 at 06:24:38PM -0400, Aaron Sherman wrote: : I'm confused as well. How does that play with Larry's comment: : : http://groups-beta.google.com/group/perl.perl6.language/browse_frm/thread/54a1135c012b97bf/d17b4bc5ae7db058?q=list+comma&rnum=5&hl=en#d17b4bc5ae7db058 Well, that approach would also work. I'm just not sure it's what people would expect. It's a little retroactive for .[] to change the context of the expressions in (). These days I think I'd rather have something out front that specifically says you want a list of scalars. Perhaps scalar(1,(2,3),4) should be our list of scalars context, and produce [1,[2,3],4]. Or maybe something else should indicate LoS. Larry
Re: (1,(2,3),4)[2]
On Thu, May 12, 2005 at 07:04:48AM +0800, Autrijus Tang wrote: : Please sanity-check. :-) Looks good to me. Though that should perhaps not be confused with sanity. Larry
Re: Nested captures
> That's a very interesting generalization. There are plenty of *other* cases where one wants an ordinal, or some other kind of $n-1 value. If C<th> (and C<st>, C<nd>, C<rd>) was a "subtract one" operator, you could write: my $n = prompt "How many elems? "; print @array[1st..($n)th]; instead of the out-by-one-prone: print @array[0..$n-1] Not to be insolent here, but could the C<th> ops be 'smart' enough to recognize the various zero-based (vs. one-based?) situations and behave accordingly? Something like checking the (deprecated) $[ to see if they are supposed to be n or n-1? a Andy Bach, Sys. Mangler Internet: [EMAIL PROTECTED] VOICE: (608) 261-5738 FAX 264-5932 "Outside of a dog, a book is man's best friend. Inside of a dog, it's too dark to read." Groucho Marx
Re: single element lists
On Wed, May 11, 2005 at 11:45:12AM -0500, Jonathan Scott Duff wrote: > We're discussing the proper semantics of (1)[0] on #perl6. Here's > where we're at so far: > > 1. specialise ()[] to parse as (,)[] > 2. scalars are singleton lists, so ()[] naturally > 3. make (1)[0] die horribly. (1)[0] means 1[0], which is probably undefined, so it dies. That could be detected at compile time. (my $foo = 1)[0] means $foo[0], which would die at runtime, unless there's type inference going on. In any case, I don't see a List in ()[] without a list-creating expression in the (). > -Scott -- wolverian
Re: (1,(2,3),4)[2]
(reformatted to keep initialization with test) Autrijus Tang skribis 2005-05-12 7:04 (+0800): > my @a = (1,2,[3,4]); > is(+@a, 3, 'Array length, nested []'); ok 1 > my $a = (1,2,[3,4]); > is(+$a, 3, 'Array ref length, nested []'); ok 2 > my @b = [1,2,[3,4]]; > is(+@b, 1, 'Array length, nested [], outer []s'); ok 3 > my $b = [1,2,[3,4]]; > is(+$b, 3, 'Array ref length, nested [], outer []s'); ok 4 > my @c = (1,2,(3,4)); > is(+$c, 4, 'Array ref length, nested ()'); ok 5 > my $c = (1,2,(3,4)); > is(+@c, 4, 'Array length, nested ()'); ok 6 > my @d = [1,2,(3,4)]; > is(+@d, 1, 'Array length, nested (), outer []s'); ok 7 > my $d = [1,2,(3,4)]; > is(+$d, 4, 'Array ref length, nested (), outer []s'); ok 8 > Please sanity-check. :-) All sane! :) Juerd -- http://convolution.nl/maak_juerd_blij.html http://convolution.nl/make_juerd_happy.html http://convolution.nl/gajigu_juerd_n.html
Re: (1,(2,3),4)[2]
On Wed, May 11, 2005 at 03:00:15PM -0600, Luke Palmer wrote: > On 5/11/05, Autrijus Tang <[EMAIL PROTECTED]> wrote: > > In a somewhat related topic: > > > > pugs> (1,(2,3),4)[2] > > 4 > > > > Because the invocant to .[] assumes a Singular context. > > Right, but the *inside* of the invocant is still a list, so it's in > list context. I think that line should return 3. Okay. Here is the current (all-passing) t/data_types/nested_arrays.t: { my @a = (1,2,[3,4]); my $a = (1,2,[3,4]); my @b = [1,2,[3,4]]; my $b = [1,2,[3,4]]; my @c = (1,2,(3,4)); my $c = (1,2,(3,4)); my @d = [1,2,(3,4)]; my $d = [1,2,(3,4)]; is(+@a, 3, 'Array length, nested []'); is(+$a, 3, 'Array ref length, nested []'); is(+@b, 1, 'Array length, nested [], outer []s'); is(+$b, 3, 'Array ref length, nested [], outer []s'); is(+$c, 4, 'Array ref length, nested ()'); is(+@c, 4, 'Array length, nested ()'); is(+@d, 1, 'Array length, nested (), outer []s'); is(+$d, 4, 'Array ref length, nested (), outer []s'); } Please sanity-check. :-) Thanks, /Autrijus/
Re: Nested captures
Autrijus Tang wrote: On Thu, May 12, 2005 at 12:37:06AM +0200, Fagyal Csongor wrote: Damian Conway wrote: print @array[1st..($n)th]; Sounds cool, but what about $n = 0; ? Then it would be 0..-1, an empty range. Yep, but I mean in general isn't it confusing that the 0th element is actually the -1st (in Perl5 sense)? Does this mean that the -1st (Perl6) element would be the -2nd (Perl5)? Or are (-n)th elements invalid? - Fagzal
Re: Nested captures
On Thu, May 12, 2005 at 12:37:06AM +0200, Fagyal Csongor wrote: > Damian Conway wrote: > > >print @array[1st..($n)th]; > > Sounds cool, but what about $n = 0; ? Then it would be 0..-1, an empty range. /Autrijus/
Re: Nested captures
Damian Conway wrote: print @array[1st..($n)th]; Sounds cool, but what about $n = 0; ? - Fagzal
Re: (1,(2,3),4)[2]
On Wed, 2005-05-11 at 17:48, Matt Fowles wrote: > On 5/11/05, Luke Palmer <[EMAIL PROTECTED]> wrote: > > On 5/11/05, Autrijus Tang <[EMAIL PROTECTED]> wrote: > > > In a somewhat related topic: > > > > > > pugs> (1,(2,3),4)[2] > > > 4 > > > > > > Because the invocant to .[] assumes a Singular context. > > > > Right, but the *inside* of the invocant is still a list, so it's in > > list context. I think that line should return 3. > > I am confused as to why exactly this is the case. Are you saying that > nested lists like this flatten? That would certainly catch me off > guard. Would you mind explaining that to me a little more? I'm confused as well. How does that play with Larry's comment: http://groups-beta.google.com/group/perl.perl6.language/browse_frm/thread/54a1135c012b97bf/d17b4bc5ae7db058?q=list+comma&rnum=5&hl=en#d17b4bc5ae7db058 -- Aaron Sherman <[EMAIL PROTECTED]> Senior Systems Engineer and Toolsmith "It's the sound of a satellite saying, 'get me down!'" -Shriekback
Re: Nested captures
Larry mused: I'm wondering if it's just a cardinal/ordinal thing, and we can just translate $7 to $<7th>. Then we don't have to guess where to insert a .flat or :flat. That's a very interesting generalization. There are plenty of *other* cases where one wants an ordinal, or some other kind of $n-1 value. If C<th> (and C<st>, C<nd>, C<rd>) was a "subtract one" operator, you could write: my $n = prompt "How many elems? "; print @array[1st..($n)th]; instead of the out-by-one-prone: print @array[0..$n-1] Hmmm. Damian
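To make the "subtract one" reading concrete, here is a small Python model of the ordinal idea. It is a sketch only: `th` and `ordinal_slice` are hypothetical helper names, and the inclusive range mirrors Perl's `..` rather than anything specified for Perl 6.

```python
def th(ordinal):
    """Hypothetical 'subtract one' operator: 1st -> 0, 2nd -> 1, and so on."""
    return ordinal - 1


def ordinal_slice(items, first, last):
    """Model of @array[(first)th..(last)th] with an inclusive range."""
    # range() excludes its end point, so add 1 to keep both ends inclusive.
    return [items[i] for i in range(th(first), th(last) + 1)]


elems = ["a", "b", "c", "d", "e"]
print(ordinal_slice(elems, 1, 3))  # the 1st through 3rd elements
print(ordinal_slice(elems, 1, 0))  # $n = 0 gives 0..-1, an empty range
```

One consequence worth noting: under this model `th(0)` is -1, which in Python (as in Perl 5) indexes the last element, so the meaning of a "0th" or negative ordinal would still need to be decided separately.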
Re: Nested captures
Larry decreed: Let's go 0-based and make $0 =:= $/[0] so that $/[] is all the parens. Huzzah! Our old $0 (P5's $&) could be $<> instead, short for $ or some such. According to the new capture semantics document posted earlier this week: A successful match returns a C object whose ... string value is the complete substring that was matched by the entire rule... So the old $0 is just ~$/ Damian
Re: single element lists
On Thu, May 12, 2005 at 05:19:11AM +0800, Autrijus Tang wrote: : Sure (and done). Now that #1 is eliminated, the question is now : whether a simple scalar can be treated as a small (one-element) array : reference, much like a simple pair can be treated as a small : (one-element) hash reference. : : 1.[0]; # evaluates to 1? : : If yes, then (1)[0] means the same as 1.[0] and 1.[0][0][0]. If no, : (1)[0] is a runtime error just like 1.[0] -- i.e. unable to find the : matching .[] multisub under Int or its superclasses. Maybe we should just let someone poke a Subscriptable role into some class or other to determine the behavior if they care. Larry
Re: (1,(2,3),4)[2]
All~ On 5/11/05, Luke Palmer <[EMAIL PROTECTED]> wrote: > On 5/11/05, Autrijus Tang <[EMAIL PROTECTED]> wrote: > > In a somewhat related topic: > > > > pugs> (1,(2,3),4)[2] > > 4 > > > > Because the invocant to .[] assumes a Singular context. > > Right, but the *inside* of the invocant is still a list, so it's in > list context. I think that line should return 3. I am confused as to why exactly this is the case. Are you saying that nested lists like this flatten? That would certainly catch me off guard. Would you mind explaining that to me a little more? Thanks, Matt -- "Computer Science is merely the post-Turing Decline of Formal Systems Theory." -???
Re: single element lists
On Wed, May 11, 2005 at 02:12:41PM -0700, Larry Wall wrote: > On Thu, May 12, 2005 at 04:19:02AM +0800, Autrijus Tang wrote: > : Hm? Under #2, no matter whether @foo is (1) or (1,2), the construct > : (@foo)[0] would always mean @foo.[0]. Not sure how the length of @foo > : matters here. > > Tell you what, let's require P5's (...)[] to be translated to [...][], > so (...)[] should assume scalar context that will return some kind of > array reference. (What Luke said about (1,(2,3),4)[] still holds, though. > Commas create lists, and lists by default impose list context, and > parens are only for grouping in lists, not scalarifying.) Sure (and done). Now that #1 is eliminated, the question is now whether a simple scalar can be treated as a small (one-element) array reference, much like a simple pair can be treated as a small (one-element) hash reference. 1.[0]; # evaluates to 1? If yes, then (1)[0] means the same as 1.[0] and 1.[0][0][0]. If no, (1)[0] is a runtime error just like 1.[0] -- i.e. unable to find the matching .[] multisub under Int or its superclasses. Thanks, /Autrijus/
Re: single element lists
On Thu, May 12, 2005 at 04:19:02AM +0800, Autrijus Tang wrote: : Hm? Under #2, no matter whether @foo is (1) or (1,2), the construct : (@foo)[0] would always mean @foo.[0]. Not sure how the length of @foo : matters here. Tell you what, let's require P5's (...)[] to be translated to [...][], so (...)[] should assume scalar context that will return some kind of array reference. (What Luke said about (1,(2,3),4)[] still holds, though. Commas create lists, and lists by default impose list context, and parens are only for grouping in lists, not scalarifying.) Larry
Re: (1,(2,3),4)[2]
On Wed, May 11, 2005 at 03:00:15PM -0600, Luke Palmer wrote: > On 5/11/05, Autrijus Tang <[EMAIL PROTECTED]> wrote: > > In a somewhat related topic: > > > > pugs> (1,(2,3),4)[2] > > 4 > > > > Because the invocant to .[] assumes a Singular context. > > Right, but the *inside* of the invocant is still a list, so it's in > list context. I think that line should return 3. You're totally right, and I was clearly wahnsinnig. It now returns 3. Thanks, /Autrijus/
Re: (1,(2,3),4)[2]
On 5/11/05, Autrijus Tang <[EMAIL PROTECTED]> wrote: > In a somewhat related topic: > > pugs> (1,(2,3),4)[2] > 4 > > Because the invocant to .[] assumes a Singular context. Right, but the *inside* of the invocant is still a list, so it's in list context. I think that line should return 3. Luke
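The difference between the two answers (4 versus 3) comes down to whether the inner parentheses survive as a single element or are flattened by list context. A rough Python model, offered as a sketch: tuples stand in for parenthesized lists, Python lists stand in for [] array references, and `flatten` is a hypothetical helper rather than anything from Pugs.

```python
def flatten(items):
    """Model list context: nested tuples (parens) are flattened recursively,
    while lists (standing in for [] array references) stay as single elements."""
    flat = []
    for item in items:
        if isinstance(item, tuple):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


nested = (1, (2, 3), 4)

# Reading 1: the inner parens are kept as one element, so index 2 is 4.
print(nested[2])

# Reading 2 (Luke's): the list is flattened first, so index 2 is 3.
print(flatten(nested)[2])

# Array references do not flatten, so (1,2,[3,4]) keeps three elements.
print(len(flatten((1, 2, [3, 4]))))
```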
(1,(2,3),4)[2]
In a somewhat related topic: pugs> (1,(2,3),4)[2] 4 Because the invocant to .[] assumes a Singular context. I'm not sure how any invocant can assume a Plural context anyway, so this behaviour seems correct. Is it, though? :) Thanks, /Autrijus/
volunteer wanted for xml grammar thingy
Three years ago I wrote a simple Perl 5 script to convert the EBNF specification of XML to Perl 6's rules. Pugs supports rules now, so perhaps it can be tested. This is a complex job (because it's a complex grammar, and of course it can never work without much tweaking, and debugging grammars is probably hard), that I won't be able to handle. So now I'm looking for someone who wants to do this. It doesn't have to be done, but I think it'd be really nice to show the world we can "parse" XML already. It's by no means a useful parsing, but I do think it's a good stress test of PGE. Everything can be found at http://perlmonks.org/index.pl?node_id=179755 Who wants to give this a serious try? Juerd -- http://convolution.nl/maak_juerd_blij.html http://convolution.nl/make_juerd_happy.html http://convolution.nl/gajigu_juerd_n.html
Re: single element lists
On Wed, May 11, 2005 at 01:11:45PM -0700, Larry Wall wrote: > On Wed, May 11, 2005 at 11:45:12AM -0500, Jonathan Scott Duff wrote: > : > : We're discussing the proper semantics of (1)[0] on #perl6. Here's > : where we're at so far: > : > : 1. specialise ()[] to parse as (,)[] > : 2. scalars are singleton lists, so ()[] naturally > : 3. make (1)[0] die horribly. > : > I think it has to be #1. We can't have (@foo)[0] working at two different > indirection levels depending on whether @foo == 1. Hm? Under #2, no matter whether @foo is (1) or (1,2), the construct (@foo)[0] would always mean @foo.[0]. Not sure how the length of @foo matters here. Thanks, /Autrijus/
Re: single element lists
On Wed, May 11, 2005 at 11:45:12AM -0500, Jonathan Scott Duff wrote: : : We're discussing the proper semantics of (1)[0] on #perl6. Here's : where we're at so far: : : 1. specialise ()[] to parse as (,)[] : 2. scalars are singleton lists, so ()[] naturally : 3. make (1)[0] die horribly. : : We all seem to agree that #3 is least useful and probably wrong. But : there's a divide between whether #1 or #2 is the "right" behavior. : : #2 implies that (1)[0][0][0][0] == 1 : #1 means that (1)[0] == 1 and (1)[0][0] is an error : : FWIW, I'm in favor of #1 : : What does p6l think? (What does @Larry think?) I think it has to be #1. We can't have (@foo)[0] working at two different indirection levels depending on whether @foo == 1. Larry
Re: Nested captures
On Wed, May 11, 2005 at 06:35:36PM +0200, Juerd wrote: : Larry Wall skribis 2005-05-11 8:30 (-0700): : > It's already the case that p5-to-p6 is going to have a *wonderful* : > time translating $7 to $1[2][0]... : : If I remember correctly, ** recursively flattens, and so (**$/)[7-1] : should work. It doesn't. It just does one level of flattening, but it does it *now*. : And otherwise a simple method can probably do the trick. I suggest : $/.platalseendubbeltje.[7-1], but probably only Dutch people can : appreciate that. That works with some translated constructs but not others. I'm wondering if it's just a cardinal/ordinal thing, and we can just translate $7 to $<7th>. Then we don't have to guess where to insert a .flat or :flat. Larry
Re: Nested captures
> On Wed, May 11, 2005 at 05:48:59PM +1000, Damian Conway wrote: > : But that's only the opinion of one(@Larry), not of $Larry. > > Let's go 0-based and make $0 =:= $/[0] so that $/[] is all the parens. > Our old $0 (P5's $&) could be $<> instead, short for $ or some > such. Why can't bare $/ just stringify to the whole match? > It's already the case that p5-to-p6 is going to have a *wonderful* > time translating $7 to $1[2][0]... Not a real problem. Patrick has already said that his plan is that :p5 REs will return a match object with an already flattened match list using perl5 left paren counting semantics. > I wonder how much call there will be for a rule option that uses P6 > syntax but P5 paren binding with "push" semantics. Just add a :flat -- Mark Biggar [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED]
Re: single element lists
On Wed, 2005-05-11 at 12:45, Jonathan Scott Duff wrote: > We're discussing the proper semantics of (1)[0] on #perl6. Here's > where we're at so far: > > 1. specialise ()[] to parse as (,)[] > 2. scalars are singleton lists, so ()[] naturally > 3. make (1)[0] die horribly. It may or may not help, but I direct your attention to: http://groups-beta.google.com/group/perl.perl6.language/browse_frm/thread/24ef8f421548b806/f119fc38427f9f3b?q=comma+one+element&rnum=2#f119fc38427f9f3b -- Aaron Sherman <[EMAIL PROTECTED]> Senior Systems Engineer and Toolsmith "It's the sound of a satellite saying, 'get me down!'" -Shriekback
Re: single element lists
My perspective from PDL is that "(1)[0][0][0]"..."[0]" should evaluate to 1. The artificial distinction between a scalar and an array of length 1 (in each dimension) is the source of endless hassles, and it's a pretty simple DWIM to allow indexing of element 0 of any unused dimension. That makes it much, much easier to write code that generalizes, because you don't have to check for the scalar case -- you can always assume that any given dimensional axis will always have one value along it (*), and loop/thread/whatever along that dimension. (*) Of course, that's assuming that there's at least one value in the list as a whole -- PDL does, and perl 6 should, support zero-element lists; any looping or threading construct on a zero-element list is a no-op. Jonathan Scott Duff wrote: We're discussing the proper semantics of (1)[0] on #perl6. Here's where we're at so far: 1. specialise ()[] to parse as (,)[] 2. scalars are singleton lists, so ()[] naturally 3. make (1)[0] die horribly. We all seem to agree that #3 is least useful and probably wrong. But there's a divide between whether #1 or #2 is the "right" behavior. #2 implies that (1)[0][0][0][0] == 1 #1 means that (1)[0] == 1 and (1)[0][0] is an error FWIW, I'm in favor of #1 What does p6l think? (What does @Larry think?) -Scott
Re: single element lists
On 5/11/05, Juerd <[EMAIL PROTECTED]> wrote: > Jonathan Scott Duff skribis 2005-05-11 11:45 (-0500): > > 1. specialise ()[] to parse as (,)[] > > 2. scalars are singleton lists, so ()[] naturally > > 3. make (1)[0] die horribly. > > #2 implies that (1)[0][0][0][0] == 1 > > #1 means that (1)[0] == 1 and (1)[0][0] is an error > > #1 also means that ($aref)[0] is $aref, rather than $aref[0]. > > I pick #2, also because I think being able to pass scalars where arrays > are expected simplifies other parts of the language as well. #2 is also more intuitive and comfortable. () in P5 constructs a list, which is why (split ' ', $foo)[0] works so nicely.
Re: Nested captures
On Wed, May 11, 2005 at 12:01:35PM -0500, Patrick R. Michaud wrote: > Of course, this now begs the question -- where are things stored > after doing ... ? > > rx :perl5 / (don't) (ray) (me) (for solar) / > > My guess is that within the rule they're $1, $2, $3, etc. as before, Within the rule, $1 and $2 should still refer to the previous match, as in P5 rules they are spelled as \1 and \2. Of course, with P5 rules, they would not be written as \0 and \1 simply because they are running in Perl 6. However, the subst part of s/// may look weird: pugs> $_ = "foo foo"; s:P5/(\w+) \1/--$0--/; $_ '--foo--' But it's probably unavoidable. Thanks, /Autrijus/
Re: Nested captures
On Wed, May 11, 2005 at 06:37:53PM +0200, Juerd wrote: > Larry Wall skribis 2005-05-11 8:30 (-0700): > > It's already the case that p5-to-p6 is going to have a *wonderful* > > time translating $7 to $1[2][0]... > > Or maybe it just has to change "(" to "$1 := (", the second "(" to "$2 > := (", etc. More likely "$1:=[", "$2:=[", etc., to avoid the nested capture contexts. Slightly trickier is going to be handling of quantified captures, since $1 has to somehow be translated into $0 for unquantified captures and $0[-1] for quantified ones. Or we have a method in the match object that can do that for us. And in some cases it might be better to just stick :perl5 on the rule and not translate it. :-) Of course, this now begs the question -- where are things stored after doing ... ? rx :perl5 / (don't) (ray) (me) (for solar) / My guess is that within the rule they're $1, $2, $3, etc. as before, but in the match object they're $/[0], $/[1], $/[2], so that we can still properly do: ($c, $d, $e, $fg) = rx :perl5 / (don't) (ray) (me) (for solar) /; Or perhaps $1, $2, $3, etc become "smart aliases" into the match object, that somehow know what they're supposed to reference based on the rule that produced it. I.e., they're $/[1], $/[2], $/[3] for perl 6 rules and $/[0], $/[1], $/[2] for :perl5 rules. Or perhaps that just leads to total madness... Pm
Re: single element lists
Jonathan Scott Duff skribis 2005-05-11 11:45 (-0500): > 1. specialise ()[] to parse as (,)[] > 2. scalars are singleton lists, so ()[] naturally > 3. make (1)[0] die horribly. > #2 implies that (1)[0][0][0][0] == 1 > #1 means that (1)[0] == 1 and (1)[0][0] is an error #1 also means that ($aref)[0] is $aref, rather than $aref[0]. I pick #2, also because I think being able to pass scalars where arrays are expected simplifies other parts of the language as well. Juerd -- http://convolution.nl/maak_juerd_blij.html http://convolution.nl/make_juerd_happy.html http://convolution.nl/gajigu_juerd_n.html
Re: single element lists
Jonathan Scott Duff wrote: What does p6l think? (What does @Larry think?) I favor #3 as syntax error. But note $TSa == all( none(@Larry), one($p6l) ) or so :) -- TSa (Thomas Sandlaß)
single element lists
We're discussing the proper semantics of (1)[0] on #perl6. Here's where we're at so far: 1. specialise ()[] to parse as (,)[] 2. scalars are singleton lists, so ()[] naturally 3. make (1)[0] die horribly. We all seem to agree that #3 is least useful and probably wrong. But there's a divide between whether #1 or #2 is the "right" behavior. #2 implies that (1)[0][0][0][0] == 1 #1 means that (1)[0] == 1 and (1)[0][0] is an error FWIW, I'm in favor of #1 What does p6l think? (What does @Larry think?) -Scott -- Jonathan Scott Duff [EMAIL PROTECTED]
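The two live options can be compared side by side with a small Python model. This is only a sketch of the semantics as described in the message above: `index_option1` and `index_option2` are hypothetical names, Python lists stand in for array references, and `TypeError` stands in for whatever error Perl 6 would raise.

```python
def index_option1(value, *indices):
    """Option #1: (x)[...] parses as (x,)[...], so only the first subscript
    gets the implicit one-element list."""
    current = (value,)
    for i in indices:
        if not isinstance(current, (list, tuple)):
            raise TypeError("cannot subscript a scalar")
        current = current[i]
    return current


def index_option2(value, *indices):
    """Option #2: any scalar behaves as a singleton list when subscripted."""
    current = value
    for i in indices:
        if not isinstance(current, (list, tuple)):
            current = (current,)
        current = current[i]
    return current


print(index_option1(1, 0))           # (1)[0] is 1 under #1
print(index_option2(1, 0, 0, 0, 0))  # (1)[0][0][0][0] is 1 under #2
```

The model also shows the array-reference consequence: with `aref = [7, 8]`, `index_option1(aref, 0)` returns the whole reference, while `index_option2(aref, 0)` returns its first element.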
Re: Nested captures
Larry Wall skribis 2005-05-11 8:30 (-0700): > It's already the case that p5-to-p6 is going to have a *wonderful* > time translating $7 to $1[2][0]... Or maybe it just has to change "(" to "$1 := (", the second "(" to "$2 := (", etc. Juerd -- http://convolution.nl/maak_juerd_blij.html http://convolution.nl/make_juerd_happy.html http://convolution.nl/gajigu_juerd_n.html
Re: Nested captures
Larry Wall skribis 2005-05-11 8:30 (-0700): > It's already the case that p5-to-p6 is going to have a *wonderful* > time translating $7 to $1[2][0]... If I remember correctly, ** recursively flattens, and so (**$/)[7-1] should work. And otherwise a simple method can probably do the trick. I suggest $/.platalseendubbeltje.[7-1], but probably only Dutch people can appreciate that. Juerd -- http://convolution.nl/maak_juerd_blij.html http://convolution.nl/make_juerd_happy.html http://convolution.nl/gajigu_juerd_n.html
Re: Nested captures
On Thu, May 12, 2005 at 12:06:57AM +0800, Autrijus Tang wrote: > On Wed, May 11, 2005 at 08:30:42AM -0700, Larry Wall wrote: > > On Wed, May 11, 2005 at 05:48:59PM +1000, Damian Conway wrote: > > : But that's only the opinion of one(@Larry), not of $Larry. > > > > Let's go 0-based and make $0 =:= $/[0] so that $/[] is all the parens. > > Our old $0 (P5's $&) could be $<> instead, short for $ or some > > such. > > Both 0-based $0 etc and $<> are now implemented in Pugs. $0 is also in PGE, $<> will wait for a more definitive statement. > > I wonder how much call there will be for a rule option that uses P6 > > syntax but P5 paren binding with "push" semantics. > > Should it be an rule option, or simply an alternate way to address > the content in $/? Something like $/.flattened_matches[10], perhaps? Could be, although I'd prefer $/.lparens or something like that, to make it clear we're counting left parens. "Flattened_matches" sounds to me like we're flattening all of the captures into a single list. But then again, "lparens" looks odd when presented with something like: / $7:=[\s+] / since there really aren't any lparens there. Maybe $/.perl5[10] . :-) To make things work this way, PGE will have to tag match objects along both the nested and non-nested lexical scopes of the rule pattern. Not too big an issue--just more bookkeeping to keep track of. OTOH, since :perl5 is already flattening its paren captures, there's precedent for doing this sort of thing as a rule option. It may make more sense to request "flatness" within the rule as opposed to (in addition to?) the returned match object. Pm
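Counting left parens amounts to a pre-order walk over the nested capture tree. The Python sketch below models that bookkeeping under stated assumptions: the `Capture` class and the `lparens` function are hypothetical (the function name is borrowed from the suggestion above), and the capture tree is built by hand rather than produced by a real rule engine. Python's own `re` module, which uses Perl 5 style flat numbering, serves as the comparison.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Capture:
    """Hypothetical nested capture: matched text plus its sub-captures."""
    text: str
    subs: list = field(default_factory=list)


def lparens(captures):
    """Flatten nested captures in left-paren order (a pre-order walk)."""
    flat = []
    for capture in captures:
        flat.append(capture)
        flat.extend(lparens(capture.subs))
    return flat


# Hand-built nested view of matching ((a)(b))(c) against "abc":
# the first capture holds two sub-captures, the second holds none.
nested = [Capture("ab", [Capture("a"), Capture("b")]), Capture("c")]

flat_texts = [capture.text for capture in lparens(nested)]
p5_groups = list(re.match(r"((a)(b))(c)", "abc").groups())
print(flat_texts == p5_groups)
```

In this example the nested address `nested[0].subs[1]` and the flat address `lparens(nested)[2]` name the same capture, which is the kind of translation p5-to-p6 would have to perform.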
Re: Nested captures
Hi, On Wed, May 11, 2005 at 08:30:42AM -0700, Larry Wall wrote: On Wed, May 11, 2005 at 05:48:59PM +1000, Damian Conway wrote: : But that's only the opinion of one(@Larry), not of $Larry. Let's go 0-based and make $0 =:= $/[0] so that $/[] is all the parens. Our old $0 (P5's $&) could be $<> instead, short for $ or some such. Both 0-based $0 etc and $<> are now implemented in Pugs. Does anybody check if he actually implements what he says? :))) It just doesn't seem possible... :) Autrijus, I assume you started Pugs by creating Time::Machine first... BTW, anybody interested in creating a FastCGI interface for Pugs? - Fagzal
Re: Nested captures
On Wed, May 11, 2005 at 08:30:42AM -0700, Larry Wall wrote: > On Wed, May 11, 2005 at 05:48:59PM +1000, Damian Conway wrote: > : But that's only the opinion of one(@Larry), not of $Larry. > > Let's go 0-based and make $0 =:= $/[0] so that $/[] is all the parens. > Our old $0 (P5's $&) could be $<> instead, short for $ or some > such. Both 0-based $0 etc and $<> are now implemented in Pugs. > It's already the case that p5-to-p6 is going to have a *wonderful* > time translating $7 to $1[2][0]... > > I wonder how much call there will be for a rule option that uses P6 > syntax but P5 paren binding with "push" semantics. Should it be a rule option, or simply an alternate way to address the content in $/? Something like $/.flattened_matches[10], perhaps? Thanks, /Autrijus/
Re: Nested captures
On Wed, May 11, 2005 at 05:48:59PM +1000, Damian Conway wrote: : But that's only the opinion of one(@Larry), not of $Larry. Let's go 0-based and make $0 =:= $/[0] so that $/[] is all the parens. Our old $0 (P5's $&) could be $<> instead, short for $ or some such. It's already the case that p5-to-p6 is going to have a *wonderful* time translating $7 to $1[2][0]... I wonder how much call there will be for a rule option that uses P6 syntax but P5 paren binding with "push" semantics. Larry
Re: Nested captures
> "DC" == Damian Conway <[EMAIL PROTECTED]> writes: DC> Uri Guttman wrote: DC> Sure. Just as $42 is a shorthand for $/[42], so too $ is a DC> shorthand for $/. >> but then what about the different index bases for $42 and $/[42]? i >> don't think that has been resolved (nor has mixing the $1.1 and $1[1] >> syntaxes). DC> Bear in mind that that reply was posted in haste, late at night, after DC> a long day of teaching. We're lucky it was only off by one! %-) DC> But it does raise an important point: the discrepancy between $42 and DC> $/[41] *is* a great opportunity for off-by-one errors. Previously, DC> however, @Larry have tossed back and forth the possibility of using $0 DC> as the first capture variable so that the indices of $/[0], $/[1], DC> $/[2] match up with the "names" of $0, $1, $2, etc. DC> I think this error--unintentional, I swear!--argues strongly that DC> internal consistency within Perl 6 is more important than historical DC> consistency with Perl 5's $1, $2, $3... i would like them to be consistent too. you could also make $/[1] be the same as $1 and not use $/[0] for a regular grab. then $0 and $/[0] could be used for something special. but just 0 basing them both is fine with me. the key is to align them. we all seem to agree this is a massive off by 1 error waiting to happen. we still haven't seen what @larry has to say about mixing $1[$j] and $1.1 syntaxes (let's assume they both use the same index base). uri -- Uri Guttman -- [EMAIL PROTECTED] http://www.stemsystems.com --Perl Consulting, Stem Development, Systems Architecture, Design and Coding- Search or Offer Perl Jobs http://jobs.perl.org
Re: Nested captures
On Wed, May 11, 2005 at 05:48:59PM +1000, Damian Conway wrote: > Uri Guttman wrote: > > > DC> Sure. Just as $42 is a shorthand for $/[42], so too $ is a > > DC> shorthand for $/. > > > >but then what about the different index bases for $42 and $/[42]? i > >don't think that has been resolved (nor has mixing the $1.1 and $1[1] > >syntaxes). > > Bear in mind that that reply was posted in haste, late at night, after a > long day of teaching. We're lucky it was only off by one! %-) > > But it does raise an important point: the discrepancy between $42 and > $/[41] *is* a great opportunity for off-by-one errors. Indeed. > Previously, however, > @Larry have tossed back and forth the possibility of using $0 as the first > capture variable so that the indices of $/[0], $/[1], $/[2] match up with > the "names" of $0, $1, $2, etc. > > I think this error--unintentional, I swear!--argues strongly that internal > consistency within Perl 6 is more important than historical consistency > with Perl 5's $1, $2, $3... > > But that's only the opinion of one(@Larry), not of $Larry. My opinion too. The $ vars should be zero-based just as the array indices are. (or, if we can come up with a plausible meaning for the zeroth index of our match arrays, keep starting at $1) -Scott -- Jonathan Scott Duff [EMAIL PROTECTED]
Re: Of fail, exceptions and catching
On Wed, 2005-05-11 at 09:50, Luke Palmer wrote: > Oh, just to avoid further confusion: In the baz() called under fatal, > it will only turn undefs that were generated by "fail" calls into > exceptions. Other sorts of undefs will be returned as ordinary > undefs. Ok, so let me try to get my head around this: fail is something like: return undef but Exception(...some state info...); the only question is whether the caller reacts to that special return value like so: if $return ~~ Exception { return $return; } or simply ignores it. If you ignore the special return value, then you presumably have the burden of coping with it in some other way, like so: no fatal; $socket.bind(:interface, :port<80>) or $socket.bind([EMAIL PROTECTED]) or die "Cannot bind: $!"; At the top-level (runtime), you would expect to have something like (arm-waving some naming specifics): given $program.(@args) { when Exception { $*ERR.print $_.err; exit 1 } default { exit +$_ } } Am I getting it now? -- Aaron Sherman <[EMAIL PROTECTED]> Senior Systems Engineer and Toolsmith "It's the sound of a satellite saying, 'get me down!'" -Shriekback
Re: Nested captures
> > But it does raise an important point: the discrepancy between $42 and
> > $/[41] *is* a great opportunity for off-by-one errors. Previously,
> > however, @Larry have tossed back and forth the possibility of using $0
> > as the first capture variable so that the indices of $/[0], $/[1],
> > $/[2] match up with the "names" of $0, $1, $2, etc.
> >
> > I think this error--unintentional, I swear!--argues strongly that
> > internal consistency within Perl 6 is more important than historical
> > consistency with Perl 5's $1, $2, $3...

FWIW, I think that all the /^\$\d+$/ variables should be related to each
other, too.

Now - here's a question. Can I always address $42 in the same way that I
could address $2 in P5 at any time? Or will they only come into scope
whenever there was a match higher up? I personally like them only being
in scope within the scope of a match, especially under any strictures.

Second - is it possible or desirable for @/ to be assignable? I can
think of some nice uses for that, primarily in testing ...

Rob
Re: Of fail, exceptions and catching
On 5/11/05, Luke Palmer <[EMAIL PROTECTED]> wrote:
>     sub foo() {
>         fail;
>     }
>
>     use fatal;
>     sub bar() {
>         foo();   # foo() throws exception
>     }
>
>     no fatal;
>     sub baz() {
>         foo();   # foo() returns undef
>     }
>
>     use fatal;
>     bar();   # propagates exception from foo()
>     baz();   # turns baz()'s (from foo()'s) undef into an exception
>
>     no fatal;
>     bar();   # turns the exception thrown from foo() into an undef
>     baz();   # returns the undef that it got from foo()

Oh, just to avoid further confusion: In the baz() called under fatal,
it will only turn undefs that were generated by "fail" calls into
exceptions. Other sorts of undefs will be returned as ordinary
undefs.

Likewise, in the bar() called under no fatal, it will only turn
exceptions that were generated by "fail" calls into undefs. Other
sorts of exceptions stay as exceptions and propagate outward.

Luke
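The two rules in the message above (the caller's fatal setting decides, and only fail-generated values or exceptions are converted) can be checked against a small model. This is a Python sketch under stated assumptions, not Perl 6: call(), Failure, FailureError, plain_undef() and other_error() are hypothetical names, and the fatal argument stands in for whichever pragma is lexically in force at the call site.

```python
class Failure:
    """Hypothetical marker for a value produced by `fail`."""

    def __init__(self, message):
        self.message = message


class FailureError(Exception):
    """Hypothetical exception form of a Failure; keeps the original value."""

    def __init__(self, failure):
        super().__init__(failure.message)
        self.failure = failure


def call(fn, fatal):
    # `fatal` models the pragma in force where the call is written.
    try:
        result = fn()
    except FailureError as err:
        if fatal:
            raise                # "use fatal" caller: let it propagate
        return err.failure       # "no fatal" caller: back to an unthrown value
    if fatal and isinstance(result, Failure):
        raise FailureError(result)   # only fail-generated values are thrown
    return result                # ordinary values, including None, pass through


def foo():
    return Failure("foo failed")     # models `fail`


def bar():
    return call(foo, fatal=True)     # bar() was written under "use fatal"


def baz():
    return call(foo, fatal=False)    # baz() was written under "no fatal"


def plain_undef():
    return None                      # an undef that did not come from `fail`


def other_error():
    raise ValueError("not from fail")
```

Under this model call(bar, fatal=True) and call(baz, fatal=True) both raise FailureError, call(bar, fatal=False) and call(baz, fatal=False) both return a Failure, call(plain_undef, fatal=True) returns None untouched, and call(other_error, fatal=False) still raises ValueError.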
Re: Of fail, exceptions and catching
On 5/11/05, Aaron Sherman <[EMAIL PROTECTED]> wrote:
> Given:
>
>     "fail" with configurable behavior
>     "no fatal" to make "fail" just warn

Juerd is right here, it doesn't warn. Instead of "die"ing, it returns
an undef with some helpful diagnostic information (an "unthrown
exception" as Larry has been calling it).

>     "use fatal" to make "fail" throw exceptions
>
> A question came up on #perl6 for the following code:
>
>     no fatal;
>     class Foo {
>         use fatal;
>         method bar() { fail; }
>     }
>     Foo.bar;
>
> That is, bar() picks up a lexically scoped "use fatal"

No it doesn't. The fatal that refers to bar's return value belongs to
the caller of bar. Here's another example:

    sub foo() {
        fail;
    }

    use fatal;
    sub bar() {
        foo();   # foo() throws exception
    }

    no fatal;
    sub baz() {
        foo();   # foo() returns undef
    }

    use fatal;
    bar();   # propagates exception from foo()
    baz();   # turns baz()'s (from foo()'s) undef into an exception

    no fatal;
    bar();   # turns the exception thrown from foo() into an undef
    baz();   # returns the undef that it got from foo()

Does that clarify things? (I could tell there was some
misunderstanding going on, but I had a hard time explaining it.
Hopefully this example will clear things up.)

Luke
Re: Of fail, exceptions and catching
Aaron Sherman wrote 2005-05-11 7:44 (-0400):
> "no fatal" to make "fail" just warn

I thought it wouldn't warn, but instead silently return undef (an
unthrown exception).

Juerd
--
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html
http://convolution.nl/gajigu_juerd_n.html
Of fail, exceptions and catching
Given:

    "fail" with configurable behavior
    "no fatal" to make "fail" just warn
    "use fatal" to make "fail" throw exceptions

A question came up on #perl6 for the following code:

    no fatal;
    class Foo {
        use fatal;
        method bar() { fail; }
    }
    Foo.bar;

That is, bar() picks up a lexically scoped "use fatal", but the caller
of bar desires non-fatal behavior.

Possible results and reasons for them:

* Warning -- For this to work, I think you would need an implicit
  CATCH block wrapper around each statement in a "no fatal" section.
  This could be simulated by wrapping the entire no fatal section in a
  continuation closure which is then called inside a CATCH block which
  warns and re-invokes the continuation... if continuations can be
  re-invoked after exceptions. That's an interesting question on its
  own actually.

* Fatal exception -- This implies that the callee's lexically scoped
  fatal preference wins, which might be a reasonable thing in concept,
  but seems to present no control to the user of a module when the
  module author has specified a preference. Perhaps that's a good
  thing. Perhaps it's bad. I do see a problem with a program that uses
  several modules, all of which have different opinions about
  fatality...

* Warning -- Another reason that you might get a warning would be that
  there is some sort of dynamically scoped imposition of a non-fatal
  context from "no fatal". That seems the least likely to me, but I
  thought that I'd bring it up.

This came up when someone was considering writing a module, and was
saying that he really wanted "use fatal" as the default for his
modules. That seems like a reasonable thing to want, but I'm not sure
how it could be controlled correctly.

--
Aaron Sherman <[EMAIL PROTECTED]>
Senior Systems Engineer and Toolsmith
"It's the sound of a satellite saying, 'get me down!'" -Shriekback
Re: Nested captures
On Wed, 2005-05-11 at 17:48 +1000, Damian Conway wrote:
> But it does raise an important point: the discrepancy between $42 and
> $/[41] *is* a great opportunity for off-by-one errors. Previously,
> however, @Larry have tossed back and forth the possibility of using $0
> as the first capture variable so that the indices of $/[0], $/[1],
> $/[2] match up with the "names" of $0, $1, $2, etc.
>
> I think this error--unintentional, I swear!--argues strongly that
> internal consistency within Perl 6 is more important than historical
> consistency with Perl 5's $1, $2, $3...

I've run across many people who have been quite confused by the fact
that Perl 5's $n submatch holders aren't 0-indexed, so I don't think
this would be a bad change at all. Certainly it should go high in the
"what changed" list as a warning, but that's not a big deal.
Re: Nested captures
Damian Conway wrote:
> I think this error--unintentional, I swear!--argues strongly that
> internal consistency within Perl 6 is more important than historical
> consistency with Perl 5's $1, $2, $3...
>
> But that's only the opinion of one(@Larry), not of $Larry.

My opinion as none(@Larry), too. And correct me if I'm wrong, but
Perl6 is a major version change, isn't it?

OTOH I don't have any legacy code to support :)
--
TSa (Thomas Sandlaß)
Re: Nested captures
Uri Guttman wrote:
>   DC> Sure. Just as $42 is a shorthand for $/[42], so too $ is a
>   DC> shorthand for $/.
>
> but then what about the different index bases for $42 and $/[42]? i
> don't think that has been resolved (nor has mixing the $1.1 and $1[1]
> syntaxes).

Bear in mind that that reply was posted in haste, late at night, after
a long day of teaching. We're lucky it was only off by one! %-)

But it does raise an important point: the discrepancy between $42 and
$/[41] *is* a great opportunity for off-by-one errors. Previously,
however, @Larry have tossed back and forth the possibility of using $0
as the first capture variable so that the indices of $/[0], $/[1],
$/[2] match up with the "names" of $0, $1, $2, etc.

I think this error--unintentional, I swear!--argues strongly that
internal consistency within Perl 6 is more important than historical
consistency with Perl 5's $1, $2, $3...

But that's only the opinion of one(@Larry), not of $Larry.

Damian
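The mismatch described above, 1-based capture names next to a 0-based array of captures, is not peculiar to Perl. Python's re module has the same split, so it serves as a runnable illustration of the off-by-one hazard. This is only an analogy: the pattern, the sample string and the variable names below are made up for the example, and nothing here is Perl 6.

```python
import re

# Three captures, so the numbered and the indexed views can be compared.
match = re.match(r"(\w+)-(\w+)-(\w+)", "alpha-beta-gamma")

# Numbered access is 1-based, like Perl 5's $1, $2, $3.
by_number = [match.group(n) for n in (1, 2, 3)]

# groups() returns a tuple, which is 0-based like any other sequence.
by_index = list(match.groups())

# The same capture therefore sits at number n and at index n - 1.
first_by_number = match.group(1)
first_by_index = match.groups()[0]

# Number 0 is taken by the whole match rather than by a capture.
whole = match.group(0)
```

Here match.group(1) and match.groups()[0] name the same capture, which is exactly the $42 versus $/[41] situation. Python gives number 0 a meaning of its own (the entire match), which is one way of keeping 1-based capture names.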