Re: Naming debate- what's the location for it?
Just Mu would be an amusing Perlish pun based on Muttsu, making the interpretation either Perl "six" or Perl "most undefined". I like yary's idea too. Frankly, if Perl had an identity, I would not care about the name. I feel like it lacks that right now. -- Aaron Sherman, M.: P: 617-440-4332 // E: a...@ajs.com Toolsmith, developer, gamer and life-long student. On Thu, Feb 8, 2018 at 3:50 PM, Brent Laabs wrote: > Thanks for the summary of the high points, as there were a large number of > low points in previous discussions. > > Roku is not the only reading for 六 in Japanese, the kun reading is > muttsu. So we could become Mupperl. What's the counter word for computer > languages, anyway? > > > > On Thu, Feb 8, 2018 at 12:15 PM, Aaron Sherman wrote: > >> I think this is a fine place, personally. Past discussions have included >> these high points as I recall them: >> >> >>1. Perl is definitely the family name >>2. Rakudo started out as the name of an implementation, but started >>to wander into being the name of the specific leaf in the family tree >>3. Problem is that that leaves us uncertain of the status of >>non-Rakudo-the-implementation implementations. Are they now Rakudo too? >>That's confusing at best. >> >> >> IMHO, 6 has always been the personal name, but it could be changed to >> something that's "sixish" without being an explicit number. Normally, I'd >> recommend Latin, but Perl Sex is probably not where anyone wants to go... >> Roku is Japanese, but also the name of a popular device, and thus >> confusing... >> >> -- >> Aaron Sherman, M.: >> P: 617-440-4332 // E: a...@ajs.com >> Toolsmith, developer, gamer and life-long student. >> >> On Thu, Feb 8, 2018 at 10:41 AM, yary wrote: >> >>> I recall coming across a post saying the Perl6 name is up for discussion >>> - searched & found this post now https://6lang.party/post/The-Hot-New-Language-Named-Rakudo describes it. 
Is there a forum where the >>> name's being discussed that I can read? >>> >>> Woke up this morning with a name proposal that seemed to have a lot >>> going for it, but from that post it seems Lizmat et al have a good choice >>> already & I don't want to add to bikeshedding... wondering what the >>> thinking is right now. >>> >>> -y >>> >> >> >
Re: Naming debate- what's the location for it?
I think this is a fine place, personally. Past discussions have included these high points as I recall them: 1. Perl is definitely the family name 2. Rakudo started out as the name of an implementation, but started to wander into being the name of the specific leaf in the family tree 3. Problem is that that leaves us uncertain of the status of non-Rakudo-the-implementation implementations. Are they now Rakudo too? That's confusing at best. IMHO, 6 has always been the personal name, but it could be changed to something that's "sixish" without being an explicit number. Normally, I'd recommend Latin, but Perl Sex is probably not where anyone wants to go... Roku is Japanese, but also the name of a popular device, and thus confusing... -- Aaron Sherman, M.: P: 617-440-4332 // E: a...@ajs.com Toolsmith, developer, gamer and life-long student. On Thu, Feb 8, 2018 at 10:41 AM, yary wrote: > I recall coming across a post saying the Perl6 name is up for discussion - > searched & found this post now https://6lang.party/post/The-Hot-New-Language-Named-Rakudo describes it. Is there a forum where the > name's being discussed that I can read? > > Woke up this morning with a name proposal that seemed to have a lot going > for it, but from that post it seems Lizmat et al have a good choice already > & I don't want to add to bikeshedding... wondering what the thinking is > right now. > > -y >
Re: CALL-ME vs. Callable
I guess I wasn't clear in what I was asking: What, exactly, was it that NQP was doing? What were the inputs and what was the behavior that you observed? So far, all I have to go on is one example that you feel is not illustrating the broken behavior of NQP that you want to work around with a change to the way Callable and calling work. I'm not suggesting that the latter is bad, but it seems to be a patch around a problem in the former... Aaron Sherman, M.: P: 617-440-4332 Google Talk, Email and Google Plus: a...@ajs.com Toolsmith, developer, gamer and life-long student. On Mon, Nov 14, 2016 at 4:32 PM, Brandon Allbery wrote: > > On Mon, Nov 14, 2016 at 4:28 PM, Aaron Sherman wrote: > >> So, you said that the problem arises because NQP does something >> non-obvious that results in this error. Can you be clear on what that >> non-obvious behavior is? It sounds to me like you're addressing a symptom >> of a systemic issue. > > > That's pretty much the definition of LTA. The programmer did something > that on some level involves a call (in the simple example it was explicit, > but there are some implicit ones in the language), and got a runtime error > referencing an internal name instead of something preferably compile time > related to what they wrote. The fix for this is to abstract it into a role > that describes "calling"/"invoking" instead of having a CALL-ME that the > user didn't (and probably shouldn't) define suddenly pop up out of nowhere. > That isn't the part that's difficult, aside from "so why wasn't it done > that way to begin with?". > > -- > brandon s allbery kf8nh sine nomine > associates > allber...@gmail.com > ballb...@sinenomine.net > unix, openafs, kerberos, infrastructure, xmonad > http://sinenomine.net >
Re: CALL-ME vs. Callable
So, you said that the problem arises because NQP does something non-obvious that results in this error. Can you be clear on what that non-obvious behavior is? It sounds to me like you're addressing a symptom of a systemic issue. Aaron Sherman, M.: P: 617-440-4332 Google Talk, Email and Google Plus: a...@ajs.com Toolsmith, developer, gamer and life-long student. On Mon, Nov 14, 2016 at 4:08 PM, Brandon Allbery wrote: > > On Mon, Nov 14, 2016 at 3:42 PM, Aaron Sherman wrote: > >> I do think, though that if the concern is really with "the 4 cases when >> nqp hauls a CALL-ME out of its bowels" then that's what should be >> addressed... >> > > The main addressing of that is some kind of role to abstract it properly. > I just think the current situation is bad and even if we come up with a > name for the new role, it's still going to be confusing ("ok, why do we > have both Callable and Invokable? ...uh wait, Callable means *what* > exactly?"). > > > -- > brandon s allbery kf8nh sine nomine > associates > allber...@gmail.com > ballb...@sinenomine.net > unix, openafs, kerberos, infrastructure, xmonad > http://sinenomine.net >
Re: CALL-ME vs. Callable
Fair points, all. I do think, though that if the concern is really with "the 4 cases when nqp hauls a CALL-ME out of its bowels" then that's what should be addressed... Aaron Sherman, M.: P: 617-440-4332 Google Talk, Email and Google Plus: a...@ajs.com Toolsmith, developer, gamer and life-long student. On Mon, Nov 14, 2016 at 3:22 PM, Brandon Allbery wrote: > Also... > > On Mon, Nov 14, 2016 at 3:06 PM, Aaron Sherman wrote: > >> Role-based testing seems very perl6ish. I'd suggest the role name be >> "Invocable" with much the sort of signature as you've described. > > > If it's Invokable then the method should probably be INVOKE. It still > leaves the question of why Callable appears to be the only role named > after what it applies to instead of what it provides. > > -- > brandon s allbery kf8nh sine nomine > associates > allber...@gmail.com > ballb...@sinenomine.net > unix, openafs, kerberos, infrastructure, xmonad > http://sinenomine.net >
Re: CALL-ME vs. Callable
Role-based testing seems very perl6ish. I'd suggest the role name be "Invocable" with much the same sort of signature as you've described. That being said, I don't think that the current error is terrible. It clearly shows that the issue is with the attempt to invoke a Bool. Aaron Sherman, M.: P: 617-440-4332 Google Talk, Email and Google Plus: a...@ajs.com Toolsmith, developer, gamer and life-long student. On Mon, Nov 14, 2016 at 2:32 PM, Brandon Allbery wrote: > This started out some weeks ago as a user in #perl6 confused by an error > that golfed down to: > > [28 19:01:37] m: my Bool $x = False; $x() > [28 19:01:38] rakudo-moar 0dc6f7: OUTPUT«No such method > 'CALL-ME' for invocant of type 'Bool' > > This is, at the very least, LTA. But it also got me thinking about the > spec. > > The obvious way to clean up the above, and some more complicated things of > the same sort, is to hide the obviously-internal CALL-ME behind a role, > which arguably should be part of the spec as well to let implementations > provide a stable interface while doing whatever is necessary for the > backend in its implementation. The obvious name for this role is Callable. > But, we already have a role Callable which is a mixin that adds > functionality to things that are callable, and is part of 6.c's spec. This > seems like a design flaw, and looks like it's going to be hard to fix, or > at least to make rational when other roles describe things that *are* > something instead of describing what they *modify*. > > Is there a way forward on this that makes any sense? > > (for the record, the proposed wrapper role looks something like > > role OughtToBeCallable { # bikeshed pending paint job... > method CALL(|c) {...}; # CALL-ME happens here > } > > and the 4 places that currently build a CALL-ME in nqp would have a type > OughtToBeCallable added so that the runtime CALL-ME error becomes a > compile-time failed OughtToBeCallable constraint with a > user-comprehensible error.) 
> > -- > brandon s allbery kf8nh sine nomine > associates > allber...@gmail.com > ballb...@sinenomine.net > unix, openafs, kerberos, infrastructure, xmonad > http://sinenomine.net >
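[Archive editor's note: for readers who haven't met CALL-ME, it is the method the runtime invokes when postfix parentheses are applied to an object. A minimal sketch from the user's side (the class name is mine, not from the thread):

```raku
# A class becomes callable by providing CALL-ME; the Callable role
# itself only marks the type for constraint checks.
class Adder does Callable {
    has $.n;
    method CALL-ME($x) { $x + $!n }
}

my $add-three = Adder.new(n => 3);
say $add-three(4);    # parens dispatch to CALL-ME: 7

# A Bool provides no CALL-ME, so `my Bool $x = False; $x()` fails at
# runtime with the "No such method 'CALL-ME'" error quoted above.
```

This is why the error leaks an internal-looking name: the dispatch target really is a method the user never wrote.]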
Re: Is this a bug?
Thank you. Silly me, thinking "this is so simple I don't need to run it through the command-line to test it." :-) Anyway, yeah: say $_ for reverse lines Aaron Sherman, M.: P: 617-440-4332 Google Talk, Email and Google Plus: a...@ajs.com Toolsmith, developer, gamer and life-long student. On Mon, Sep 19, 2016 at 10:10 AM, Timo Paulssen wrote: > On 19/09/16 16:02, Aaron Sherman wrote: > > I'm guessing that what you meant was "say as a function was what I > > meant to use there." In which case: say for reverse lines or for reverse lines { say } These are both valid ways of asking for each element of the iterable thing returned from lines to be printed with a newline. > Watch out, this needs to read say $_ otherwise you would get an error > message: > > Unsupported use of bare "say"; in Perl 6 please use .say if you meant $_, > or use an explicit invocant or argument, or use &say to refer to the > function as a noun > >
Re: Is this a bug?
I'm guessing that what you meant was "say as a function was what I meant to use there." In which case: say for reverse lines or for reverse lines { say } These are both valid ways of asking for each element of the iterable thing returned from lines to be printed with a newline. But remember that any {} around code creates a Block in Perl 6, and a Block is a first-class object. If you ask say to print a Block, it will quite happily try to do that. Aaron Sherman, M.: P: 617-440-4332 Google Talk, Email and Google Plus: a...@ajs.com Toolsmith, developer, gamer and life-long student. On Sun, Sep 18, 2016 at 4:49 PM, Parrot Raiser <1parr...@gmail.com> wrote: > say { $_ } was the correct thing to use there. (I'm trying to avoid > any mention of O-O for the moment.) > say {} was a "what happens if I do this" exercise. > > What is this -> ;; $_? is raw { #`(Block|170303864) … } output? > > On 9/18/16, Brent Laabs wrote: > > Remember you can call a block with parentheses: > > > >> say { 11 + 31 }; > > -> ;; $_? is raw { #`(Block|140268472711224) ... } > >> say { 11 + 31 }(); > > 42 > > > > > > On Sun, Sep 18, 2016 at 12:58 PM, Elizabeth Mattijsen > > wrote: > > > >> I think you want: > >> > >> .say for reverse lines; > >> > >> not sure what you are trying to achieve otherwise, but: > >> > >>say { } > >> > >> producing something like > >> > >>-> ;; $_? is raw { #`(Block|170303864) … } > >> > >> feels entirely correct to me. :-) > >> > >> > >> Liz > >> > >> > On 18 Sep 2016, at 21:52, Parrot Raiser <1parr...@gmail.com> wrote: > >> > > >> > This code: > >> > 1 #! /home/guru/bin/perl6 > >> > 2 > >> > 3 # Ask for some lines and output them in reverse > >> > 4 # Work out the appropriate EOF symbol for the OS > >> > 5 > >> > 6 my $EOF = "CTRL-" ~ ($*DISTRO.is-win ?? "Z" !! 
"D"); > >> > 7 > >> > 8 say "Please enter some lines and end them with $EOF"; > >> > 9 > >> > 10 say { for reverse lines() {} }; > >> > 11 > >> > 12 # End > >> > produces this: > >> > Please enter some lines and end them with CTRL-D# obviously from > >> line 8 > >> > -> ;; $_? is raw { #`(Block|170303864) ... }# but > >> this? > >> > >> > > >
Re: This seems to be wrong
"for @inputs.map( .prefix:<+> ) {...}" That's spelled: "for @inputs>>.Int -> $i { ... }" You can also use map, but it's slightly clunkier: "for @inputs.map: .Int -> $i { ... }" Aaron Sherman, M.: P: 617-440-4332 Google Talk, Email and Google Plus: a...@ajs.com Toolsmith, developer, gamer and life-long student. On Sun, Sep 18, 2016 at 6:06 PM, Trey Harris wrote: > Why does this: > > for @inputs.map( .prefix:<+> ) { ... } > > Not work? It results inMethod 'prefix:<+>' not found for invocant of > class 'Any', but the docs > <https://docs.perl6.org/language/operators#prefix_+> say it is defined as > a multi on Any…. > > Trey > > > On Sun, Sep 18, 2016 at 4:37 PM Brandon Allbery > wrote: > >> On Sun, Sep 18, 2016 at 4:31 PM, Parrot Raiser <1parr...@gmail.com> >> wrote: >> >>> but seems to have a problem with larger numbers: >>> >>> 7 >>> 3 >>> 21 <- This >>> 2 >>> 1 >>> 0 >>> 4 >>> bamm-bamm >>> barney >>> (Any) <--- Produces this >>> betty >>> fred >>> 0 out of range 1..7 >>> dino >>> >> >> [18 20:35] m: say so "21" ~~ 1..7 >> [18 20:35] rakudo-moar 34f950: OUTPUT«True >> [18 20:35] » >> >> It came from lines(), it is a Str. Numify it first. >> >> -- >> brandon s allbery kf8nh sine nomine >> associates >> allber...@gmail.com >> ballb...@sinenomine.net >> unix, openafs, kerberos, infrastructure, xmonad >> http://sinenomine.net >> >
Re: Justification for the "reversed" instruction format
In Perl 6, we apply those constraints when you pass off the call to the ultimate recipient, and that's important because that recipient can have multiple signatures (and signatures can be added *after* you define the flip). For example: $ cat foo.p6 sub flip(&f) { -> $b, $a, |c { f($a, $b, |c) } } multi sub counter(Int $start, Int $end, :$by=1) { $start, *+$by ... $end } multi sub counter(Str $start, Str $end, :$by=1) { ($start.ord, *+$by ... $end.ord).map: {.chr} } my &flip-counter = flip &counter; say flip-counter 10, 2, :by(2); say flip-counter 'q', 'k', :by(2); say flip-counter 3.7, 1, :by(2); $ perl6 foo.p6 (2 4 6 8 10) (k m o q) Cannot resolve caller counter(Int, Rat, Int); none of these signatures match: (Int $start, Int $end, :$by = 1) (Str $start, Str $end, :$by = 1) in block at foo.p6 line 3 Aaron Sherman, M.: P: 617-440-4332 Google Talk, Email and Google Plus: a...@ajs.com Toolsmith, developer, gamer and life-long student. On Thu, Sep 8, 2016 at 2:41 PM, Trey Harris wrote: > On Thu, Sep 8, 2016 at 9:23 AM Aaron Sherman a...@ajs.com > <http://mailto:a...@ajs.com> wrote: > > I don't know Haskell, but isn't flip just: >> >> sub flip(&f) { -> $b, $a, |c { f($a, $b, |c) } } >> >> And then: >> >> perl6 -e 'sub flip(&f) { -> $a, $b, |c { f($b, $a, |c) } }; my &yas = >> flip &say; yas(1,2,3)' >> 213 >> >> Yes—but my worry about that was that it wouldn’t carry over signature > constraints and coercions as a specific argument-flipper sub written with > the same signature (flipped). Haskell does deep type inference, unlike Perl > 6, so simply writing flip f x y = f y x is sufficient to get exactly the > same behavior in constrained or coercive contexts. > > >> Aaron Sherman, M.: >> P: 617-440-4332 Google Talk, Email and Google Plus: a...@ajs.com >> Toolsmith, developer, gamer and life-long student. 
>> >> >> On Wed, Sep 7, 2016 at 6:13 PM, Brandon Allbery >> wrote: >> >>> On Wed, Sep 7, 2016 at 6:08 PM, Parrot Raiser <1parr...@gmail.com> >>> wrote: >>> >>>> There is a "flip" in P6, to reverse the characters in a string, and a >>>> "reverse", to return the elements of a list. Would either of those be >>>> an equivalent? >>>> >>> >>> Not without an "apply" mechanism used for function / method / operator >>> invocations. Which is almost viable in Perl 6 since the parameters get >>> passed as a list --- except that the list is only visible within the >>> implementation, not at the call site (which is what "apply" does). >>> >>> -- >>> brandon s allbery kf8nh sine nomine >>> associates >>> allber...@gmail.com >>> ballb...@sinenomine.net >>> unix, openafs, kerberos, infrastructure, xmonad >>> http://sinenomine.net >>> >> >> >
The use of $!attr vs self.attr in core libraries
In working with Range a while back, I was frustrated to find that writing a subclass that wanted to override an accessor (e.g. for $.min and $.max) was quite difficult because most methods ignored the accessors and called $!min and $!max or wrote to them directly. If I really wanted to change the semantics, I pretty much had to re-write or at least wrap every existing method, which would make my module nigh unmaintainable. Is this for performance reasons? Would it make sense to try to find a way to make this easier? As a concrete example, here's how "elems" is defined on Range: https://github.com/rakudo/rakudo/blob/32902f25ca753860067a34eb9741aa5524dbe64e/src/core/Range.pm#L96 -- Aaron Sherman, M.: P: 617-440-4332 Google Talk, Email and Google Plus: a...@ajs.com Toolsmith, developer, gamer and life-long student.
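[Archive editor's note: a toy illustration of the subclassing problem described above; the class names are mine, not Rakudo's:

```raku
class Span {
    has $.min;
    has $.max;

    # Written against the private attributes, as Rakudo's Range
    # methods are: a subclass's accessor override never fires here.
    method width() { $!max - $!min }

    # Written against the public accessors: a subclass can hook in.
    method width-virtual() { self.max - self.min }
}

class ClampedSpan is Span {
    method max() { min(callsame, 100) }   # clamp the inherited accessor
}

my $s = ClampedSpan.new(min => 0, max => 500);
say $s.width;          # 500: direct $!max access ignores the override
say $s.width-virtual;  # 100: self.max respects it
```

Direct attribute access is cheaper (no dispatch), which is presumably the rationale in the core setting, but it does make classes like Range effectively closed to this kind of extension.]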
Re: Justification for the "reversed" instruction format
I don't know Haskell, but isn't flip just: sub flip(&f) { -> $b, $a, |c { f($a, $b, |c) } } And then: perl6 -e 'sub flip(&f) { -> $a, $b, |c { f($b, $a, |c) } }; my &yas = flip &say; yas(1,2,3)' 213 Aaron Sherman, M.: P: 617-440-4332 Google Talk, Email and Google Plus: a...@ajs.com Toolsmith, developer, gamer and life-long student. On Wed, Sep 7, 2016 at 6:13 PM, Brandon Allbery wrote: > On Wed, Sep 7, 2016 at 6:08 PM, Parrot Raiser <1parr...@gmail.com> wrote: > >> There is a "flip" in P6, to reverse the characters in a string, and a >> "reverse", to return the elements of a list. Would either of those be >> an equivalent? >> > > Not without an "apply" mechanism used for function / method / operator > invocations. Which is almost viable in Perl 6 since the parameters get > passed as a list --- except that the list is only visible within the > implementation, not at the call site (which is what "apply" does). > > -- > brandon s allbery kf8nh sine nomine > associates > allber...@gmail.com > ballb...@sinenomine.net > unix, openafs, kerberos, infrastructure, xmonad > http://sinenomine.net >
Re: Justification for the "reversed" instruction format
Oh, and note that you can pass R'd reductions as if they were normal prefix ops: $ perl6 -e 'sub dueet(&op, *@list) { op @list }; say dueet &prefix:<[R-]>, 1..100' -4850 On Tue, Sep 6, 2016 at 12:51 PM, Aaron Sherman wrote: > > > $ perl6 -e 'my @numbers = 1..100; say [-] @numbers; say [R-] @numbers' > -5048 > -4850 > > In general, it's kind of pointless with bare infix ops, as you can just > reverse the arguments, but when reducing or the like, it becomes much more > valuable. > > > > On Tue, Sep 6, 2016 at 12:43 PM, Parrot Raiser <1parr...@gmail.com> wrote: > >> I've just stumbled across "reversed operators", e.g. say 4 R/ 12; # 3 >> in the documentation. I'm curious to know why the language includes >> them? I'm having trouble understanding where they would be useful. >> > >
Re: Justification for the "reversed" instruction format
$ perl6 -e 'my @numbers = 1..100; say [-] @numbers; say [R-] @numbers' -5048 -4850 In general, it's kind of pointless with bare infix ops, as you can just reverse the arguments, but when reducing or the like, it becomes much more valuable. On Tue, Sep 6, 2016 at 12:43 PM, Parrot Raiser <1parr...@gmail.com> wrote: > I've just stumbled across "reversed operators", e.g. say 4 R/ 12; # 3 > in the documentation. I'm curious to know why the language includes > them? I'm having trouble understanding where they would be useful. >
Update for S32::Str and musings on sprintf
In the documentation for sprintf ( http://perlcabal.org/syn/S32/Str.html#sprintf) I suggest changing: The $format is scanned for % characters. Any % introduces a format token. > Format tokens have the following grammar: to: The $format is scanned for % characters. Any % introduces a format token. > The simplest format tokens are a % followed by a letter which is called a > "directive". Directives guide the use (if any) of the rest of sprintf's > arguments. Between the % and the directive, a number of controls such as > precision and width formatting can be introduced. > In detail, format tokens have the following grammar: ... and remove the two sentences that follow the grammar itself. It's a bit harsh to drop the casual reader right into the grammar for format tokens without explaining what the grammar is for. As for the grammar itself, can we remove the commit after the initial %? Is it required for the documentation? On "index" modifiers: Do we need a string variant for named parameters in P6? E.g.: sprintf('%(time)$s: %(msg)$s', :$msg, :$time); as an alternative to: sprintf('%2$s: %1$s', $msg, $time); Python has something like this ( http://docs.python.org/library/stdtypes.html#string-formatting), but they drop the $ and just use "%(name)d" where d is the directive... it's not clear to me if the presence of the $ in P6 helps (because it maintains visual compatibility with numeric indexes) or hinders (because it's not required and therefore just makes the format longer for no reason)... Also documenting each part of the grammar is, of course, required. For example, here's a description of the "index" portion of the grammar: The index modifies the default behavior of pulling needed values from > sprintf's remaining arguments, in order. Instead, the given index number > addresses the positional argument list to sprintf, with the format string > itself being index zero and subsequent positional parameters being one and > so-forth. 
In that description, I'm implicitly allowing for a format string which includes itself ("%0$s"), which is not part of the previous standards, but I think it doesn't hurt, and avoids arbitrarily disallowing it the way P5 and POSIX do... The vector flag should probably go. Its only primary use is unpacking raw IPv4 addresses, and that can be done in a dozen other trivial ways in Perl 6. TMTOWTDI, but this is a particularly archaic way that I can't imagine any new code wanting to use... For the rest, I think we can copy the documentation for precision and flags from: http://perldoc.perl.org/functions/sprintf.html Also, a review of the POSIX documentation might reveal additional items that should be documented: http://pubs.opengroup.org/onlinepubs/9699919799/functions/sprintf.html -- Aaron Sherman P: 617-440-4332 Google Talk: a...@ajs.com / aaronjsher...@gmail.com "Toolsmith" and developer. Player of games. Buyer of gadgets.
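[Archive editor's note: for comparison, the explicit-index form discussed above already works; the named `%(name)$s` variant proposed in this post is hypothetical:

```raku
my $msg  = 'disk full';
my $time = '12:34';

# %N$s pulls the Nth positional argument (1-based), so the format
# can reorder or reuse arguments independently of their order:
say sprintf('%2$s: %1$s', $msg, $time);   # 12:34: disk full
say sprintf('%1$s / %1$s', $msg);         # disk full / disk full
```

The proposed `sprintf('%(time)$s: %(msg)$s', :$msg, :$time)` would presumably desugar to the positional form, with named arguments bound to indexes.]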
lol context and X
[This might be better suited to p6c if it turns out that this is a bug, but I'll assume it's not to start...] So, last night sorear said, "I might write the loop as for 2 .. $lim X 2 .. $lim -> $a, $b {" I played around with this a bit, and I'm unclear on how this works. Here's some examples that I tried: # parens on the arglist causes flattening? $ ../rakudo/perl6 -e 'for 1 .. 2 X 4 .. 5 -> ($a, $b) { say $a.perl, $b.perl }' Not enough positional parameters passed; got 0 but expected 2 in sub-signature in at line 1 in main program body at line 1 # Lack of parens gives lol context? $ ../rakudo/perl6 -e 'for 1 .. 2 X 4 .. 5 -> $a, $b { say $a.perl, $b.perl }' 14 15 24 25 # Default context is flat? $ ../rakudo/perl6 -e 'for 1 .. 2 X 4 .. 5 { say .perl }' 1 4 1 5 2 4 2 5 # lol just not implemented in rakudo, or am I doing it wrong? $ ../rakudo/perl6 -e 'for lol 1 .. 2 X 4 .. 5 { say .perl }' Could not find sub &lol in main program body at line 1 # capture flattens? that seems really non-intuitive $ ../rakudo/perl6 -e 'for |(1 .. 2 X 4 .. 5) { say .perl }' \(1, 4, 1, 5, 2, 4, 2, 5) # hyper-. flattens? $ ../rakudo/perl6 -e '(1 .. 2 X 4 .. 5)>>.join(",").say' 14152425 Can someone explain why these all behave so differently, and why we chose to flatten so aggressively in so many cases, but not in some others? -- Aaron Sherman P: 617-440-4332 Google Talk: a...@ajs.com / aaronjsher...@gmail.com "Toolsmith" and developer. Player of games. Buyer of gadgets.
Return value of try
I was listening to the recent IO conversation on p6c, and decided to look at IO.pm in rakudo. I immediately saw a bit of code that worried me: try { ?$!PIO.close() } $! ?? fail($!) !! Bool::True Why is that so cumbersome? That seems like one of the most obvious use-cases for exceptions. I looked over the synopses and tested some theories, but what I came to was inconclusive. It seems like: return try { die "oops" } has undefined behavior because the return value of try is not clearly spelled out (one might expect it to be the return value of the statement, but that's not said anywhere, though it's backhandedly implied by an example where try is used like do). Right now, the above causes a Null PMC access in Rakudo, but that might just be a limitation in the implementation, rather than an indication that try is not intended to have a return value. I'd like to suggest that try's semantics should be defined and clearly stated in S04/"Other do-like forms" as: Try returns either the block's normal return value or the relevant, unthrown exception. Thus the above IO example becomes (with no changes in the grammar): try { ?$!PIO.close(); Bool::True } The golfer in me wants ? to take an adverb that causes this to all be moot: ? :true $!PIO.close() but I know that's probably going a bit too far, and I'd like to think it's too late for that kind of feature request.
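[Archive editor's note: for the record, later Rakudo releases settled on semantics close to what this post proposes. A sketch of the behavior in modern Raku (this postdates the email; the close idiom at the end is mine):

```raku
# `try` evaluates to its block's value on success...
my $ok = try { 21 * 2 };
say $ok;            # 42

# ...and to Nil on failure, with the exception stored in $!:
my $bad = try { die 'oops' };
say $bad.defined;   # False
say $!.message;     # oops

# So the cumbersome close() code can be written roughly as:
#   try { $handle.close; True } // fail($!)
```

The exception itself is not returned (contra the proposal here); it is parked in `$!` for inspection.]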
Re: threads?
On Thu, Oct 21, 2010 at 6:04 PM, Darren Duncan wrote: > Aaron Sherman wrote: > >> > Things that typically precipitate threading in an application: >> >> - Blocking IO >> - Event management (often as a crutch to avoid asynchronous code) >> - Legitimately parallelizable, intense computing >> >> Interestingly, the first two tend to be where most of the need comes from >> and the last one tends to be what drives most discussion of threading. >> > > The last one in particular would legitimately get attention when one > considers that it is for this that the concern about using multi-core > machines efficiently comes into play. That sounds great, but what's the benefit to a common use case? Sorting lists with higher processor overhead and waste heat in applications that traditionally weren't processor-bound in the first place? Over the past 20+ years, I've seen some very large, processor-bound applications that could (and in some cases, did) benefit from threading over multiple cores. However, they were so far in the minority as to be nearly invisible, and in many cases such applications can simply be run multiple times per host in order to VERY efficiently consume every available processor. The vast majority of my computing experience has been in places where I'm actually willing to use Perl, a grossly inefficient language (I say this, coming as I do from C, not in comparison to other HLLs), because my performance concerns are either non-existent or related almost entirely to non-trivial IO (i.e. anything sendfile can do). > The first 2 are more about lowering latency and appearing responsive to a > user on a single core machine. Write me a Web server, and we'll talk. Worse, write a BitTorrent client that tries to store its results into a high performance, local datastore without reducing theoretical, back-of-the-napkin throughput by a staggering amount. Shockingly enough, neither of these frequently used examples are processor-bound. 
The vast majority of today's applications are written with network communications in mind to one degree or another. "The user," isn't so much interesting as servicing network and disk IO responsively enough that hardware and network protocol stacks wait on you to empty or fill a buffer as infrequently as possible. This is essential in such rare circumstances as: - Database intensive applications - Moving large data files across wide area networks - Parsing and interpreting highly complex languages inline from data received over multiple, simultaneous network connections (sounds like this should be rare, but your browser does it every time you click on a link) Just in working with Rakudo, I have to use git, make and Perl itself, all of which can improve CPU performance all they like, but will ultimately run slow if they don't handle reading dozens of files, possibly from multiple IO devices (disks, network filesystems, remote repositories, etc) as responsively as possible. Now, to back up and think this through, there is one place where multi-core processor usage is going to become critical over the next few years: phones. Android-based phones are going multi-core within the next six months. My money is on a multi-core iPhone within a year. These platforms are going to need to take advantage of multiple cores for primarily single-application performance in a low-power environment. So, I don't want you to think that I'm blind to the need you describe. I just don't want you to be unrealistic about the application balance out there. I think that Perl 6's implicit multi-threading approach such as for > hyperops or junctions is a good best first choice to handle many common > needs, the last list item above, without users having to think about it. > Likewise any pure functional code. 
-- Darren Duncan > It's very common for people working on the design or implementation of a programming language to become myopic with respect to the importance of executing code as quickly as possible, and I'm not faulting anyone for that. It's probably a good thing in most circumstances, but in this case, assuming that the largest need is going to be the execution of code turns out to be a misleading instinct. Computers execute code far, far less than you would expect, and the cost of failing to service events is often orders of magnitude greater than the cost of spending twice the number of cycles doing so. PS: Want an example of how important IO is? Google has their own multi-core friendly network protocol modifications to Linux that have been pushed out in the past 6 months: http://www.h-online.com/open/features/Kernel-Log-Coming-in-2-6-35-Part-3-Network-support-1040736.html They had to do this because single cores can no longer keep up with the network.
Re: threads?
On Tue, Oct 12, 2010 at 10:22 AM, Damian Conway wrote: > Perhaps we need to think more Perlishly and reframe the entire question. > Not: "What threading model do we need?", but: "What kinds of non-sequential > programming tasks do we want to make easy...and how would we like to be > able to specify those tasks?" > Things that typically precipitate threading in an application: - Blocking IO - Event management (often as a crutch to avoid asynchronous code) - Legitimately parallelizable, intense computing Interestingly, the first two tend to be where most of the need comes from and the last one tends to be what drives most discussion of threading. Perhaps it would make more sense to discuss Perl 6's event model (glib, IMHO, is an excellent role model, here -- http://en.wikipedia.org/wiki/Event_loop#GLib_event_loop ) and async IO model before we deal with how to sort a list on 256 cores...
Re: threads?
I've done quite a lot of concurrent programming over the past 23ish years, from the implementation of a parallelized version of CLIPS back in the late 80s to many C, Perl, and Python projects involving everything from shared memory to process pooling to every permutation of hard and soft thread management. To say I'm rusty, however, would be an understatement, and I'm sure my information is sorely out of date. What I can contribute to such a conversation, however, is this:

- Make the concepts of "process" and "thread" an implementation detail rather than separate worlds, and your users won't learn to fear one or the other.
- If the programmer has to think about semaphore management, there's already a problem.
- If the programmer's not allowed to think about semaphore management, there's already a problem.
- Don't paint yourself into a corner when it comes to playing nice with local interfaces.
- If your idea of instantiating a "thread" involves creating an OS VM, then you're probably lighter weight than Python's threading model, but I'd suggest paring it down some more. It's "thread," not "ringworld" (I was going to say "not 'space elevator,'" but it seemed insufficient to the examples I've seen).

I know that's pretty high-level, but it's what I've got. I think I wrote my last threaded application in 2007.
Re: Buf.pm: FIFO and grammar
On Fri, Aug 13, 2010 at 8:11 PM, Jon Murray wrote:
> My understanding from the synopses was that you get the Perl 5 behaviour if
> you omit the signature on your function declaration (though I
> unfortunately can't check as I don't have Rakudo installed):
>
> sub foo { @_[0] = 1 }
> my $a = 0;
> foo($a);
> say $a; # 0

Nope. In fact, as you indicated in the comment you left in, that prints 0 just like the first example. In neither case is $a modified. In Perl 5, on the other hand, the passed value can be modified:

    $ ./perl6 -e 'sub foo { @_[0] = 1 } ; my $a = 0; foo($a); say $a'
    0
    $ perl -le 'sub foo { $_[0] = 1 } my $a = 0; foo($a); print $a'
    1

You might well be correct about how it's supposed to work, but that's certainly not the current behavior.

On Fri, 2010-08-13 at 12:06 -0400, Aaron Sherman wrote:
> On Fri, Aug 13, 2010 at 11:27 AM, Jonathan Worthington wrote:
>> I saw a video camera in the room, but not sure when we'll be seeing the
>> footage from that. In the meantime, the slides are at:
>>
>> http://www.jnthn.net/papers/2010-yapc-eu-signatures.pdf
>
> Nice talk! One minor nit, and perhaps I'm just misunderstanding some subtle
> use of the terminology, but you say:
>
> "In Perl 5, you get a copy of the arguments to work with in @_."
>
> However, this isn't true (again, unless I'm misunderstanding you). @_ is a
> by-reference list of positional parameters (Perl 5 only has positionals)
> which are all read-write, which, it's interesting to note, is impossible in
> Perl 6... well, at least in Rakudo, as I'm not sure what the behavior is
> supposed to be, but a slurpy positional list in Rakudo that's declared "is
> rw" does not change the values passed in:
>
> sub foo(*@_ is rw) { @_[0] = 1 }
> my $a = 0;
> foo($a);
> say $a; # 0
>
> Kind of interesting that you can't easily emulate Perl 5's parameter
> passing...
-- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: Buf.pm: FIFO and grammar
On Fri, Aug 13, 2010 at 11:27 AM, Jonathan Worthington wrote:
> I saw a video camera in the room, but not sure when we'll be seeing the
> footage from that. In the meantime, the slides are at:
>
> http://www.jnthn.net/papers/2010-yapc-eu-signatures.pdf

Nice talk! One minor nit, and perhaps I'm just misunderstanding some subtle use of the terminology, but you say:

"In Perl 5, you get a copy of the arguments to work with in @_."

However, this isn't true (again, unless I'm misunderstanding you). @_ is a by-reference list of positional parameters (Perl 5 only has positionals), all of which are read-write -- which, it's interesting to note, is impossible in Perl 6... well, at least in Rakudo, as I'm not sure what the behavior is supposed to be, but a slurpy positional list in Rakudo that's declared "is rw" does not change the values passed in:

    sub foo(*@_ is rw) { @_[0] = 1 }
    my $a = 0;
    foo($a);
    say $a; # 0

Kind of interesting that you can't easily emulate Perl 5's parameter passing...

-- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
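The distinction being argued here can be modeled in any language that has both shared, mutable views and plain local bindings. A rough Python analogy (an illustration of the observable semantics only, not of how either Perl implements parameters): Perl 5's @_ behaves like a mutable list shared with the caller, while a Perl 6 parameter behaves like a local binding whose reassignment the caller never sees.

```python
def perl5_style(args):
    # args plays the role of @_: writes through it reach the caller.
    args[0] = 1

def perl6_style(x):
    # x plays the role of a bound parameter: rebinding is purely local.
    x = 1
    return x

caller_slot = [0]
perl5_style(caller_slot)
print(caller_slot[0])  # 1 -- the caller's value changed, as with $_[0] = 1 in Perl 5

a = 0
perl6_style(a)
print(a)  # 0 -- the caller's value is untouched, as in the Rakudo example
```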
Re: Buf.pm: FIFO and grammar
On Thu, Aug 12, 2010 at 5:47 AM, Carl Mäsak wrote:
> Oha (>):
>> * Grammars define a hierarchical structure that seems to be perfect
>> for encoding the packing of larger pieces of data, for example when
>> serializing an object structure. Could one use grammars, or something
>> very much like it, as a "modern" &pack template?

A while back we had a fairly productive conversation about matching regexes on non-textual data: http://groups.google.com/group/perl.perl6.language/msg/24f23fdfc0c5d459?hl=en

My proposal there is incomplete, but matching structured data (even if that structure is just a sequence of non-textual bytes) using rules definitely makes sense to me. The real question is: what is the least invasive way to do it? Modifying nqp for this purpose would, I think, not make sense. I think we'd need a pure Perl implementation of the rules engine that could match either text or data, and that's a gigantic undertaking.

-- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
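For a taste of what "rules over bytes" buys, here is a small Python sketch in the spirit of a modern &pack template. The record layout and all names are invented for illustration; the point is the declarative, rule-by-rule description of a binary structure:

```python
import struct

# Hypothetical record: a one-byte version followed by a big-endian
# 16-bit payload length. Each "rule" is a (field name, struct format) pair.
RECORD = [("version", "B"), ("length", ">H")]

def parse(data, rules=RECORD):
    """Walk the rules in order, unpacking each named field from the data."""
    fields, offset = {}, 0
    for name, fmt in rules:
        (fields[name],) = struct.unpack_from(fmt, data, offset)
        offset += struct.calcsize(fmt)
    # The header's own fields drive how much payload to take -- the kind of
    # context-sensitivity that makes grammars attractive for binary data.
    payload = data[offset:offset + fields["length"]]
    return fields, payload

header, payload = parse(struct.pack(">BH", 1, 5) + b"hello")
print(header, payload)  # {'version': 1, 'length': 5} b'hello'
```

A real rules engine would add alternation, repetition, and backtracking on top of this; the sketch only shows the sequential case.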
Re: pattern alternation (was Re: How are ...)
On Thu, Aug 5, 2010 at 2:43 PM, Tyler Curtis wrote:
> On Thu, Aug 5, 2010 at 12:28 PM, Aaron Sherman wrote:
>> While that's a nifty special case (I'm sure it will surprise me someday,
>> and I'll spend a half hour debugging before I remember this mail), it
>> doesn't help in the general case (see my example grammar, below).
>
> In the general case, no. In the case of your grammar, and all
> grammars, it does help.
>
> All regex routines, when called standalone, are anchored to the
> beginning and end of the string. So, having "^" and "$" at the
> beginning and end of your TOP is a no-op unless some other rule calls
> it as a subrule.

There's something deeply disturbing to me in that... but I can't fully express what it is. It just feels like I'm going to end up debugging mountains of code written by people who didn't understand that that was the case. Several times over the past few weeks, I've mentioned something on this list only to find that, buried somewhere deep in a synopsis, there was a special case I was unaware of. The sheer volume of silent special cases in Perl 6 appears to be dwarfing that of Perl 5, but perhaps that's just because I know Perl 5 far better than I know Perl 6. Mind you, I'm not complaining, so much as working out how I feel out loud. Am I the only one who feels this way at this point?

> :oneline or similar might be useful. I'm not sure about :rootedend and
> :rootedstart.

Are you saying that you can't think of examples of where you want to root a regex only to the start or end, or that you just don't think you need an adverb to do it? If the former, then I submit the 1536 examples of matching only at the end of strings in my local Perl library (mostly for matching whitespace or filename extensions, it looks like) and the 3199 examples of matching only at the start, which include headers of all types (RFC 2822 and friends, HTTP, CPAN configs, etc.), whitespace, command sequence matching (e.g. /^GET /) and so on.
If the latter, then I guess you and I just have a different take here, and that's fine. I respect your opinion, but in this case, I happen to disagree.

PS: You can also search through any typical Python install for "\.match", which will yield quite a lot of additional examples. I don't know Ruby or Java very well, or I'd go looking for examples there too.

-- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: pattern alternation (was Re: How are ...)
On Thu, Aug 5, 2010 at 11:09 AM, Patrick R. Michaud wrote:
> On Thu, Aug 05, 2010 at 10:27:50AM -0400, Aaron Sherman wrote:
>> On Thu, Aug 5, 2010 at 7:55 AM, Carl Mäsak wrote:
>>> I see this particular thinko a lot, though. Maybe some Perl 6 lint
>>> tool or another will detect when you have a regex containing ^ at its
>>> start, $ at the end, | somewhere in the middle, and no [] to
>>> disambiguate.
>>
>> You know, this problem would go away, almost entirely, if we had a :f[ull]
>> adverb for regex matching that imposed ^[...]$ around the entire match.
>> Then your code becomes:
>>
>> m:f/<[A..Z]>+|<[a..z]>+/
>
> There's a version of this already. Matching against an explicit 'regex',
> 'token', or 'rule' automatically anchors it on both ends. Thus:
>
>    $string ~~ regex { <[A..Z]>+ | <[a..z]>+ }
>
> is equivalent to
>
>    $string ~~ regex { ^ [ <[A..Z]>+ | <[a..z]>+ ] $ }

While that's a nifty special case (I'm sure it will surprise me someday, and I'll spend a half hour debugging before I remember this mail), it doesn't help in the general case (see my example grammar, below). After doing some more thinking and comparing this to other languages (Python, for example, has "match", which matches only at the start of a string), it seems to me that there is a sort of out-of-band need to have a more general solution at match time. Here's my second-pass suggestion:

    m:r  / m:rooted      -- Match is rooted on both ends ("^...$")
    m:rs / m:rootedstart -- Match is rooted at the start of string ("^", a la Python re.match)
    m:re / m:rootedend   -- Match is rooted at the end of string ("$")
    m:rn / m:rootednone  -- Match is not rooted (default)
    m:o  / m:oneline     -- Modify :r and friends to use ^^/$$

Here's one way I can see that being routinely used:

    # Simplistic shell scripts
    rule TOP :r {*}       # Match the whole script
    rule stmt :r :o { * } # One statement per line

The other way to go about that would be with parameterized adverbs.
I'm not sure how comfy people are with those, but they're in the spec. So this:

    m:r / m:rooted -- Match is rooted (default is ^...$)

    Parameters:
    :s / :start   -- Match is rooted only at start ("^")
    :e / :end     -- Match is rooted only at end ("$")
                     [note: :s :e should produce a warning]
    :n / :none    -- Match is not rooted (null modifier)
                     [note: combining :n with :s or :e should warn]
    :o / :oneline -- Use ^^ and $$ instead of ^ and $
                     [note: combining :o with :n should warn?]

So our statement-matching grammar becomes:

    rule TOP :r {*}
    rule stmt :r(:o) { * }

The clown nose is just a side benefit ;-) Seriously, though, I prefer :r(:o) because :r:o looks like it should be the opposite of :rw (there is no :ro, as far as I know).

PS: I see no reason that any of this is needed for 6.0.0

-- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
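For prior art on the four rootings being proposed, Python's re module already exposes three of them as separate entry points, with end-rooting spelled in the pattern itself. A quick sketch of the correspondence (the mapping to the proposed adverbs is my reading, not anything agreed on):

```python
import re

pat = re.compile(r"[a-z]+")

# Rooted on both ends, like the proposed m:r
assert pat.fullmatch("abc")
assert not pat.fullmatch("abc123")

# Rooted at the start only, like m:rs (this is Python's re.match)
assert pat.match("abc123")
assert not pat.match("123abc")

# Unrooted, like the default m:rn
assert pat.search("123abc")

# Rooted at the end only, like m:re -- spelled with $ in the pattern
assert re.search(r"[a-z]+$", "123abc")
assert not re.search(r"[a-z]+$", "abc123")
```

That three of the four need their own API entry points in Python is, if anything, evidence that the rootings are distinct enough to deserve first-class spellings.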
Re: pattern alternation (was Re: How are ...)
On Thu, Aug 5, 2010 at 7:55 AM, Carl Mäsak wrote: > Darren (>>>>), Carl (>>>), Darren (>>), Patrick (>): > > > In this case yes -- the original pattern without the square brackets > > would act like: > > > >/ [^ <[A..Z]>+] | [<[a..z]>+ $] / > > > > In other words, the original pattern says "starting with uppercase > > or ending with lowercase". > > I see this particular thinko a lot, though. Maybe some Perl 6 lint > tool or another will detect when you have a regex containing ^ at its > start, $ at the end, | somewhere in the middle, and no [] to > disambiguate. > > You know, this problem would go away, almost entirely, if we had a :f[ull] adverb for regex matching that imposed ^[...]$ around the entire match. Then your code becomes: m:f/<[A..Z]>+|<[a..z]>+/ for grins, :f[ull]l[ine] could use ^^ and $$. I suspect :full would almost always be associated with TOP, in fact. Boy am I tired of typing ^ and $ in TOP ;-) -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: How are unrecognized options to built-in pod block types treated?
On Wed, Aug 4, 2010 at 10:05 PM, Damian Conway wrote: > Darren suggested: > > > Use namespaces. > > The upper/lower/mixed approach *is* a > namespace approach. > It's a very C-like approach, but yes, it's certainly a crude sort of namespace. Perl already has a more robust and modern namespace system, however. Using it would seem wise. > > Explicit versioning is your friend. > > > > Can I get some support for this? > > Not from me. ;-) > > I think it's a dreadful prospect to allow people to > write documentation that they will have to rewrite when > the Pod spec gets updated. I would hope... really, desperately hope that the POD spec changing would be the least of anyone's worries. If you're writing documentation, it's a foregone conclusion that it has to be maintained, just like any other part of your software. If the POD spec is adding new config options at a rate that isn't several orders of magnitude less than the frequency with which your code changes then either you're documenting the Magna Carta or we have a problem with our documentation system. If the latter is the case, then the right solution is to provide new documentation features via modules and allow the user to select which new features they desire, automatically resolving the problem, since old docs simply won't pull in newer features. This could go both ways, as well. "use v6" might get you the default first-pressing documentation features of Perl 6.0.0 while "use v6.1" might get you the default features of 6.1. Then you could mix it up: use v6; use Docs::SectionImage; > Or, alternatively, to require all > Pod parsers to be infinitely backwards compatible across > all versions. :-( > If you never want documentation to break, then that's your only option. Someday we're going to decide to make an incompatible change to Perl's documentation system, and we'll have a very good reason to do so, I'd imagine. The right thing to do will be to make sure that we roll it out carefully and with all due warning. 
-- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: How are unrecognized options to built-in pod block types treated?
On Wed, Aug 4, 2010 at 6:00 PM, Carl Mäsak wrote:
> Straight to an example:
>
> =for head1 :image
> Steaming hot C loops

Interesting that this comes up right as I was composing my "help" email ;)

> I went looking for whether this is allowed or not. Is this allowed?
> S26 only tells me this about config options:
>
> "Pod predefines a small number of standard configuration options that
> can be applied uniformly to any built-in block type."
>
> To me, "predefines" could mean either "we made these for you; use only
> those" or "we made these for you; go wild and invent your own too if
> you wish".

I see no reason for it to not simply store any additional values away for potential future use.

> It also has this to say about block types:
>
> "Typenames that are entirely lowercase (for example: C<=begin head1>)
> or entirely uppercase (for example: C<=begin SYNOPSIS>) are reserved."
>
> But it's clear from the context of that sentence that this only
> pertains to blocks. There's no indication that this goes for the
> config options as well.

I dislike "reserved" in this context, but understand why the namespace has to be shared. For config options, I'd say anything should go, but people inventing their own config options should be aware that across N release cycles, new options may be introduced. THAT in turn means that we need a way to add config options in a point release in order to push out new features in reasonable time, and that means they need their own namespace. I'd suggest:

    =for head1 :reserved('image=foo.jpg')

which is identical to:

    =for head1 :image('foo.jpg')

except for the fact that any unrecognized option of the first form is an error and any unrecognized option of the second form is allowed. That way, new features can be added to :reserved and migrated over time to stand-alone options after being listed in the release notes for a couple of cycles.

-- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: Smart match isn't on Bool
On Tue, Aug 3, 2010 at 3:30 PM, David Green wrote: > On 2010-08-02, at 2:35 pm, TSa (Thomas Sandlaß) wrote: > > On Monday, 2. August 2010 20:02:40 Mark J. Reed wrote: > >> [...] it's at least surprising. I'd expect (anything ~~ True) to be > synonymous with ?(anything) > > Note also that ($anything ~~ foo()) just throws away $anything. > > No; only if foo() returns a Bool. How do you define that? Any type which is compatible with Bool? What happens if, in a bold and innovative stroke, some future revision of Perl 6 decides that Type is compatible with Bool? Who wins in "$a ~~ Int"? If my function returns "1 but False" will "1 ~~ func()" match? I guess what I'm really asking, here, is do we use the first rule in the smart-match table from S03 or is there some other priority system at work? Do we consider only exact type matches and shunt everything else into Any or do we call X.does()? On a related note, I re-read S03 a bit and I've come to the conclusion that the ~~ op (not the implicit smart match of given/when) should produce a warning when its RHS is Bool (assuming we resolve what "is Bool" means, above). There's just no good reason to confuse the issue that way, and the Bool behavior of smart-matching is really just there to support given/when. Beyond that, once we start getting into deeper warnings on semantic mistakes, it's likely that it will make sense to warn at run-time if when is asked to work on a boolean value which isn't the result of some form of comparison. Again, it's just confusing for no benefit. If we add the above warnings and the ones already listed in S03, then I think I'll be fine with it. I understand why we want "when .foo == .bar" and I can't think of a good way to replace it, so I'll buy that it's worth making so many other obvious uses deprecated. -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: Natural Language and Perl 6
On Sun, Aug 1, 2010 at 6:46 AM, Timothy S. Nelson wrote: >Hi. I'm wondering if any thought has been given to natural language > processing with Perl 6 grammars. > > Yes. ;) -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: Smart match isn't on Bool
On Mon, Aug 2, 2010 at 2:02 PM, Mark J. Reed wrote:
> On Sun, Aug 1, 2010 at 6:02 PM, Jonathan Worthington wrote:
>> No, given-when is smart-matching. The RHS of a smart-match decides what
>> happens. If you do True ~~ 1 then that's 1.ACCEPTS(True) which is going to
>> do +True and thus match.
>
> OK, but what about 0 ~~ True? That's what started this thread,
> extricated from the complicating trappings of given/when. Right now,
> (anything ~~ True) is true, and IMHO that's a misfeature; it's at
> least surprising. I'd expect (anything ~~ True) to be synonymous with
> ?(anything): true only if the LHS boolifies to true. By the same
> token, (anything ~~ False) would be synonymous with !?(anything).

Again, sorry for starting a long thread (I seem to do that, and I swear I'm not trying... just pointing out the surprises I run into as I try to code). I want to stress that what you've said above is kind of moot: the spec says that ~~ True gets a parser warning, so we can consider that deprecated. The only usage we're supporting here is a Bool value stored in a variable or otherwise generated. I would argue that that's even worse. For example:

    my Bool $trash-status = take-out-the-trash();
    # ... some time later ...
    my Bool $dishes-status = wash-the-dishes();
    if !$dishes-status && $dishes-status ~~ $trash-status {
        say "No chores this week!";
    }

Of course, that's a bug, but imagine the poor maintenance programmer that tries to figure out what's going on. I feel for him/her. The only advantage he/she will have is that this is likely to be so common an error that they'll quickly learn to look for it first when smart-matching is involved :-/

-- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: Smart match isn't on Bool
On Sat, Jul 31, 2010 at 12:56 PM, David Green wrote:
> On 2010-07-30, at 4:57 pm, Aaron Sherman wrote:
>> given False { when True { say "True" } when False { Say "False" } default { say "Dairy" } }
>> I don't think it's unreasonable to expect the output to be "False".
>> However, it actually outputs "True". Why? Well, because it's in the spec
>> that way. So... why is it in the spec that way?
>
> Well, if you want to do a boolean test, you'd probably use "if" instead;

I'm sorry, I didn't know I'd sent this message to the python-language list ;-) Seriously though, I don't want "there's more than one way to do it" to be an excuse for redundancy, but on the other hand, it's rather odd for a rationale in Perl to be "there's already a way to do that" rather than the intuitiveness of the feature, even when it's only situational. It's also quite useful for testing truth as a fallback:

    given $_ {
        when $motor-oil { say "mmm syrupy!" }
        when $methane   { say "dangerous!" }
        when $jet-fuel  { say "I can haz skaiz?" }
        when /crude/    { say "refine it first" }
        when False      { say "Sorry, no petroleum products" }
        when True       { say "Unknown petroleum products" }
        default         { say "Unpossible!" }
    }

> but something that already gives you a Bool, like "when time > $limit", is
> likely to be the result you want to test itself rather than comparing it
> against $_ (which is likely not to be a Bool).

My problem with that is that it's a really odd use of given/when, and given the implicit smart-match, it doesn't make much sense. Now, to slightly backtrack, I do agree that there should be at least one way to do something, and if that were the only way to perform independent tests within a given, I'd agree. Thankfully, it's not:

    given $_ {
        when /clock/     { say "clock" }
        if time > $limit { say "tick" }
        default          { say "tock" }
    }

And the really nice thing about that usage is that you immediately see that we're not testing time with respect to $_, but with respect to $limit.
If you use when, that's left rather confusingly ambiguous unless you know that boolean values are a special case.

> So Perl is trying to be helpful by doing something useful instead of making
> the useful thing much harder at the expense of something that isn't useful
> anyway.

Well, since it's easy to do both, as demonstrated above, I think we can agree that we've satisfied the first rule. But given isn't very interesting compared to smart matching in general, and that's where:

    $foo ~~ True

really does seem to me a very intuitive question, and that question isn't "is the RHS true?"

> The catch is that I think that comparing against a boolean IS useful. The
> fact that this question keeps coming up, even on the p6l list, seems to
> demonstrate that the "helpful" way isn't completely natural or obvious (at
> least, not to everyone).

Agreed.

-- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: Suggested magic for "a" .. "b"
On Fri, Jul 30, 2010 at 6:45 PM, Doug McNutt wrote: > Please pardon intrusion by a novice who is anything but object oriented. No problem. Sometimes a fresh perspective helps to illuminate things. Skipping ahead... > Are you guise sure that the "..." and ".." operators in perl 6 shouldn't make > use of regular expression syntax while deciding just what is intended by the > programmer? You kind of blew my mind, there. I tried to respond twice and each time I determined that there was a way around what I was about to call crazy. In the end, I'm now questioning the difference between a junction and a Range... which is not where I thought this would go. Good question, though I should point out that you could never reasonably listify a range constructed from a regex because "reversing" a regex like that immediately runs into some awful edge cases. Still, interesting stuff. -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Smart match isn't on Bool
In this code:

    given False { when True { say "True" } when False { Say "False" } default { say "Dairy" } }

I don't think it's unreasonable to expect the output to be "False". However, it actually outputs "True". Why? Well, because it's in the spec that way. So... why is it in the spec that way? I gave it some thought, and I can't imagine a reason that smart-matching against True and False wouldn't be like smart-matching against any other value type. That is, it should simply return $_ === True or $_ === False. Instead, a smart match always returns its right hand side argument when that argument is Bool. I'd suggest rewriting the S03 table like so:

    $_      X       Type of Match Implied     Match if (given $_)
    ===     ===     ===                       ===
    Any     Bool    Test for boolean state    ?$_ === ?X

and removing the first two lines (the True and False entries, which say we should parse-warn).

While I'm on the topic... Why does "but" affect "===" on high-level types which define a WHICH method? (For those unaware, === compares value types, but when you use it on high-level, user-defined types, it invokes .WHICH on its LHS and RHS and compares the results recursively.)

    $ ./perl6 -e 'class R{ method WHICH() { 1 } } ; say R.new() === R.new()'
    1
    $ ./perl6 -e 'class R{ method WHICH() { 1 } } ; say R.new() === R.new() but False'
    0

It doesn't seem to change the WHICH method:

    $ ./perl6 -e 'class R{ method WHICH() { 1 } } ; say (R.new but False).WHICH'
    1

So what *is* changing?

Oh, and PPS: a kind of Rakudo bug:

    $ ./perl6 -e 'class R{ method WHICH() { self } } ; say R.new() === R.new()'
    Segmentation fault

Clearly this is infinitely recursive, but one imagines it would be easy enough to put a maximum recursion depth on ===. I was about to say that === should check to see if X.WHICH eqv X, but I think that would slow things down too much. Setting a max recursion depth, on the other hand, would be simple and fast.

-- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
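The depth-cap idea is easy to sketch. Here is an illustrative Python model (the method name mimics Perl 6's WHICH; the cap value and the fall-back behavior are assumptions, not spec): recursive value identity that compares WHICH results, but raises a clean error instead of recursing forever when WHICH keeps producing fresh objects.

```python
MAX_DEPTH = 100  # assumed cap; any small bound prevents the runaway case

def value_identical(a, b, depth=0):
    """Model of ===: compare WHICH results recursively, with a depth cap."""
    if depth > MAX_DEPTH:
        raise RecursionError("=== exceeded max WHICH recursion depth")
    wa = a.WHICH() if hasattr(a, "WHICH") else a
    wb = b.WHICH() if hasattr(b, "WHICH") else b
    if wa is a and wb is b:
        # WHICH is absent or a fixed point: compare the values directly.
        return a == b
    return value_identical(wa, wb, depth + 1)

class R:
    def WHICH(self):
        return 1            # all instances share one identity value

class Loop:
    def WHICH(self):
        return Loop()       # pathological: never reaches a fixed point

print(value_identical(R(), R()))  # True
try:
    value_identical(Loop(), Loop())
except RecursionError as e:
    print("caught:", e)     # a clean error, not a segfault
```

The Loop case is the Python analogue of `method WHICH() { self }`: the comparison terminates with a diagnostic rather than crashing the runtime.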
Re: Array membership test?
I may be misunderstanding something. I haven't really looked into list searching much, but there seem to be some very odd and unexpected results here.

On Thu, Jul 29, 2010 at 7:52 PM, Jonathan Worthington wrote:
>> my @x = 1,2,3; say ?@x.grep(2); say ?@x.grep(4);
> 1
> 0
>
> Though more efficient would be:
>
>> my @x = 1,2,3; say ?@x.first(2); say ?@x.first(4);
> 1
> 0

The above only works if the value that you are searching for does something sane when it's the RHS of a smart-match. Try searching for False this way and you'll be sad, just for example (PS: I think the behavior of False ~~ False is unsmart). Actually, it looks like === is the right way to do this for value types like 2 and True, but right now Rakudo doesn't do the right thing:

    my @x = 1,2,3,False;
    say ?@x.first: * === 2;
    say ?@x.first: * === False;
    say ?@x.first: * === True;
    1
    0
    1

I think it's just smart-matching, which is definitely not correct (False === False does do the right thing when you execute it on its own, though). If you really want odd, try:

    say [1,2,3].first: * === True;

Result: 1

and

    say [5,2,3].first: * === True;

Result: Rakudo exits silently with no newline

So, the right way to search for value types in a list... is highly questionable right now. ;-)

-- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: Suggested magic for "a" .. "b"
n exception when you try to tell Perl that " TOPIXコンポジット1500構成銘柄" is a Japanese string... but then Perl is rejecting strings that are considered valid in some contexts within that language. My only strongly held belief, here, is that you should not try to answer any of these questions for the default range operator on unadorned, context-less strings. For that case, you must do something that makes sense for all Unicode codepoints in nearly all contexts. -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: Suggested magic for "a" .. "b"
On Wed, Jul 28, 2010 at 6:24 PM, Dave Whipp wrote:
> Aaron Sherman wrote:
>> On Wed, Jul 28, 2010 at 11:34 AM, Dave Whipp wrote:
>>> To squint at this slightly, in the context that we already have 0...1e10
>>> as a sequence generator, perhaps the semantics of iterating a range
>>> should be unordered -- that is,
>>>
>>> for 0..10 -> $x { ... }
>>>
>>> is treated as
>>>
>>> for (0...10).pick(*) -> $x { ... }
>>
>> As others have pointed out, this has some problems. You can't implement
>> 0..* that way, just for starters.
>
> I'd say that's a point in my favor: it demonstrates that integers and
> strings have similar problems. If you pick items from an infinite set then
> every item you pick will have an infinite number of digits/characters.

So, if I understand you correctly, you're happy about the fact that iterating over an explicitly lazy range would immediately result in failure? Sorry, not following.

> In smart-match context, "a".."b" includes "aardvark".

No one has yet explained to me why that makes sense. The continued use of ASCII examples, of course, doesn't help. Does "a" .. "b" include "æther"? This is where Germans and Swedes, for example, don't agree, but they're all using the same Latin code blocks. I don't think you can reasonably bring locale into this. I think it needs to be purely a codepoint-oriented operator. If you bring locale into it, then the argument for not including composing and modifying characters goes out the window, and you're stuck in what I believe Dante called "the Unicode circle." If you treat this as a codepoint-based operator then you get a very simple result: "a".."b" is the range between the codepoint for "a" and the codepoint for "b". "aa" .. "bb" is the range between a sequence of two codepoints and a sequence of two other codepoints, which you can define in a number of ways (we've discussed a few, here) that don't involve having to expand the sequences to three or more codepoints.
I've never accepted that the range between two strings of identical length should include strings of another length. That seems maximally non-intuitive (well, I suppose you could always return the last 100 words of Hamlet as an iterable IO object if you really wanted to confuse people), and makes string and integer ranges far too divergent.

>>> Then the whole question of reversibility is moot.
>>
>> Really? I don't think it is. In fact, you've simply made the problem pop
>> up everywhere, and guaranteed that .. must behave totally unlike any
>> other iterator.
>
> %hash.keys has similarly unordered semantics.

Unordered semantics and shuffled values aren't the same thing. The reason that hash keys are unordered is that we cannot guarantee that any given implementation will store entries in any given relation to the input. Ranges have a well-defined ordering associated with the elements that fall within the range, by virtue of the basic definition of a range (LHS <= * <= RHS). Hashes have no ordering associated with their keys (though one can be imposed, e.g. by sort).

> Therefore %hash.keys.reverse is, for most purposes, equivalent to
> %hash.keys.

Argh! No, that's entirely untrue. %hash.keys and %hash.keys.reverse had better be the same elements, but reversed, for all hashes which remain unmodified between the first and second call.

-- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
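To make the codepoint-only, fixed-length reading concrete, here is a hedged Python sketch (the function name and the same-length rule are my rendering of the proposal, not anything specified for Perl 6): strings compare as tuples of codepoints, and membership is only defined between endpoints of matching length.

```python
def in_codepoint_range(s, lo, hi):
    """Is s within lo..hi, treating strings as fixed-length codepoint tuples?"""
    # Per the proposal: strings whose length differs from the endpoints'
    # are simply never in the range -- no "aardvark" in "a".."b".
    if not (len(s) == len(lo) == len(hi)):
        return False
    as_points = lambda t: tuple(ord(c) for c in t)
    return as_points(lo) <= as_points(s) <= as_points(hi)

print(in_codepoint_range("ab", "aa", "bb"))      # True: between the two tuples
print(in_codepoint_range("aardvark", "a", "b"))  # False: lengths differ
print(in_codepoint_range("æ", "a", "b"))         # False: U+00E6 > U+0062
```

Note that the "æther" question answers itself under this rule: it is locale-independent, because only raw codepoint order is consulted.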
Re: Suggested magic for "a" .. "b"
On Wed, Jul 28, 2010 at 6:24 PM, Dave Whipp wrote: > Aaron Sherman wrote: > >> On Wed, Jul 28, 2010 at 11:34 AM, Dave Whipp >> wrote: >> >> To squint at this slightly, in the context that we already have 0...1e10 >>> as >>> a sequence generator, perhaps the semantics of iterating a range should >>> be >>> unordered -- that is, >>> >>> for 0..10 -> $x { ... } >>> >>> is treated as >>> >>> for (0...10).pick(*) -> $x { ... } >>> >>> >> As others have pointed out, this has some problems. You can't implement >> 0..* >> that way, just for starters. >> > > I'd say that' a point in may favor: it demonstrates the integers and > strings have similar problems. If you pick items from an infinite set then > every item you pick will have an infinite number of digits/characters. > So, if I understand you correctly, you're happy about the fact that iterating over and explicitly lazy range would immediately result in failure? Sorry, not following. > > In smart-match context, "a".."b" includes "aardvark". No one has yet explained to me why that makes sense. The continued use of ASCII examples, of course, doesn't help. Does "a" .. "b" include "æther"? This is where Germans and Swedes, for example, don't agree, but they're all using the same Latin code blocks. I don't think you can reasonably bring locale into this. I think it needs to be purely a codepoint-oriented operator. If you bring locale into it, then the argument for not including composing an modifying characters goes out the window, and you're stuck in what I believe Dante called "the Unicode circle." If you treat this as a codepoint-based operator then you get a very simple result: "a".."b" is the range between the codepoint for "a" and the codepoint for "b". "aa" .. "bb" is the range between a sequence of two codepoints and a sequence of two other code points, which you can define in a number of ways (we've discussed a few, here) which don't involve having to expand the sequences to three or more codepoints. 
I've never accepted that the range between two strings of identical length should include strings of another length. That seems maximally non-intuitive (well, I suppose you could always return the last 100 words of Hamlet as an iterable IO object if you really wanted to confuse people), and makes string and integer ranges far too divergent.

>>> Then the whole question of reversibility is moot.
>>
>> Really? I don't think it is. In fact, you've simply made the problem pop
>> up everywhere, and guaranteed that .. must behave totally unlike any
>> other iterator.
>
> %hash.keys has similarly unordered semantics.

Unordered semantics and shuffled values aren't the same thing. The reason that hash keys are unordered is that we cannot guarantee that any given implementation will store entries in any given relation to the input. Ranges have a well-defined ordering associated with the elements that fall within the range by virtue of the basic definition of a range (LHS <= * <= RHS). Hashes have no ordering associated with their keys (though one can be imposed, e.g. by sort).

> Therefore %hash.keys.reverse is, for most purposes, equivalent to
> %hash.keys.

Argh! No, that's entirely untrue. %hash.keys and %hash.keys.reverse had better be the same elements, but reversed, for all hashes which remain unmodified between the first and second call.

-- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
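Aaron's codepoint-based reading is concrete enough to sketch. Below is a rough Python version (Python only so it runs anywhere; the name in_range is mine, not from any spec): it treats membership in "aa" .. "bb" as a comparison between equal-length codepoint sequences, so a string of a different length, "aardvark" included, can never be a member. Note that this particular definition still admits "az"; the stricter per-position reading is discussed elsewhere in the thread.

```python
def in_range(s, lo, hi):
    """One possible codepoint-based reading of lo .. hi for strings:
    membership requires the same length as the endpoints, and the
    candidate must sort between them by raw codepoint order (Python
    compares str exactly that way, codepoint by codepoint)."""
    if len(lo) != len(hi):
        raise ValueError("endpoints must have equal length")
    if len(s) != len(lo):
        return False      # length never changes: "aardvark" is out
    return lo <= s <= hi

print(in_range("aardvark", "a", "b"))  # False
print(in_range("ab", "aa", "bb"))      # True
```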
Re: Suggested magic for "a" .. "b"
On Wed, Jul 28, 2010 at 11:34 AM, Dave Whipp wrote: > To squint at this slightly, in the context that we already have 0...1e10 as > a sequence generator, perhaps the semantics of iterating a range should be > unordered -- that is, > > for 0..10 -> $x { ... } > > is treated as > > for (0...10).pick(*) -> $x { ... } > As others have pointed out, this has some problems. You can't implement 0..* that way, just for starters. > Then the whole question of reversibility is moot. Really? I don't think it is. In fact, you've simply made the problem pop up everywhere, and guaranteed that .. must behave totally unlike any other iterator. Getting back to 10..0... The complexity of implementation argument doesn't really hold for me, as: (a..b).list = a>b ?? a,*.pred ... b !! a,*.succ ... b Is pretty darned simple and does not require that b implement anything more than it does under the current implementation. a, on the other hand, now has to (optionally, since throwing an exception is the alternative) implement one more method. The more I look at this, the more I think ".." and "..." are reversed. ".." has a very specific and narrow usage (comparing ranges) and "..." is probably going to be the most broadly used operator in the language outside of quotes, commas and the basic, C-derived math and logic ops. Many (most?) loops will involve "...". Most array initializers will involve "...". Why are we not calling that ".."? Just because we defined ".." first, and it grandfathered its way in the door? Because it resembles the math op? These don't seem like good reasons. -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
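The one-liner in that email is easy to check mechanically. Here is a minimal Python sketch of the same idea, specialized to integers (the name range_list is mine): choose the step from the endpoint order, so listing 10..0 costs no more than listing 0..10.

```python
def range_list(a, b):
    # (a..b).list = a > b ?? a, *.pred ... b !! a, *.succ ... b,
    # restated for integers: count down when the left endpoint
    # exceeds the right, up otherwise.
    step = -1 if a > b else 1
    out, x = [], a
    while True:
        out.append(x)
        if x == b:
            return out
        x += step

print(range_list(10, 0))  # [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
print(range_list(0, 3))   # [0, 1, 2, 3]
```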
Re: Suggested magic for "a" .. "b"
Sorry I haven't responded for so long... much going on in my world. On Mon, Jul 26, 2010 at 11:35 AM, Nicholas Clark wrote: > On Tue, Jul 20, 2010 at 07:31:14PM -0400, Aaron Sherman wrote: > > > 2) We deny that a range whose LHS is "larger" than its RHS makes sense, > but > > we also don't provide an easy way to construct such ranges lazily > otherwise. > > This would be annoying only, but then we have declared that ranges are > the > > right way to construct basic loops (e.g. for (1..1e10).reverse -> $i > {...} > > which is not lazy (blows up your machine) and feels awfully clunky next > to > > for 1e10..1 -> $i {...} which would not blow up your machine, or even > make > > it break a sweat, if it worked) > > There is no reason why for (1..1e10).reverse -> $i {...} should *not* be > lazy. > > As a special case, perhaps you can treat ranges as special and not as simple iterators. To be honest, I wasn't thinking about the possibility of such special cases, but about iterators in general. You can't generically reverse lazy constructs without running afoul of the halting problem, which I invite you to solve at your leisure ;-) For example, let's just tie it to integer factorization to make it really obvious: # Generator for ranges of sequential, composite integers sub composites(Int $start) { gather do { for $start .. * -> $i { last if isprime($i); take $i; } } } for composites(10116471302318).reverse -> $i { say $i } The first value should be 10116471302380, but computing that without iterating through the list from start to finish would require knowing that none of the integers between 10116471302318 and 10116471302380, inclusive, are prime. Of course, the same problem exists for any iterator where the end condition or steps can't be easily pre-computed, but this makes it more obvious than most. That means that Range.reverse has to do something special that iterators in general can't be relied on to do. Does that introduce problems? Not big ones. 
I can definitely see people who are used to "for ($a .. $b).reverse -> ..." getting confused when "for @blah.reverse -> ..." blows up their machine, but avoiding that confusion might not be practical.

PS: On a really abstract note, requiring that ($a .. $b).reverse be lazy will put new constraints on the right-hand-side parameter. Previously, it didn't have to have a value of its own; it just had to be comparable to other values. For example:

for $a .. $b -> $c { ... }

In that, we don't include the RHS in the output range explicitly. Instead, we increment $a (via .succ) until it's >= $b. If $a were 1 and $b were an object that "does Int" but just implements the comparison features, and has no fixed numeric value, then it should still work (e.g. it could be random). Now that's not possible, because we need to use the RHS as the starting point when .reverse is invoked. I have no idea if that matters, but it's important to be aware of when and where we constrain the interface rather than discovering it later.

-- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
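The halting-problem point is easy to see in miniature. Here is a rough Python analogue of the composites example (small numbers so it terminates quickly; isprime is a naive trial-division check written for this sketch, not a library call): the only way to produce the last element of the lazy stream is to walk the whole stream first.

```python
def isprime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def composites(start):
    # Lazy stream of consecutive composites, stopping at the first
    # prime -- a Python analogue of the gather/take loop above.
    i = start
    while not isprime(i):
        yield i
        i += 1

# reversed() cannot accept a generator at all; the stream has to be
# fully realized before its last element is known.
print(list(reversed(list(composites(8)))))  # [10, 9, 8]
```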
Re: series operator issues
On Thu, Jul 22, 2010 at 4:52 PM, Jon Lang wrote: > I do have to admit that that's awfully clean-looking, but the > implementation > > would force a closure in a series to behave differently from a closure > > anywhere else. > > How so? > Unlike some of you, I haven't managed to memorize all of the synopses. For poor dolts like me who have only read some sections once, it would be nice if you could clarify the more obscure syntax ;-) > > Without changing closure definitions and without extending the syntax > any, > > you could make the series operator do a little bit more introspection > work > > and if a parameter is named "index", track an index value and pass it by > > name, passing any remaining parameters positionally from the previous n > > values as normal. > > ...which differs from my example in several ways, all of which are > detrimental: it puts the index in among the positional parameters, > No, that's not true. Sure, if you used the syntax I used, then it's allowed to be passed either way, but since the series operator will always pass it by name, the only positionals are the remaining parameters (I did test this out before sending my mail, just to verify that $^x could be passed as :x<...> or as a positional). More importantly the syntax you used works just as well, and as far as ... is concerned, there's no substantial difference. So... what are you suggesting? That any named-only parameter is passed the index? Or that a named-only parameter called "i" is passed the index? If the latter, then you're suggesting the same thing as I am, but with a different name (I prefer the longer name, given the restrictions it places on the closure). If you're suggesting that this apply to any named-only parameter, I don't think that's a good idea. That's even MORE restrictive than what I suggested (remember, it's usually going to be a closure, defined right there, but even the Synopsis gives one example of using an existing subroutine). 
> meaning that odd things would have to happen if you ever decide to > use, say, $^i and $^j as your "prior items" parameters; Why? > and it locks > you into a specific name for the index instead of letting you choose > one of your own liking. > Well, true, but you have to have some convention, and a name is a common way to establish such conventions in a parameter-passing API... Essentially, my suggestion is this: if the step function's signature > (or implied signature, in the case of a function with placeholder > variables) includes any named parameters, You meant "named only" > then the index is used as > the argument corresponding to the first one. Named only... first... these terms are non-miscible, aren't they? I don't think named-only parameters have an ordering. -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: series operator issues
On Thu, Jul 22, 2010 at 1:13 PM, Jon Lang wrote: > > I also think it's doable without a special tool: > > > > 0, { state $i = 1; $^a + $i++ } ... * > > Kludgey; but possibly doable. > Well, it's kind of what state is there for. > > But what I'd really like to see would be for the index to be passed > into the step function via a named parameter. Of course, you say "the index" as if there is such a thing. In reality, there's no reason for the series operator to keep an index unless it's explicitly being indexed (e.g. by postcircumfix:<[]>) > Yes, it would be a > special tool; but it would be much more in keeping with the "keep > simple things easy" philosophy that Perl 6 tends to promote: > >0, { $^a + $:i } ... * # series of triangle numbers >0, { $^a + (2 * $:i - 1) } ... * # series of square numbers >{ $:i ** 2 } ... * # series of square numbers >1, { $^a * $:i } ... * # series of factorials I do have to admit that that's awfully clean-looking, but the implementation would force a closure in a series to behave differently from a closure anywhere else. Without changing closure definitions and without extending the syntax any, you could make the series operator do a little bit more introspection work and if a parameter is named "index", track an index value and pass it by name, passing any remaining parameters positionally from the previous n values as normal. That makes your examples: 0, { $^a + $^index } ... * 0, { $^a + (2 * $^index - 1) } ... * { $^index ** 2 } ... * 1, { $^a * $^index } ... * Not changing the syntax of closures seems like a reasonable goal at this late stage. -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
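The introspection idea can be sketched outside Perl 6 as well. Below is a rough Python version (the names series, step, and index are mine; Python's inspect module stands in for Perl 6 signature introspection): if the step function declares a parameter named index, the series operator passes the 1-based element index by keyword alongside the previous value.

```python
import inspect
from itertools import islice

def series(*args):
    # Sketch only: assumes at least one seed value and a step
    # function of one previous value. If the step declares a
    # parameter named 'index', the 1-based element index is
    # passed by keyword.
    *seed, step = args
    yield from seed
    wants_index = 'index' in inspect.signature(step).parameters
    prev, i = seed[-1], len(seed)
    while True:
        prev = step(prev, index=i) if wants_index else step(prev)
        i += 1
        yield prev

triangles  = series(0, lambda a, index: a + index)
factorials = series(1, lambda a, index: a * index)
print(list(islice(triangles, 5)))   # [0, 1, 3, 6, 10]
print(list(islice(factorials, 5)))  # [1, 1, 2, 6, 24]
```

A step with no index parameter still works unchanged, e.g. series(1, lambda a: a * 2) yields powers of two.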
Re: series operator issues
On Thu, Jul 22, 2010 at 11:41 AM, Moritz Lenz wrote: > > The difficulty you're running into is that you're trying to use the wrong > tool for the job. Just don't use the series operator when it's not easy to > use. Perl 6 has other mechanism too, which are better suited for these > particular problems. > In general, I'd agree. However, there is something to be said for the underlying question: is there a way to get at the iteration index from the lambda in a series? It seems like that's something that it's not unreasonable to want. I also think it's doable without a special tool: 0, { state $i = 1; $^a + $i++ } ... * That should work, no? Granted, state doesn't seem to work in Rakudo, unless I'm mis-understanding how to use it, but that's the idea. -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
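For readers unfamiliar with state, here is a rough Python equivalent of that closure (make_step is my name): the counter lives in the enclosing scope, so it persists across calls exactly as a state variable would.

```python
from itertools import islice

def make_step():
    # Python stand-in for { state $i = 1; $^a + $i++ }: the
    # counter persists in the closure between calls.
    i = 1
    def step(a):
        nonlocal i
        result = a + i
        i += 1
        return result
    return step

def series(seed, step):
    # the ... operator, reduced to its simplest one-seed form
    while True:
        yield seed
        seed = step(seed)

print(list(islice(series(0, make_step()), 5)))  # [0, 1, 3, 6, 10]
```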
Re: Suggested magic for "a" .. "b"
On Wed, Jul 21, 2010 at 9:46 AM, Aaron Crane wrote:

>> I think that "Ā" .. "Ē" should produce ĀĂĄĆĈĊČĎĐĒ
>
> If that's in the hope of producing a more "intuitive" result, then why
> not ĀB̄C̄D̄Ē?
>
> That's only partly serious. I'm acutely aware that choosing a baroque
> set of rules makes life harder for both implementers and users (and,
> in particular, risks ending up with an operator that has no practical
> non-trivial use cases).

Well... actually, I got to thinking (which is not my natural state) and I think we need two approaches. I don't know if they're two operators, a pragma or what, but there are definitely two things people want:

- "x".succ_uni yields "x".ord incremented until the resulting codepoint "agrees" with "x". By agrees, I mean that it shares the same script and general category properties (major/minor). This is an important tool because it's universal.

- "x".succ_loc yields the next character after "x" in the current locale. What convinced me that this is a peer to the above was when I thought about Japanese, where only a subset of the CJK ideographs are valid Japanese. You really need an index and collation for these that is outside of the basic Unicode properties.

So yes, if there's a locale in which ĀB̄C̄D̄Ē is the correct ordering, then I do think that there should be some "Ā" .. "Ē" equivalent that yields the above in that context. But I'm not convinced it should be the default.

> I note also that the A-macron and E-macron are in NFC. I think that,
> certainly by default, the difference between NFC and NFD should be
> hidden from users. That implies that, however "Ā" .. "Ē" behaves, the
> NFD version should behave identically; and that "B̄" .. "F̄" should
> behave in the most equivalent way possible.

As I've said previously, I'm only discussing single "characters" which I'm defining as single codepoints which are neither combining nor modifying.
If you like, we can have the conversation about what you do when you encounter combining and modifying codepoints, and I do think I agree with you largely, but I'd like to hold that for now. It's just too much of a rat-hole at this point. -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
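The succ_uni idea can be prototyped with nothing more than the standard Unicode database. A rough Python sketch (succ_uni is my name for it): advance to the next codepoint whose General Category matches the input's. The proposal also matches the Script property, which is omitted here only because Python's stdlib unicodedata module does not expose Script.

```python
import unicodedata

def succ_uni(ch):
    # Rough sketch of the proposed .succ_uni: advance to the next
    # codepoint whose General Category matches the input's.
    # (Script matching from the proposal is omitted; the stdlib
    # cannot express it.)
    cat = unicodedata.category(ch)
    cp = ord(ch)
    while cp < 0x10FFFF:
        cp += 1
        if unicodedata.category(chr(cp)) == cat:
            return chr(cp)
    raise ValueError("no successor")

print(succ_uni("a"))       # 'b'
print(succ_uni("\u0100"))  # '\u0102' (Ā -> Ă, skipping ā, which is Ll)
```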
multi-character ranges
[changing the subject because it's now clear we have two different discussions on our hands. I think we're at or closing in on a consensus for "a" .. "z", and this discussion is "aa" .. "bb"]

On Wed, Jul 21, 2010 at 1:56 AM, Darren Duncan wrote:

> Aaron Sherman wrote:
>> 2) The spec doesn't put this information anywhere near the definition of
>> the range operator. Perhaps we can make a note? This was a source of
>> confusion for me.
>
> My impression is that a "Range" primarily defines an "interval" in terms
> of 2 endpoint values such that it defines a possibly infinite set of
> values between those endpoints.

I don't think that has much to do with the fact that it was quite reasonable for me to look to the definition of ".." in S03 for what the range between two characters contains.

>> 3) It seems that there are two competing multi-character approaches and
>> both seem somewhat valid. Should we use a pragma to toggle behavior
>> between A and B:
>>
>> A: "aa" .. "bb" contains "az"
>> B: "aa" .. "bb" contains ONLY "aa", "ab", "ba" and "bb"
>
> I would find A to be the only reasonable answer.

[Before I respond, let's agree that, below, I'm going to say things like "generates" when talking about "..". What I'm describing is the idea that a value exists in the range given, not that a range is actually a list.]

I would find B to be the only reasonable answer, but enough people seem to think the other way that I understand there's a valid need to be able to get both behaviors.

> If you want B's semantics then use "..." instead; ".." should not be
> overloaded for that.

I wasn't really distinguishing between ".." and "..." as I'm pretty sure they should have the same behavior, here. The case where I'm not sure they should have the same behavior is "apple" .. "orange". Frankly, I think that there's no right solution there.
There's the one I proposed in my original message (treat each character index as a distinct sequence and then increment in a base defined by all of the sequences), but even I don't like that. To generate all possible strings of length 5+ that sort between those two is another suggestion, but then what do you expect "father-in-law" .. "orange" to do? Punctuation throws a whole new dimension in there, and I'm immediately lost. When you go to my Japanese example from many messages ago, which I got from a fairly typical Web site and contained 2 Scripts with 4 different General Categories, I begin to need pharmaceuticals. I don't see any value in having different rules for what .. and ... generate in these cases, however. (frankly, I'm still on the fence about ... for single endpoints, which I think should just devolve to .. (... with a list for LHS is another animal, of course)) > If there were to be any similar pragma, then it should control matters like > "collation", or what nationality/etc-specific subtype of Str the 'aa' and > 'bb' are blessed into on definition, so that their collation/sorting/etc > rules can be applied when figuring out if a particular $foo~~$bar..$baz is > TRUE or not. > For inclusion (e.g. does "aa" .. "zz" generate "cliché") see the single-character range discussion, which has already touched on locale issues. -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
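Interpretation B is mechanical enough to sketch directly. In Python (the function names are mine), it is just a cross product of per-position single-character ranges:

```python
from itertools import product

def char_range(a, b):
    # naive single-character range by codepoint; the thread's
    # script/category refinements would slot in here
    return [chr(c) for c in range(ord(a), ord(b) + 1)]

def str_range_b(lo, hi):
    # Interpretation B: each character position ranges
    # independently, and the result is the cross product of
    # those per-position ranges.
    assert len(lo) == len(hi), "B is only defined for equal lengths"
    per_pos = [char_range(a, b) for a, b in zip(lo, hi)]
    return [''.join(t) for t in product(*per_pos)]

print(str_range_b("aa", "bb"))  # ['aa', 'ab', 'ba', 'bb'] -- no "az"
print(str_range_b("Ab", "Be"))  # ['Ab', 'Ac', 'Ad', 'Ae', 'Bb', 'Bc', 'Bd', 'Be']
```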
Re: Suggested magic for "a" .. "b"
On Wed, Jul 21, 2010 at 1:28 AM, Aaron Sherman wrote: > > For reference, this is the relevant section of the spec: > > Character positions are incremented within their natural range for any > Unicode range that is deemed to represent the digits 0..9 or that is deemed > to be a complete cyclical alphabet for (one case of) a (Unicode) script. > Only scripts that represent their alphabet in codepoints that form a cycle > independent of other alphabets may be so used. (This specification defers to > the users of such a script for determining the proper cycle of letters.) We > arbitrarily define the ASCII alphabet not to intersect with other scripts > that make use of characters in that range, but alphabets that intersperse > ASCII letters are not allowed. > > > I'm not sure that all of that tracks with the Unicode standard's use of > some of the terms, but based on what we've discussed, perhaps we could get > more specific there: > > Character positions are incremented within their Unicode Script, but only > in keeping with their General Category property. Thus C<"A"++> yields C<"B"> > which is the next codepoint, but C<"Ă"++> yields C<"Ą"> even though "ą" > falls between the two, when incrementing codepoints. Should this prove > problematic for any specific Unicode Script which requires special handling > (e.g. because a "letter" really isn't used as a letter at all), such special > handling may be applied, but the above is the general rule. > > Oh, so close! I realized that I broke the original spec, here. We need to add back in: There are two special cases: the ASCII-compatible lower-case letters (a-z) and the ASCII-compatible upper-case letters (A-Z). For historical reasons, these, by default, will not increment past the end of their ranges into the higher-codepoint Latin characters. Note: we might want a pragma for that as well. I'd suggest that perhaps it should be a locale-specific feature? 
So, if you set your locale to fr, then you include in those ranges all of the Latin characters used in French. -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: Suggested magic for "a" .. "b"
OK, there's a lot here and my head is swimming, so let me re-consolidate and re-state (BTW: thanks Jon, you've really helped me understand, here). 1) The spec is somewhat vague, but the proposal that I made for single characters is not an unreasonable interpretation of what's there. Thus, we could adopt the script/major cat/minor cat triplet as the core tool that .succ will use for single, non-combining, non-modifying, valid characters? 2) The spec doesn't put this information anywhere near the definition of the range operator. Perhaps we can make a note? This was a source of confusion for me. 3) It seems that there are two competing multi-character approaches and both seem somewhat valid. Should we use a pragma to toggle behavior between A and B: A: "aa" .. "bb" contains "az" B: "aa" .. "bb" contains ONLY "aa", "ab", "ba" and "bb" 4) About the ranges I gave as examples, you asked: "Which codepoint is invalid, and why?" There's just an undefined codepoint smack in the middle of the Greek uppercase letters (U+03A2). I'm sure the Unicode specs have a rationale for that somewhere, but my guess is that there's some thousand-year-old debate about the Greek alphabet behind it. "In both of these cases, what do you think it should produce?" I actually gave that answer a bit later on. I think that "Ā" .. "Ē" should produce ĀĂĄĆĈĊČĎĐĒ and オ .. ヺ should produce オカガキギクグケゲコゴサザシジスズセゼソゾタダチヂツヅテデトドナニヌネノハバパヒビピフブプヘベペホボポマミムメモヤユヨラリルレロワヰヱヲンヴヷヸヹヺ which are all of the Katakana syllabic characters. "I also have to wonder how or if "0" ... "z" ought to be resolved. If you're thinking in terms of the alphabet or digits, this is nonsensical" Well, since you agreed with my statement about the properties checking, it would be 0 through 9 and then a through z because 0 through 9 are Latin numbers, matching the LHS's properties and a through z are lowercase Latin letters, matching the RHS's properties. 
For reference, this is the relevant section of the spec:

Character positions are incremented within their natural range for any Unicode range that is deemed to represent the digits 0..9 or that is deemed to be a complete cyclical alphabet for (one case of) a (Unicode) script. Only scripts that represent their alphabet in codepoints that form a cycle independent of other alphabets may be so used. (This specification defers to the users of such a script for determining the proper cycle of letters.) We arbitrarily define the ASCII alphabet not to intersect with other scripts that make use of characters in that range, but alphabets that intersperse ASCII letters are not allowed.

I'm not sure that all of that tracks with the Unicode standard's use of some of the terms, but based on what we've discussed, perhaps we could get more specific there:

Character positions are incremented within their Unicode Script, but only in keeping with their General Category property. Thus C<"A"++> yields C<"B">, which is the next codepoint, but C<"Ă"++> yields C<"Ą"> even though "ą" falls between the two, when incrementing codepoints. Should this prove problematic for any specific Unicode Script which requires special handling (e.g. because a "letter" really isn't used as a letter at all), such special handling may be applied, but the above is the general rule.

and then in the section on ranges:

As discussed previously, incrementing a character (which is to say, invoking C<.succ>) seeks the next codepoint with the same Unicode Script and General Category properties (major and minor category, to be specific). For ranges, succession is the same if .min and .max have the same properties, but if they do not, then all codepoints are considered which are greater than C<.min> and smaller than C<.max> and which agree with either the properties of C<.min> or the properties of C<.max>.
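That last rule for mixed-property ranges can be prototyped in a few lines. A rough Python sketch (range_chars is my name; as before, only General Category is matched, since the stdlib lacks the Script property): keep every codepoint between the endpoints whose category agrees with either endpoint.

```python
import unicodedata

def range_chars(lo, hi):
    # Keep codepoints from lo through hi whose General Category
    # matches either endpoint's category -- the "agree with the
    # properties of .min or .max" rule, minus the Script check,
    # which the Python stdlib cannot express.
    cats = {unicodedata.category(lo), unicodedata.category(hi)}
    return [chr(c) for c in range(ord(lo), ord(hi) + 1)
            if unicodedata.category(chr(c)) in cats]

# "0" is Nd and "z" is Ll, so "0" .. "z" yields the digits and the
# lowercase letters, skipping punctuation and uppercase in between.
print(''.join(range_chars('0', 'z')))
# 0123456789abcdefghijklmnopqrstuvwxyz
```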
Re: Suggested magic for "a" .. "b"
Side note: you could get around some of the problems, below, but in order to do so, you would have to exhaustively express all of Unicode using the Str builtin module's RANGES constant. In fact, as it is now, it defines ASCII lowercase, but doesn't define Latin lowercase, presumably because doing so would be a massive pain. Again, I'll point out that using script and properties is much easier.

On Tue, Jul 20, 2010 at 10:35 PM, Solomon Foster wrote:

> Sorry, didn't mean to imply the series operator was perfect. (Though
> it is surprisingly awesome in general, IMO.) Just that the right
> questions would be about the series operator rather than Ranges.

So, what's the intention of the range operator, then? Is it just there to offer backward compatibility with Perl 5? Is it a vestige that should be removed so that we can Huffman ... down to ..? I'm not trying to be difficult, here, I just never knew that ... could operate on a single item as LHS, and if it can, then .. seems to be obsolete and holding some prime operator real estate.

> The questions definitely look different that way: for example,
> ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz is easily and
> clearly expressed as
>
> 'A' ... 'Z', 'a' ... 'z' # don't think this works in Rakudo yet :(

I still contend that this is so frequently desirable that it should have a simpler form, but it's still going to have problems. One example: for expressing "Katakana letters" (I use "letters" in the Unicode sense, here) it's still dicey. There are things interspersed in the Unicode sequence for Katakana that aren't the same thing at all. Unicode calls them lowercase, but that's not quite right. They're smaller versions of Katakana characters which are used more as punctuation or accents than as syllabic glyphs the way the rest of Katakana is. I guess you could write:

ア, イ, ウ, エ, オ, カ ... ヂ, ツ ... モ, ヤ, ユ, ヨ ... ロ, ワ ... ヴ (add quotes to taste)

But that seems quite a bit more painful than:

ア .. ヴ (or ...
if you prefer) Similar problems exist for many scripts (including some of Latin, we're just used to the parts that are odd), though I think it's possible that Katakana may be the worst because of the mis-use of Ll to indicate a letter when the truth of the matter is far more complicated. > That suggests to me that the current behavior of 'A' ... 'z' is pretty > reasonable. > You still have to decide to make at least some allowances for invalid codepoints and I think you should avoid ever generating a combining or modifying codepoint in such a sequence (e.g. "Ѻ" ... "Ҋ" in Cyrillic which contains several combining characters for currency and counting as well as one undefined codepoint). -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: Suggested magic for "a" .. "b"
g convoluted, even that ugly example does something useful and, I dare say, intuitive for testing membership. Here's the pseudo-code for my suggestion:

class SingleCharAlphaRange {
    has $.start;
    has $.end;

    # Verify that this is a single character string which is valid
    # and non-combining/non-modifying and represented by
    # one and only one codepoint.
    method valid(Str $s --> Bool) {
        # Assert that this is a valid Unicode, 1-codepoint string which
        # is not a combining or modifying codepoint
    }

    # Is $s in this range?
    method in-range(Str $s --> Bool) {
        return fail() unless self.valid($s);                # "abc" ~~ "a" .. "z"
        return True if self.start eq $s or self.end eq $s;  # "a" ~~ "a" .. "z"
        return False if $s.ord < self.start.ord;            # "a" ~~ "b" .. "z"
        return False if $s.ord > self.end.ord;              # "z" ~~ "a" .. "y"
        my @props-a = self.props_start;   # get script and properties for $.start
        my @props-b = self.props_end;     # same, for $.end
        my @props-s = self.props($s);     # same, for $s
        if @props-a ~~ @props-s or @props-b ~~ @props-s {
            return True;                                    # "b" ~~ "a" .. "z"
        }
        return False;
    }

    ...

    method list() {
        gather do {
            for self.start.ord .. self.end.ord -> $i {
                take chr($i) if self.in-range(chr($i));
            }
        }
    }
}

> 2) otherwise, call .succ on the LHS. Stop before the generated values
> exceed the RHS.

Isn't that what you do when you try to listify a range? A range doesn't do any of that unless you try to walk it. What happens when you ask if a value is in the range "AA" .. "zz"? Do you iterate through every possible value and then return false if nothing matched?

> I'm not convinced it should be any more complicated than that.

If you have an idea that makes it simpler, I'm all ears. But I don't see anything that makes it simpler than my suggestion FOR THE USER. For us, it can be absurdly complex, but I would hope we could keep it simple for the user.

PS: Notice that 5..1 would have to be 5,4,3,2,1 for this proposal to really make sense, which I believe it needs to.
After all, currently (1..1e10).reverse will blow up, and there's no really good way around that. It would be much simpler to just be able to say "1e10 .. 1", and there's not really a reason I can think of not to that doesn't boil down to "people expect that to fail because of Perl 5."

PPS: Other unexpected results in Rakudo, all related to the behavior that Rakudo seems to have around ranges that it doesn't think are legitimate for ranges: it repeats the LHS infinitely:

"䷀" .. "䷿" - expected: all hexagram characters; got: first character, infinitely repeating.
"鐀" .. "鐅" - expected: all CJK Unified Ideographs between U+9400 and U+9405; got: first character, infinitely repeating.
"٠" .. "٩" - expected: all Arabic-Indic digits zero through nine; got: first digit (zero) repeating (note: bidi may confuse display in this email)
"א" .. "ת" - expected: all Hebrew letters; got: first character (א) repeating (note: bidi may confuse display in this email)
"Ａ" .. "Ｅ" - expected: all fullwidth capital letters A through E; got: fullwidth Ａ repeating.

-- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: Suggested magic for "a" .. "b"
On Fri, Jul 16, 2010 at 3:49 PM, Carl Mäsak wrote:

> Aaron (>):
>> [...]
>> Many useful results from this suggested change:
>>
>> "C" .. "A" = (Rakudo: <>)
>
> Regardless of the other traits of your proposed semantics, I think
> permitting reversed ranges such as the one above would be a mistake.

Why are you calling that a "reversed range"? It's not reversed, it's a range like any other. The ordering of the terminator elements is only interesting if you start pulling elements out. As a range, ordering isn't really significant.

> Rakudo gives the empty list for ranges whose lhs exceeds (fsvo
> "exceeds") its rhs, because that's the way ranges work in Perl. The
> reason ranges work that way in Perl (in my understanding) is that it's
> the less surprising behavior when the endpoints are determined at
> runtime.

In Perl 5, if that's what you mean, "C" .. "A" produces the letters from C to Z. I have no rational explanation for why, but I suggest we avoid emulating this behavior in Perl 6.

> For explicitly specifying a reverse list of characters, there's still
> `reverse "A" .. "C"`, which is not only a straightforward idiom and
> huffmanized about right, but also good documentation for the reader.

reverse("A" .. "C") is not the same as "C" .. "A". Observe:

$ ./perl6 -e 'say reverse("A" .. "C").perl'
["C", "B", "A"]
$ ./perl6 -e 'say ("A" .. "C").perl'
"A".."C"

In order for reverse to work lazily, it would have to add a wrapper to the iterator that asked for its last element first, and it's not clear to me that one CAN ask for an iterator's last element without unrolling it. For single characters, that's not TOO bad, but for strings with .elems > 1 you could blow out your RAM on even fairly trivial strings.

-- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: Suggested magic for "a" .. "b"
On Fri, Jul 16, 2010 at 9:40 PM, Michael Zedeler wrote: > > What started it all, was the intention to extend the operator, making it > possible to evaluate it in list context. Doing so has opened pandoras box, > because many (most? all?) solutions are inconsistent with the rule of least > surprise. > I don't think there's any coherent expectation, and therefore no potential to avoid surprise. Returning comic books might be more of a surprise, but as long as you're returning a string which appears to be "in the range" expressed, then I don't see surprise as the problem. > > For instance, when considering strings, writing up an expression like > > 'goat' ~~ 'cow' .. 'zebra' > > This makes sense in most cases, because goat is lexicographically between > cow and zebra. This presumes that we're treating a string as a "number" in base x (where x, I guess would be the number of code points which share ... what, any of the general category properties of the components of the input strings? That begins to get horrendously messy very, very fast: say "1aB" .. "aB1" > I'd suggest that if you want to evaluate a Range in list context, you may > have to provide a hint to the Range generator, telling it how to generate > subsequent values. Your suggestion that the expansion of 'Ab' .. 'Be' > should yield is just an example of a different > generator (you could call it a different implementation of ++ on Str types). > It does look useful, but by realizing that it probably is, we have two > candidates for how Ranges should evaluate in list context. > I think the solution here is to evaluate what's practical in the general case. Your examples are, I think misleading because they involve English words and we naturally leap to "sure, that one's in the dictionary between the other two." However, let me pose this dictionary lookup for you: "cliché" ~~ "aphorism" .. "truth" Now, you see where this is going? What happens when we throw in some punctuation? "father-in-law" ~~ "dad" .. 
"stranger" The problem is that you have a complex heuristic in mind for determining membership, and a very simple operator for expressing the set. Worse, I haven't even gotten into dealing with Unicode, where it's entirely reasonable to write "TOPIXコンポジット1500構成銘柄" which I shamelessly grabbed from a Tokyo Stock Exchange page. That one string, used in everyday text, contains Latin letters, Hiragana, Katakana, Han or Kanji ideograms and Latin digits. Meanwhile, back to ".." ... the range operator. The most useful application that I can think of for strings of length > 1 is for generating unique strings such as for mktemp. Beyond that, its application is actually quite limited, because the rules for any other sort of string that might make sense to a human are absurdly complex. As such, I think it suffices to say that, for the most part, ".." makes sense for single-character strings, and to expand from there, rather than trying to introduce anything more complex. -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
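[Editor's illustration] The surprises show up even with plain codepoint-order comparison, before any dictionary-style heuristic enters the picture. A quick sketch, in Python only because its string comparison is also raw codepoint order (the `in_range` helper is invented for illustration):

```python
# Naive "string is in the range" test using plain codepoint order,
# the only ordering a bare Range could cheaply rely on.
def in_range(s: str, lo: str, hi: str) -> bool:
    return lo <= s <= hi

# Dictionary intuition says yes, and codepoint order happens to agree:
print(in_range("cliché", "aphorism", "truth"))   # True
# But case breaks the intuition: 'Z' (U+005A) sorts before 'a' (U+0061),
# so "Zebra" falls outside "aphorism" .. "truth" by codepoints.
print(in_range("Zebra", "aphorism", "truth"))    # False
```

Punctuation and accented characters misbehave the same way: any rule beyond raw codepoints needs locale-aware collation, which is exactly the "complex heuristic" the email warns about.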
Re: Suggested magic for "a" .. "b"
On Fri, Jul 16, 2010 at 1:14 PM, yary wrote: > There is one case where Rakudo's current output makes more sense than > your proposal, and that's when the sequence is analogous to a range of > numbers in another base, and you don't want to start at the equivalent > of '' or end up at the equivalent of ''. If you want a range of numbers, you should be using numbers. Perl should absolutely not try to guess that you want codepoints to appear in your result set which were not either expressed in the input or fall between the range of any two corresponding input codepoints. > But that's a less > usual case and there's a workaround. Using your method & example, "Ab" > .. "Az", "Ba" .. "Be" would reproduce what Rakudo does now. > Quite true. > > In general, I like it. Though it does mean that the sequence generated > incrementing "Ab" repeatedly will diverge from "Ab" .. "Be" after 4 > iterations. > Also true, and I think that's a correct thing. -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Suggested magic for "a" .. "b"
string elements concatenated. This might even be iterated for every additional codepoint in the longer string. For example: "a" .. "bcd" = "..." could have similar semantics. In the case of A, B ... C, for length 1 strings, the range A .. B is simply projected forward until x ge C (if A..B is increasing, le otherwise). C's properties probably should not be considered at all. In the case of length > 1 strings, each character index is projected forward independently until any one character index is ge the corresponding index in the terminator, and there is no "counting": "AAA", "BCD" ... "GGG" = If any index in the sequence does not increment (e.g. "AA", "AB" ... "ZZ") then there is an implication that counting is required. You should be able, in this case, to imply incrementing the left or right side as most significant (e.g. "AA", "BA" ... "ZZ" is also valid). It is, however, an error to try to increment indexes in any other ordering (e.g. "AAA", "ABA" ... "ZZZ"). Once a counting sequence has been established, lookahead must be employed to determine the extent of the range (e.g. "A", "B" can continue through all "Latin" Lu codepoints, so in order to know when to cycle, you must determine how many codepoints lie in the full range). This implies that length > 1 strings in "..." operations which imply a counting sequence are not strictly evaluated lazily, though some laziness may still be employed. -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
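[Editor's illustration] One possible reading of the per-index projection described above, sketched in Python. The treatment of the terminator as inclusive (`le`) is my assumption; the email leaves that edge unspecified, and this is not any implemented Perl 6 semantics:

```python
def project(first: str, second: str, stop: str) -> list[str]:
    # Per-index step sizes inferred from the first two strings,
    # as in "AAA", "BCD" ... "GGG": the steps are 1, 2 and 3.
    steps = [ord(b) - ord(a) for a, b in zip(first, second)]
    cur = [ord(c) for c in first]
    out = []
    # Each index advances independently; stop once any index would
    # pass its counterpart in the terminator (assumed inclusive).
    while all(c <= ord(s) for c, s in zip(cur, stop)):
        out.append("".join(chr(c) for c in cur))
        cur = [c + d for c, d in zip(cur, steps)]
    return out

print(project("AAA", "BCD", "GGG"))  # ['AAA', 'BCD', 'CEG']
print(project("A", "B", "E"))        # ['A', 'B', 'C', 'D', 'E']
```

Note how "no counting" falls out: each column is its own arithmetic progression, so no column ever wraps around to pull in codepoints that weren't implied by the inputs.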
Re: r31651 -[S13] try to make multisig semantics slightly more generic so sigs can do better pattern matching
On Mon, Jul 12, 2010 at 7:59 PM, wrote: > Author: lwall > Date: 2010-07-13 01:59:37 +0200 (Tue, 13 Jul 2010) > New Revision: 31651 > > Modified: > docs/Perl6/Spec/S13-overloading.pod > Log: > [S13] try to make multisig semantics slightly more generic so sigs can do > better pattern matching > ... > +are not required to all bind the same set of formal variable names, > +nor are all parameters of a given name required to bind with the > +same type. Unbound parameters will be born with an undefined value > +(even if they have a default). For any parameter that occurs in > +multiple signatures with non-identical nominal types, the actual > +lexical variable will declared > "will *be* declared"? -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: S06 -- grammatical categories and macros
On Wed, Jun 30, 2010 at 7:33 PM, Jonathan Worthington wrote: > Aaron Sherman wrote: > >> See below for the S06 section I'm referring to. >> >> I'm wondering how we should be reading the description of user-defined >> operators. For example, "sub infix:<(c)>" doesn't describe >> the precedence level of this new op, so how is it parsed? Is there a >> default? >> >> >> > The default is same as infix:<+> for infix ops, however the is prec trait > (and some other related ones) should also be available (but not yet > implemented in Rakudo). That's good stuff. Is that in the synopses? I swear, I'm going to get around to writing a full index for those things ;-) > > The method case makes no sense to me. It almost certainly won't be any use > unless the method gets exported, since operator dispatches are always sub > dispatch. Maybe that example is a fossil that should go away. And if not, > then yes, it most certainly would want to be written in terms of self, not > have a parameter. So something is wonky with the spec here. > > Hope this helps a little, > > Yeah, that definitely helps, thanks! Anyone want to jump in on the macro stuff? Or is that really just waiting on an implementation to hammer out the details? -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
S06 -- grammatical categories and macros
no fixed string that can be recognized, such as tokens beginning with digits. Such an operator *must* supply an is parsed trait. The Perl grammar uses a default subrule for the :1st, :2nd, :3rd, etc. regex modifiers, something like this: sub regex_mod_external:<> ($x) is parsed(token { \d+[st|nd|rd|th] }) {...} Such default rules are attempted in the order declared. (They always follow any rules with a known prefix, by the longest-token-first rule.) Although the name of an operator can be installed into any package or lexical namespace, the syntactic effects of an operator declaration are always lexically scoped. Operators other than the standard ones should not be installed into the * namespace. Always use exportation to make non-standard syntax available to other scopes. -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: Filesystems and files [Was: Re: The obligation of free stuff: Google Storage]
On Wed, Jun 30, 2010 at 4:29 AM, Richard Hainsworth wrote: It is normally implied that a program already has a 'local' environment, > including a 'local' filesystem. Thus the syntax > my $fn = open('/path/to/directory/filename', :r) or die $!; > implies a local file system. > > The idea of an implied local system suggests an implied local environment. > The contents of %*ENV and @*INC seem to be assumed to be local, though this > is not specified. Given the development of the internet, this is an > assumption I think should be made implicit, as well as the mechanism for > adding remote resources via paths through a network. > > Would it make sense to define $*FS as the implied local file system, and > thus that a bare 'open' is sugar for > my $fh = $*FS.open('/path/to/directory/filename', :r); > Yep, that makes perfect sense. Once I have a working VFS object that could be stored in there, that's probably the best way to go, unless someone proposes another way between now and then. -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: Perl 6 in non-English languages
I should point out that I've had a great deal of coffee. The technical details of what I've said are reasonable, but read the rest as off-the-cuff opinion. It's also true that seeing how Perl 6 would look/work when re-cast in the grammatical conventions of another human language would be very cool, even if I might take exception to its proposed use.
Re: Perl 6 in non-English languages
On Tue, Jun 22, 2010 at 6:34 PM, SundaraRaman R wrote: > > Currently, since Perl 6 (afaik) supports Unicode identifiers, the only > place > a modification is required would be in the keywords. Here are the relevant bits from S02: The currently compiling Perl parser is switched by modifying one of the braided languages in COMPILING::<%?LANG>. Lexically scoped parser changes should temporize the modification. Changes from here to end-of-compilation unit can just assign or bind it. In general, most parser changes involve deriving a new grammar and then pointing one of the COMPILING::<%?LANG> entries at that new grammar. Alternately, the tables driving the current parser can be modified without derivation, but at least one level of anonymous derivation must intervene from the preceding Perl grammar, or you might be messing up someone else's grammar. Basically, the current set of grammars in %?LANG has to belong only to the current compiling scope. It may not be shared, at least not without explicit consent of all parties. No magical syntax at a distance. Consent of the governed, and all that. So you could, for example, derive your grammar from the Perl compiler's grammar and define your own rule for logical and, using the keyword "florgle" instead of "and". There's more complexity to it than that, which I'll touch on below. However, it is not a > simple matter of substituting s/keyword/local_language_keyword/ since the > resultant phrasing might be awkward or meaningless and unreadable (examples > in the linked discussion). It requires reordering the syntax of each > construct. > I can't think of any case where that would be reasonable. Perl may have taken some inspiration from English in terms of features like postfix conditionals, but the structure of the language has nothing to do with what the keywords mean in any given language.
"and" could just as easily be "florgle" (pardon my language); the construct would still be: if $a florgle $b florgle $c { die "Mega florgle" } If your language doesn't do infix conjunctions, and instead lists the relationships between items at the end of a sentence, re-iterating the items whose relationships are being established, that's not a reason for Perl to then be re-structured as: if $a, $b, $c florgle $a, $b, $c { die "Mega florgle" } Worse, what would that mean for hyperoperators and other meta operations? More importantly, localized dialects of Perl 6 face some substantial challenges even without re-structuring. Operators (remember, Perl 6 doesn't have keywords in the traditional sense) have a meta-meaning in Perl 6, and if I write: multi sub infix:(MyType $a, MyType $b) { $a.truth and $b.truth } then I expect that to affect what happens when you try to perform a logical union on two values of type MyType. There are two options when localizing: 1) You can change the grammar, but leave the operator's definition alone (e.g. the action for that operator calls infix: regardless of what the name in the grammar is), or 2) You can change both the grammar and the operator. If you do the former, then my "and" still works, but you can't define a "multi sub infix:" and have it do what you expect. If you do the latter, then your "multi sub infix:" would work, but mine won't (and if it's library code, then you have a problem where bugs are now localization-specific). Ugh. Moving on to more general theories on the matter, I believe that localized dialects of programming languages are always a bad idea.
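[Editor's illustration] The difference between the two options can be made concrete with a toy dispatcher. This is Python with an invented "florgle" surface form, not real Perl 6 machinery: under option 1, the localized keyword is only a surface spelling that normalizes to the canonical operator name before dispatch, so overloads registered under the canonical name keep working.

```python
# Option 1: localization is purely lexical. Each locale maps its surface
# keyword to the single canonical operator name used for dispatch.
SURFACE_TO_CANONICAL = {"and": "and", "florgle": "and"}

# User overloads are registered against canonical names only, so they
# work no matter which localized spelling appeared in the source.
OVERLOADS = {"and": lambda a, b: bool(a) and bool(b)}

def dispatch(surface_op, a, b):
    canonical = SURFACE_TO_CANONICAL[surface_op]  # normalize first
    return OVERLOADS[canonical](a, b)

print(dispatch("florgle", 1, 1))  # True: localized spelling, same semantics
print(dispatch("and", 1, 0))      # False
```

Option 2 would instead key OVERLOADS on the localized name itself, which is precisely what makes library overloads break across locales.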
Choosing a spoken/written language to base a programming language on is always tricky (I would have voted for Japanese for Perl 6, even being an English speaker), but once that choice is made, the resulting programming language gives speakers of any arbitrary language an opportunity to interact with the developers from every culture in the world, simply by learning the structured conventions of the programming language (and quite possibly NOT the language from which it takes its cues). If you choose, instead to program in "Swahili Perl 6" then only people who read Swahili will be able to tell what it was you were talking about, whereas speakers of every language on Earth will know what you meant when you write vanilla Perl 6. Of course, education is often brought up in these discussions. I consider this a red herring. The United States is particularly prone to using localized versions of international symbols (e.g. meter), and this proves confusing when interacting with the rest of the world. Take this to an extreme, and we'd be taught to write "907 thousandsmallweights to the short ton" rather than "907 kilograms" and that's just not going to help anyone (yes, I'm aware that Brits try to spell that grammes, and I r
Re: The obligation of free stuff: Google Storage
First off, I again have to caution that this thread is conflating "open" with filesystem interaction. While open is one of many ways of interacting with a filesystem, it isn't even remotely sufficient (nor my immediate focus). One can ask for and modify filesystem metadata, security information, and so on as well as that for individual objects within the filesystem (which in the POSIX model is mostly files and directories). In a traditional POSIX/Unix model, programs (other than key OS utilities) don't usually do much with the structure of the filesystem. That's meant as an interactive task for an admin. However, in building a cloud-storage-aware VFS layer, management of the filesystem in terms of layout, allocation, security (access methods, authorization and authentication), payment models, and many other features is expected to be embodied in the access model. Just as an example, choosing and laying out what Amazon calls "buckets" is the equivalent of partitioning. That does need an interface. Now, we can just translate the Python bindings for Google Storage (and I believe there are already Perl 5 bindings for Amazon S3), but my inclination is to build a generic VFS that can handle POSIX-like filesystems as well as everything else from Windows/Mac specific features to full-blown cloud storage to more user-oriented storage options (Dropbox comes to mind). Every addressable storage model which could be treated as a filesystem should have a place in the Perl 6 VFS. Now, as to the question of overloading "open"... I'm not sure. I mean, it's pretty easy to say: URI.new($path).open(:ro) or open(URI.new($path), :ro) when what you want is a VFS object, and I kind of like the idea of the standard open on a string having POSIX semantics. Now, to your question, C.J. On Fri, Jun 18, 2010 at 3:03 PM, C.J. Adams-Collier wrote: > Define "opening a file" for me. Is it something that's associated with a > filehandle, by definition? Do TCP sockets count?
Opening a file isn't a well-defined operation. You have to be more specific. In your question you're conflating the evaluation of a filesystem namespace token (which is one of many possible modes of filesystem interaction), returning a filehandle object that represents access to the named object with evaluation of a socket namespace token and returning a similar filehandle object that represents that object. There are, of course, crossovers (filesystem pipes) and other operations that yield filehandle objects (various IPC operations that aren't exactly sockets, for example). Now, if you want to unify some of that territory, you can build a VFS layer. Today, that's often done via URIs, just because they're handy for Identifying Universal Resources, but what happens when you call open (or equivalent) on such a token is still an open question, and I didn't seek to answer it in this thread. Frankly, I don't think that it's something that SHOULD be answered prior to building the VFS layer itself, because that layer might dictate some design decisions, but my high-level impulse is to say that open on a VFS token (be it a URI or some other complex data) will yield a VFS-back-end specific "handle". Such a handle would likely "do" IO::Handle and friends as well as something VFS-specific. -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: r31337 -[S02] allow _ between radix and digits as suggested by ajs++
On Fri, Jun 18, 2010 at 10:26 PM, ajs wrote: > Attached, I've included test results, the tests and the patch (both to the > spectest suite and nqp-rx) to support this spec change. No... no I didn't. Here it is, attached as text.

-- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs

+ /usr/bin/perl t/harness --keep-exit-code --icu=1 --jobs t/spec/S02-literals/underscores.t
t/spec/S02-literals/underscores.t .. ok
All tests successful.
Files=1, Tests=31, 2 wallclock secs ( 0.00 usr 0.08 sys + 1.23 cusr 0.70 csys = 2.01 CPU)
Result: PASS

+ svn diff t/spec
Index: t/spec/S02-literals/underscores.t
===================================================================
--- t/spec/S02-literals/underscores.t (revision 31371)
+++ t/spec/S02-literals/underscores.t (working copy)
@@ -13,7 +13,7 @@
 =end pod
-plan 19;
+plan 31;
 is 1_0, 10, "Single embedded underscore works";
@@ -51,4 +51,17 @@
 dies_ok { 2._123 },"2._123 parses as method call";
 dies_ok { 2._e23 },"2._23 parses as method call";
+is 0b110, 111, "0b for base 2 parses";
+is 0b110_, 111, "0b for base 2 with underscore parses";
+is 0b_110_, 111, "0b for base 2 with underscore after radix parses";
+is 0o157, 111, "0o for base 8 parses";
+is 0o1_57, 111, "0o for base 8 with underscore parses";
+is 0o_1_57, 111, "0o for base 8 with underscore after radix parses";
+is 0d111, 111, "0d for base 10 parses";
+is 0d1_11, 111, "0d for base 10 with underscore parses";
+is 0d_1_11, 111, "0d for base 10 with underscore after radix parses";
+is 0x6f, 111, "0x for base 16 parses";
+is 0x6_f, 111, "0x for base 16 with underscore parses";
+is 0x_6_f, 111, "0x for base 16 with underscore after radix parses";
+
 # vim: ft=perl6

+ git diff
diff --git a/src/HLL/Grammar.pm b/src/HLL/Grammar.pm
index 2f486ee..82940dc 100644
--- a/src/HLL/Grammar.pm
+++ b/src/HLL/Grammar.pm
@@ -59,10 +59,10 @@
 grammar HLL::Grammar;
 token integer {
 [
-| 0 [ b
-| o
-| x
-| d
+| 0 [ b _?
+| o _?
+| x _?
+| d _?
 ]
 |
 ]
Re: r31337 -[S02] allow _ between radix and digits as suggested by ajs++
Attached, I've included test results, the tests and the patch (both to the spectest suite and nqp-rx) to support this spec change.

On Thu, Jun 17, 2010 at 7:49 PM, wrote:
> Author: lwall
> Date: 2010-06-18 01:49:13 +0200 (Fri, 18 Jun 2010)
> New Revision: 31337
>
> Modified:
> docs/Perl6/Spec/S02-bits.pod
> Log:
> [S02] allow _ between radix and digits as suggested by ajs++
>
> Modified: docs/Perl6/Spec/S02-bits.pod
> ===================================================================
> --- docs/Perl6/Spec/S02-bits.pod 2010-06-17 21:54:34 UTC (rev 31336)
> +++ docs/Perl6/Spec/S02-bits.pod 2010-06-17 23:49:13 UTC (rev 31337)
> @@ -3042,6 +3042,8 @@
>
> A single underscore is allowed only between any two digits in a
> literal number, where the definition of digit depends on the radix.
> +(A single underscore is also allowed between a radix prefix and a
> +following digit, as explained in the next section.)
> Underscores are not allowed anywhere else in any numeric literal,
> including next to the radix point or exponentiator, or at the beginning
> or end.
> @@ -3056,6 +3058,13 @@
>
> 0d base 10, digits 0..9
> 0x base 16, digits 0..9,a..f (case insensitive)
>
> +Each of these allows an optional underscore after the radix prefix
> +but before the first digit. These all mean the same thing:
> +
> + 0xbadcafe
> + 0xbad_cafe
> + 0x_bad_cafe
> +
> =item *
>
> The general radix form of a number involves prefixing with the radix
>

-- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
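[Editor's cross-check] Incidentally, Python (3.6+, via PEP 515) adopted the same rule of allowing a single underscore directly after the radix prefix, which gives a quick way to confirm that the alternative spellings in the new spec text all denote one value. Python has no 0d prefix, so that case can't be checked this way:

```python
# PEP 515 permits a single underscore after 0x/0o/0b, mirroring the
# S02 change quoted above: all spellings of a literal are equal.
assert 0xbadcafe == 0xbad_cafe == 0x_bad_cafe
assert 0b1101111 == 0b110_1111 == 0b_110_1111 == 111
assert 0o157 == 0o1_57 == 0o_1_57 == 111
print("all spellings agree")
```

As in the S02 text, doubled underscores and leading/trailing underscores within the digits remain illegal in Python too.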
underscore in numbers
The spec says, and NQP seems to implement (Rakudo, I think, picks up from NQP as defined in HLL-s0.pir, is this right?) that a single underscore is ignored between any two digits in a number, not between the radix and the number. However, it seems to me that this would be very handy:

0b__ instead of 0b_
0x_dead_beef instead of 0xdead_beef
0d_1_000_000 instead of 0d1_000_000

... just in terms of the clarity gain from keeping the radix at arm's length from the rest of the integer literal. Just a thought. I tried this change against parrot to test it out, but it didn't seem to make any difference, so I must be missing something:

$ svn diff parrot/ext/nqp-rx/src/stage0/HLL-s0.pir
Index: parrot/ext/nqp-rx/src/stage0/HLL-s0.pir
===================================================================
--- parrot/ext/nqp-rx/src/stage0/HLL-s0.pir (revision 47640)
+++ parrot/ext/nqp-rx/src/stage0/HLL-s0.pir (working copy)
@@ -2759,6 +2759,13 @@
 substr $S10, rx184_tgt, $I11, 1
 ne $S10, "b", rx184_fail
 add rx184_pos, 1
+add $I11, rx184_pos, 1
+gt $I11, rx184_eos, rx184_nobunder
+sub $I11, rx184_pos, rx184_off
+substr $S10, rx184_tgt, $I11, 1
+ne $S10, "_", rx184_nobunder
+add rx184_pos, 1
+ rx184_nobunder:
 # rx subrule "binint" subtype=capture negate=
 rx184_cur."!cursor_pos"(rx184_pos)
 $P10 = rx184_cur."binint"()
@@ -2778,6 +2785,13 @@
 substr $S10, rx184_tgt, $I11, 1
 ne $S10, "o", rx184_fail
 add rx184_pos, 1
+add $I11, rx184_pos, 1
+gt $I11, rx184_eos, rx184_noounder
+sub $I11, rx184_pos, rx184_off
+substr $S10, rx184_tgt, $I11, 1
+ne $S10, "_", rx184_noounder
+add rx184_pos, 1
+ rx184_noounder:
 # rx subrule "octint" subtype=capture negate=
 rx184_cur."!cursor_pos"(rx184_pos)
 $P10 = rx184_cur."octint"()
@@ -2797,6 +2811,13 @@
 substr $S10, rx184_tgt, $I11, 1
 ne $S10, "x", rx184_fail
 add rx184_pos, 1
+add $I11, rx184_pos, 1
+gt $I11, rx184_eos, rx184_noxunder
+sub $I11, rx184_pos, rx184_off
+substr $S10, rx184_tgt, $I11, 1
+ne $S10, "_", rx184_noxunder
+add rx184_pos, 1
+ rx184_noxunder:
 # rx subrule "hexint" subtype=capture negate=
 rx184_cur."!cursor_pos"(rx184_pos)
 $P10 = rx184_cur."hexint"()
@@ -2814,6 +2835,13 @@
 substr $S10, rx184_tgt, $I11, 1
 ne $S10, "d", rx184_fail
 add rx184_pos, 1
+add $I11, rx184_pos, 1
+gt $I11, rx184_eos, rx184_nodunder
+sub $I11, rx184_pos, rx184_off
+substr $S10, rx184_tgt, $I11, 1
+ne $S10, "_", rx184_nodunder
+add rx184_pos, 1
+ rx184_nodunder:
 # rx subrule "decint" subtype=capture negate=
 rx184_cur."!cursor_pos"(rx184_pos)
 $P10 = rx184_cur."decint"()

-- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: The obligation of free stuff: Google Storage
On Wed, Jun 9, 2010 at 10:04 AM, Aaron Sherman wrote: > Has anyone begun to consider what kind of filesystem interface we want > for things like sftp, Amazon S3, Google Storage and other remote > storage possibilities? Is there any extant work out there, or should I > just start spit-balling? In the absence of anything forthcoming and a totally arbitrary sense of urgency ;-) here's what I think I should do: IO::FileSystems (S32) gives us some basics and the Path role also provides some useful features. I will start there and build an IO::FileSystems::VFS roughly like:

class IO::VFS is IO::FileSystems {
    ...
    # Session data if applicable
    has IO::VFS::Session $.session;

    # Many methods take a $context which, if supplied
    # will contain back-end specific data such as restart markers
    # or payment model information. I'll probably define
    # a role for the context parameter, but otherwise
    # leave it pretty loose as a back-end specific structure.

    # A simple operation that guarantees a round-trip to the filesystem
    method nop($context?) { ... }

    # list of sub-IO::VFS partitions/buckets/etc
    method targets($context?) { ... }
    method find_target($locator, $context?) { ... }

    # Means of acquiring file-level access through a VFS
    method find($locator, $enc = $.session.encoding, $context?) { ... }
    method glob(Str $matcher, $enc = $.session.encoding, $context?) { ... }

    # Like opening and writing to filehandle, but the operation is totally
    # opaque and might be a single call, sendfile or anything else.
    # Note that this doesn't replace $obj.find($path).write(...)
    method put($locator, $data, $enc = $.session.encoding, $context?) { ... }

    # Atomic copy/rename, etc. are logically filesystem operations, even though
    # they might have counterparts at the file level. The distinction being that
    # at the filesystem level I never know nor care what the contents of the
    # file are, I just ask for an operation to be performed on a given path.
    method copy($from, $to, $enc = $.session.encoding, $context?) { ... }
    method rename($from, $to, $enc = $.session.encoding, $context?) { ... }
    method delete($locator, $enc = $.session.encoding, $context?) { ... }

    # service-level ACLs if any
    method acl($locator, $context?) { ... }
}

The general model I imagine would be something like:

my IO::VFS::S3 $s3 .= new();
$s3.session.connect($amazonlogininfo);
my $bucket = $s3.find_target($bucket_name);
$bucket.put("quote.txt", "Now is the time for all good men...\n");
say "URI: ", $bucket.find("quote.txt").uri;

or

my IO::VFS::GoogleStorage $goog .= new();
$goog.session.connect($googlelogininfo);
my $bucket = $goog.find_target($bucket_name);
$bucket.put("quote.txt", "Now is the time for all good men...\n");
say "URI: ", $bucket.find("quote.txt").uri;

or

my IO::VFS::SFTP $sftp .= new();
$sftp.session.connect(:host, :user, :password);
my $filesystem = $sftp.find_target("/tmp");
$filesystem.put("quote.txt", "Now is the time for all good men...\n");
say "URI: ", $filesystem.find("quote.txt").uri; # using sftp:...

Notice that everything after $obj.session.connect is identical except for my choice of variable names. In fact, you shouldn't have to worry about what storage back-end you're using as long as you have a valid VFS handle. Really, path names are the only thing that might trip you up. Thoughts?

I think that in order to do this, I'll need the following support libraries, which may or may not exist (I'll be looking into these):

IO::FileSystems
Path
HTTP (requires socket IO, MIME, base64, etc.)
Various crypto libs

I don't intend to provide a finished implementation of any of these where they don't already exist (I may not even end up with a final implementation of the VFS layer), but at least I'll get far enough along that others who want to work on this will have a starting point, and I'll want to at least have a test that fakes its way all the way down to creating a remote file on all three services, even if most of the operations involve passing on blobs of data generated by equivalent calls in other languages. -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
The obligation of free stuff: Google Storage
On a lark, I submitted a request to Google for membership in the Google Storage beta on the basis of doing something virtual filesystemish for Perl 6. The bastards gave me an account, so now I feel as if I should do something. Has anyone begun to consider what kind of filesystem interface we want for things like sftp, Amazon S3, Google Storage and other remote storage possibilities? Is there any extant work out there, or should I just start spit-balling? -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: r31043 -[S32/Containers] Buf does Stringy, too
On Wed, Jun 2, 2010 at 2:59 PM, Jason Switzer wrote: > On Wed, Jun 2, 2010 at 5:10 AM, wrote: > > > > > -class Buf does Positional {...} > > +class Buf does Positional does Stringy {...} > > > > I never really thought about this, but now that I see it here, it made me > realize that how 'does' works seems verbose. I think we should be able to > specify a list instead of a bunch of 'does' statements. For example, the > above example should be written as > > class Buf does Positional, Stringy { ... } > >

Pro:
* Shorter can be good
* It's pretty clear what's going on.

Con:
* Composition is complicated. Explicit "does foo" calls that out
* Something like: class Buf does Positional does Stringy { ... } ... looks to me like a laundry list of what I need to be aware of when considering this class's uses, brace style preferences notwithstanding.

My knee-jerk response would be that this is fine the way it is now, but perhaps adding your suggestion as an alternative syntax could be considered for >6.0? Then again, no one cares what I say ;-)

-- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: r31050 -[S03] refine hyper dwimminess to be more like APL, with modular semantics
On Wed, Jun 2, 2010 at 12:51 PM, wrote:
> +
> +(1,2,3,4) »+« (1,2) # always error
> +(1,2,3,4) «+» (1,2) # 2,4,4,6 rhs dwims to ((1,2) xx *).batch(4)
> +(1,2,3) «+» (1,2) # 2,4,4 rhs dwims to ((1,2) xx *).batch(3)
> +(1,2,3,4) «+« (1,2) # 2,4 lhs dwims to (1,2,3,4).batch(2)
> +(1,2,3,4) »+» (1,2) # 2,4,4,6 rhs dwims to ((1,2) xx *).batch(4)
> +(1,2,3) »+» (1,2) # 2,4,4,6 rhs dwims to ((1,2) xx *).batch(3)
> +(1,2,3) »+» 1 # 2,4,4,6 rhs dwims to (1 xx *).batch(3)
>

Is there some automatic translation of these examples into tests? If not, here's what they'd be:

ok(( (1,2,3,4) «+» (1,2) ) ~~ (2,4,4,6) )
ok(( (1,2,3) «+» (1,2) ) ~~ (2,4,4) )
ok(( (1,2,3,4) «+« (1,2) ) ~~ (2,4) )
ok(( (1,2,3,4) »+» (1,2) ) ~~ (2,4,4,6) )
ok(( (1,2,3) »+» (1,2) ) ~~ (2,4,4,6) )
ok(( (1,2,3) »+» 1) ~~ (2,4,4,6) )

I tested these all with Rakudo, and they all currently fail, though I guess that's not shocking.

-- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
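[Editor's illustration] The batch-based dwim rule itself is mechanical enough to sketch. Here is my own Python model of the quoted commit, not an implementation of S03: an arrow whose point faces an operand marks it dwimmy, meaning it may be cycled (if shorter) or truncated (if longer) to the required length. Note that for the three-element lhs cases, the quoted `batch(3)` comments imply three results, so this sketch follows the batch rule rather than the literal result annotations:

```python
from itertools import cycle, islice

def hyper(op, lhs, rhs, dwim_left, dwim_right):
    # dwim_left/dwim_right say whether the arrows point at that operand:
    #   «+»  -> both dwim;  «+« -> lhs only;  »+» -> rhs only;  »+« -> neither
    if dwim_left and dwim_right:          # stretch to the longer side
        n = max(len(lhs), len(rhs))
    elif dwim_left:                       # lhs fits itself to rhs
        n = len(rhs)
    elif dwim_right:                      # rhs fits itself to lhs
        n = len(lhs)
    else:                                 # both rigid: lengths must match
        if len(lhs) != len(rhs):
            raise ValueError("non-dwimmy hyperop: length mismatch")
        n = len(lhs)
    pairs = zip(islice(cycle(lhs), n), islice(cycle(rhs), n))
    return [op(a, b) for a, b in pairs]

add = lambda a, b: a + b
print(hyper(add, (1, 2, 3, 4), (1, 2), True, True))   # [2, 4, 4, 6]  «+»
print(hyper(add, (1, 2, 3, 4), (1, 2), True, False))  # [2, 4]        «+«
print(hyper(add, (1, 2, 3), (1, 2), False, True))     # [2, 4, 4]     »+»
```

Cycling the shorter side is exactly the `(... xx *).batch(n)` idiom in the commit: repeat indefinitely, then take n.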
Advocating Perl 6
As the time nears, I figured some buzz was in order, and to help with that, I'm Buzzing about Perl 6. If you would like to follow me / reshare / comment, you can go here: http://www.google.com/profiles/AaronJSherman#buzz My current goal is to post a short snippet of Perl 6 code with an equally brief explanation every day. We'll see how long I can keep it up. -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: Parallelism and Concurrency was Re: Ideas for a "Object-Belongs-to-Thread" (nntp: message 4 of 20) threading model (nntp: message 20 of 20 -lastone!-)
[Note: removed one CCer because the email address was long and complex and looked like my mail client had hacked up a hairball full of experimental Perl 6 obfuscation. My apologies if that wasn't actually a mail failure] On Mon, May 17, 2010 at 3:13 PM, wrote: > > The important thing is not the number of algorithms: it's the number >> programs and workloads. >> > > From that statement, you do not appear to understand the subject matter of > this thread: Perl 6 concurrency model. > > That seems a tad more confrontational than was required. It's also arguably incorrect. Surveying existing software implementations and code bases is not precluded, simply because we're talking about a new(ish) language. For CPU-bound processes, there is no benefit in trying to utilise more than > one thread per core--or hardware thread if your cores have hyper-threading. > Context switches are expensive, and running hundreds (let alone thousands or > millions) of threads on 2/4/8/12 core commodity hardware, means that you'll > spend more time context switching than doing actual work. With the net > result of less rather than more throughput. > I know that you know what I'm about to say, but I'm going to say it anyway just so we're standing on the same ground. When I was in college, I had access to a loosely coupled 20-processor system. That was considered radical, cutting-edge technology for the fact that you could treat it like a standard Unix workhorse, and not as some sort of black-hole of computing power with which you could commune via a front-end (ala Cray). I then worked for a company that was producing an order 1k processor system to do the same thing. These were relatively short-spaced advances in technology. When a single die shipped, containing 2 cores, I was agape. I'd never considered that it would happen as soon as it did. Today we're putting order of 10 cores on a die. 
I'm really not all that old, and yet the shockingly high-end supercomputing platforms of my youth are, more or less, being put on a chip. Perl pre-6 hit its stride about 5-10 years into its lifespan (mid to late 90s). Perl 6 hasn't even shipped yet, and yet your statements appear to be selecting modern hardware as its target platform, design wise. I'm not sure that's entirely (un)wise. Then again, it does simplify the world tremendously. I just wanted to get that all out there for thought. -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: Fwd: URI replacement pseudocode
On Mon, May 17, 2010 at 3:34 PM, Moritz Lenz wrote: > > > Aaron Sherman wrote: > > > I had a hard time even getting basic code working like: > > > > token foo { blah } > > if "blah" ~~ m// { say "blah!" } > > > > (See my question to the list, last week) > > Right. What works today is > > grammar Foo { > token TOP { } > token foo { blah } > } > > if Foo.parse('blah') { > say "yes" > } > > I will do this. Thanks. >> * Don't inherit from roles, implement them with 'does' > >> > > > > I did that, didn't I? Did I typo something? > > > >grammar URI::rfc2396 does URI::Grammarish ... > > > > and > > grammarb URI::rfc3986_regex is URI::Grammarish > > that's what I meant > That's a double typo (grammarb and "is"). I'll fix that in the version I put up after this discussion. it's called #perl6, and is our IRC channel :-) > Writing down such volatile information isn't very useful, because it > becomes outdated rather quickly. > I used to be active in #perl6. I'll try to jump back in. I'm noting the rest of what you said and moving forward with the changes. It all sounds much more reasonable than I feared it would be. -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
Re: Perl6 and "accents"
On Mon, May 17, 2010 at 1:52 PM, Tom Christiansen wrote: > Exegesis 5 @ http://dev.perl.org/perl6/doc/design/exe/E05.html reads: > > # Perl 6 > / < - [A-Za-z] >+ / # All alphabetics except A-Z or a-z ># (i.e. the accented alphabetics) > >[Update: Would now need to be <+ - [A..Za..z]> to avoid ambiguity >with "Texas quotes", and because we want to reserve whitespace as the > first >character inside the angles for other uses.] > > Why isn't that: /<+ alpha - [A-Za-z]>+ / > I'm also disappointed to see perl6 spreading the notion that "accent" > is somehow a valid synonym for > >diacritical marking >diacritic marking >diacritic mark >diacritic >mark > > It's not. Accent is not a synonym for any of those. Not all marks are > accents, and not all accents are marks. > I agree that it's a rather "folksy" way of saying "them funny letters." On the other hand, I think that was the intent. It's very hard to find ways to describe Unicode spaces in ways that the average coder (not the average person, which is a small help) will grasp immediately. diacritical isn't a word that most folks know, even among programmers. "Accent" does have a colloquial meaning that maps correctly, but sadly that colloquial definition does not correspond to the technical definition, so in being clear, you become less accurate. There is, as far as I'm aware, no good middle ground, here. I think having the exegeses be more colloquial and the synopses be more technically accurate makes a fair amount of sense, though perhaps footnoting the technically inaccurate elements of the exegeses would make sense. To the question of the exegeses being out of date: if they are out of date, why are we keeping them around? Is there value there? I understand the value in keeping the apocalypses around, but that's due to their nature as the first draft of the standard. The exegeses have no such status. 
Personally, I'd rather see them updated than thrown out, but I tried writing examples just for a few elements of S29 back in the day, and found the moving target to be too painful. Maybe Perl 6 has slowed down enough that it's more practical now? -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
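For what it's worth, the distinction Tom is drawing can be made mechanical: Unicode canonical decomposition separates a base letter from its combining marks, so "accented" letters decompose while letters like "ø" (a distinct letter carrying no mark) do not. A small Python illustration of that distinction - not anything from the Perl 6 design docs:

```python
import unicodedata

def has_diacritic(ch):
    # NFD splits "é" into "e" plus U+0301 COMBINING ACUTE ACCENT;
    # combining marks have Unicode general category Mn.
    return any(unicodedata.category(c) == 'Mn'
               for c in unicodedata.normalize('NFD', ch))

print(has_diacritic('é'))  # True: decomposes to a base letter + mark
print(has_diacritic('ø'))  # False: a letter, not an accented letter
print(has_diacritic('e'))  # False
```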
Fwd: URI replacement pseudocode
Ooops, took this off-list by accident. -- Forwarded message -- From: ajs Date: Mon, May 17, 2010 at 2:59 PM Subject: Re: URI replacement pseudocode To: Moritz Lenz Thank you for your responses! On Mon, May 17, 2010 at 1:37 PM, Moritz Lenz wrote: > Aaron Sherman wrote: > > Here's the code: > > > > > https://docs.google.com/leaf?id=0B41eVYcoggK7YjdkMzVjODctMTAxMi00ZGE0LWE1OTAtZTg1MTY0Njk5YjY4&hl=en > > I think your code would benefit greatly from actually trying to get run > with Rakudo (or at least that parts that are yet implemented), as well > as from a version control system. > (re: storage. yes, I intend to get this into something. not sure what, yet. git is preferred, I presume?) I had a hard time even getting basic code working like: token foo { blah } if "blah" ~~ m// { say "blah!" } (See my question to the list, last week) so I really didn't want to venture into trying to get this working, but yeah, now that it's done I'll see how Rakudo chokes on it. > > > So, my questions are: > > > > * Is this code doing anything that is explicitly not Perl 6ish? > > Some things I've noticed: > * you put lots of subs into roles - you probably meant methods > Well... that's a fair question. What does a method mean in a grammar? I wasn't too clear on what being a method of a grammar meant. Should I be calling these as class-methods? > * Don't inherit from roles, implement them with 'does' > I did that, didn't I? Did I typo something? grammar URI::rfc2396 does URI::Grammarish ... > * the grammars contain a mixture of tokens for parsing and of > methods/subs for data extraction; yet Perl 6 offers a nice way to > separate the two, in the form of action/reduction methods; your code > might benefit from them. > Do you have a pointer for some discussion of this? I'd love to pursue it. > * class URI::GrammarType seems not very extensible... maybe keep a hash > of URI names that map to URIs, which can be extended by method calls? 
> The idea that I was working with was that you would provide the grammar itself when you wanted to do something custom, and the string names were just a convenience for the default cases. So, for example: my URI $privatewww .= new("ajs://perl**6", :spec(::MyURI::Spec)); Where MyURI::Spec could be any grammar that implements the URI::Grammarish interface (see grammar interface discussion, below). I can look into extending it with string names as well, though. > > * Is this style of pluggable grammar the correct approach? > > Looks good, from a first glance. > Thanks! > > > * Should I hold off until R* to even try to convert this into working > code? > > No need for that. The support for grammars and roles is pretty good, > character classes and match objects are still a bit unstable/whacky. > Is there any collected wisdom available on this? I'd love to not run around chasing my own tail trying to figure out why something doesn't work. > > * Am I correct in assuming that <...> in a regex is intended to allow the > > creation of interface roles for grammars? > > You lost me here. <name> calls a named rule (with arguments). > Could you rephrase your question? Sure. All S05 says is "The <...>, <???>, and <!!!> special tokens have the same "not-defined-yet" meanings within regexes that the bare ellipses have in ordinary code." Which doesn't tell me a lot, but seems to imply that: role blah { token bletch { <...> } } is roughly analogous to: role blah { method bletch {...} } that is to say, the role should have an interface which, when applied to a grammar, would assert the presence of a bletch token. Am I reading too much into this? If yes, is there a way to assert role-based interfaces on grammars? 
The main reason I wanted this was for the very parametric grammar selection we were talking about, above, where the given block says: given $type { when .does(URI::Grammarish) { $.gtype = $_ } I'm assuming, of course, that I can make such assertions about a grammar in the same way that I would make them about a class. Is this true? Have I identified an interface token/rule correctly given that that was my goal? > * I guessed wildly at how I should be invoking the match against a saved > > "token" reference: > > if $s ~~ m/^ <.$.spec.gtype.URI_reference> $/ { > > is that correct? > > probably just $s ~~ /^ $regex $/; > But what should $regex contain? I have a $.spec which contains a reference to the URI::GrammarType object whose $.gtype identifies the grammar I should be using. That grammar is guaranteed to have a URI_reference rule, so the variable is: $.spec.gtype.URI_reference
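The interface-assertion idea - a role whose stubbed token must be supplied by any grammar that does it - has a rough analogue in most languages. A hedged Python sketch using an abstract base class, with URI_reference reduced to a toy predicate rather than a real parse:

```python
from abc import ABC, abstractmethod

class Grammarish(ABC):
    # Analogue of a role stub: implementations must supply URI_reference.
    @abstractmethod
    def URI_reference(self, text): ...

class Rfc3986(Grammarish):
    def URI_reference(self, text):
        return ':' in text        # toy stand-in for a real parse

def pick_grammar(candidate):
    # Analogue of: when .does(URI::Grammarish) { $.gtype = $_ }
    if isinstance(candidate, Grammarish):
        return candidate
    raise TypeError("not a URI grammar")

print(pick_grammar(Rfc3986()).URI_reference('http://example.com/'))  # True
```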
URI replacement pseudocode
Over the past week, I've been using my scant bits of nighttime coding to cobble together a pseudocode version of what I think the URI module should look like. There's already one available as example code, but it doesn't actually implement either the URI or IRI spec correctly. Instead, this approach uses a pluggable grammar so that you can: my URI $uri .= new( get_url_from_user(), :spec ) which would parse the given URL using the RFC3987 IRI grammar. By default, it will use RFC3986 to parse URIs, which does not implement the UCS extensions. It can even handle the "legacy" RFC2396 and regex-based RFC3986 variations. Here's the code: https://docs.google.com/leaf?id=0B41eVYcoggK7YjdkMzVjODctMTAxMi00ZGE0LWE1OTAtZTg1MTY0Njk5YjY4&hl=en So, my questions are: * Is this code doing anything that is explicitly not Perl 6ish? * Is this style of pluggable grammar the correct approach? * Should I hold off until R* to even try to convert this into working code? * What's the best way to write tests/package? * Am I correct in assuming that <...> in a regex is intended to allow the creation of interface roles for grammars? * I guessed wildly at how I should be invoking the match against a saved "token" reference: if $s ~~ m/^ <.$.spec.gtype.URI_reference> $/ { is that correct? * Are implementations going to be OK with massive character classes like: <+[\xA0 .. \xD7FF] + [\xF900 .. \xFDCF] + [\xFDF0 .. \xFFEF] + [\x10000 .. \x1FFFD] + [\x20000 .. \x2FFFD] + [\x30000 .. \x3FFFD] + [\x40000 .. \x4FFFD] + [\x50000 .. \x5FFFD] + [\x60000 .. \x6FFFD] + [\x70000 .. \x7FFFD] + [\x80000 .. \x8FFFD] + [\x90000 .. \x9FFFD] + [\xA0000 .. \xAFFFD] + [\xB0000 .. \xBFFFD] + [\xC0000 .. \xCFFFD] + [\xD0000 .. \xDFFFD] + [\xE1000 .. \xEFFFD]> (from the IRI specification) -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
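Whatever implementations end up doing with a character class that large, the set itself is compact as data. A Python sketch of the RFC 3987 ucschar ranges (the same ranges as the class quoted above), handy for sanity-checking membership:

```python
# ucschar per RFC 3987: three BMP slices, planes 1-13, and E1000-EFFFD.
UCSCHAR = ([(0xA0, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFEF)]
           + [(p << 16, (p << 16) + 0xFFFD) for p in range(1, 14)]
           + [(0xE1000, 0xEFFFD)])

def is_ucschar(ch):
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in UCSCHAR)

print(is_ucschar('é'))          # True: U+00E9 is in A0..D7FF
print(is_ucschar('a'))          # False: plain ASCII is excluded
print(is_ucschar('\U00010000')) # True: start of plane 1
```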
Re: a more useful srand (was Re: r30369 - docs/Perl6/Spec/S32-setting-library)
On Mon, Apr 12, 2010 at 1:55 PM, Larry Wall wrote: > On Mon, Apr 12, 2010 at 07:24:37PM +0200, Moritz Lenz wrote: > : > 1. do all implementations of Perl6 generate the same sequence, given > the > : > same initial seed. > : > : I don't think they should. If you want that, use confuse a RNG with a > : sequence generator that it is not. > > While I agree that the default should be non-reproducable, the > approach taken in Perl 5 is nice to the extent that if you *do* > seed the built-in RNG with a consistent value, you get a reproducable > result. And reproducable trees of random sequences can be generated > by controlling the seeds of each node in the tree. > I think that what this conversation is boiling down to is: an RNG is just a role that wraps an iterator factory and provides some passable defaults, to be implemented as the default class or classes for Perl's core. That's fine, but the idea of non-reproducible defaults worries me. If, by this, you mean that whatever seed is provided to "srand" is to be aggregated with another source (e.g. XORed with output from the system's entropy source), then I would love to hear from someone who has experience with the last 10 years of PRNG work who thinks that's not opening us up to some sort of strange edge-case risk. PRNGs are often misrepresented as frivolous, but as I'm sure you know from your work at JPL, high quality random sequences are much-prized, and any language that starts off with some poor assumptions will ultimately pay for it. Some good reading for recent work: http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/ http://eprint.iacr.org/2006/086 http://lcamtuf.coredump.cx/oldtcp/tcpseq.html http://www.avatar.se/python/crng/index.html -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
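The "reproducible trees of random sequences" Larry describes can be made concrete: derive each node's seed deterministically from its parent's, and give each node its own generator instance. A minimal Python illustration, with Python's Mersenne Twister standing in for whatever a Perl 6 implementation would actually use:

```python
import random

def child_rng(parent_seed, index):
    # Derive a per-node seed deterministically from the parent seed;
    # each node gets an independent generator instance, so reseeding
    # one node never perturbs its siblings.
    return random.Random(parent_seed * 1_000_003 + index)

run1 = [child_rng(42, i).random() for i in range(3)]
run2 = [child_rng(42, i).random() for i in range(3)]
print(run1 == run2)  # True: same seeds give the same sequences
```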
Re: underscores vs hyphens (was Re: A new era for Temporal)
On Mon, Apr 12, 2010 at 2:23 PM, Larry Wall wrote: > On Sun, Apr 11, 2010 at 03:47:18PM -0700, Darren Duncan wrote: > : > : I see that making underscores > : and hyphens to be equivalent is akin to having case-insensitive > : identifiers, where "Perl","PERL","perl" mean the same thing. Rather > : what I want is to be everything-sensitive, as AFAIK Perl 6 currently > : is; if something looks different, it is different. -- Darren Duncan > > ... > As for the horror of people having to memorize lists again or > Risk Using The Wrong Character...I'm afraid I just don't buy it. > Larry, I'm curious what you think of this example: A web page of Perl 6 documentation suggests that you should call time-local. Unfortunately, in the font that my browser uses, the height of that single stroke is ambiguous. Of course, we could have no sympathy and just say, "get a better font," but this problem will likely creep up over and over, would it not? I agree with you that this doesn't really help the person writing code from scratch, but that's not the same as a developer who is trying to interact with potentially dozens of libraries with various sources of documentation from comments to Web pages. I'd suggest the following in decreasing order of urgency: - Choose a single character (hyphen or underscore) to use in standard library code to separate the component words of an identifier (remember that underscore is only special in C-like code because it's standing in for space). - Never use dash versus underscore notationally (e.g. a-b indicates that the identifier is to be used one way vs a_b indicates otherwise) - Allow only one such character in any given identifier That last item rolls into a whole rant of mine against ambiguity in identifiers. 
Most often this stems from Unicode that puts the programmer in the position of having to have good enough font support to tell ambiguous names apart (and in cases like "Αpple" or "Рerl" or "Ρerl", you're just doomed regardless), but dashes and underscores are a good example of the same problem cropping up elsewhere. On the more general point, I really feel that no language should ever allow identifiers to mix Unicode blocks without strong reason. Specifically: - Underscore (or dash or whatever your notational separator is) should be the only exceptional character allowed in all identifiers - Identifiers should never include c̈ombining m̅arks - Upon scanning the first alpha character in an identifier, no further alpha characters should be allowed except as they come from the same code block or related supplemental block ("related" might be expanded to include first-class blocks in some cases to allow for combinations like Kanji (Chinese in Unicode) + Hiragana, etc.) - Upon scanning the first numeric character in an identifier, no further numeric characters should be allowed except as they come from the same code block (again, there might be some wiggle in some exceptional cases, but the goal is to avoid counting in more than one system at a time). Should all of these be hard errors? Perhaps not, but certainly they should at least yield warnings by default. PS: While I never finished the implementation of Sand, its simplistic guide to Unicode identifiers might be useful in illuminating what I'm describing above: http://www.ajs.com/ajswiki/Sand:_Syntax_and_Structure#Identifiers -- Aaron Sherman Email or GTalk: a...@ajs.com http://www.ajs.com/~ajs
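The "no mixing blocks" rule can be approximated cheaply: the first word of a character's Unicode name ("LATIN", "CYRILLIC", "GREEK", ...) serves as a crude script tag. This Python sketch is only a heuristic - not how any Perl 6 implementation works - but it does catch the "Рerl" homoglyph case above:

```python
import unicodedata

def scripts_used(identifier):
    # Crude script detection via the first word of each character's
    # Unicode name; underscores and digits are left out of the check.
    return {unicodedata.name(ch).split()[0]
            for ch in identifier if ch.isalpha()}

def looks_mixed(identifier):
    return len(scripts_used(identifier)) > 1

print(looks_mixed("Perl"))  # False: all LATIN
print(looks_mixed("Рerl"))  # True: CYRILLIC ER followed by LATIN erl
```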
Re: Parsing data
On Thu, Oct 8, 2009 at 12:57 AM, David Green wrote: > I agree that being able to parse data structure would be *extremely* useful. > (I think I posted a suggestion like that at one time, though I didn't > propose any syntax.) There is already a way to parse data -- Signatures, > but not with the full power that grammars can apply to text. I'm certain I'm not the first person to think of it, especially since the link that Moritz provided came so close. Just to get a sense of the scope of it, I've been playing around with the code tonight. Modifying PGE to handle the syntactic changes was trivial, but of course, it's the semantic shift of allowing the input to a regex to be a data structure other than a string that will take some digging around and re-tooling (many assumptions need to be re-visited like what "pos" means in the context of a multi-level datastructure match!) I'll have another go at that tomorrow and see how much work it's likely to be, but I'm still thinking this is something to push forward into 6.1, and not try to hold up any work on 6.0 for. I suppose if I want to be nice to others, I should come up with a patch against the STD as well, since there's now an active project using it to compile code.
Re: Parsing data
Sorry, I accidentally took the thread off-list. Re-posting some of my comments below: On Wed, Oct 7, 2009 at 6:50 PM, Moritz Lenz wrote: > Aaron Sherman wrote: >> One of the first things that's becoming obvious to me in playing with >> Rakudo's rules is that parsing strings isn't always what I'm going to >> want to do. The most common example of wanting to parse data that's >> not in string form is the YACC scenario where you want to have a >> function produce a stream of tokenized data that is then parsed into a >> more complex representation. In similar fashion there's transformers >> like TGE that take syntax trees and transform them into alternative >> representations. >> >> To that end, I'd like to suggest (for 6.1 or whatever comes after >> initial stability) an extension to rules: > > Did you read > http://perlcabal.org/syn/S05.html#Matching_against_non-strings already? (I went off and read that, and then replied to Moritz): OK, no. That proposal only does part of the work. It would suffice for something like the lexer/parser connectivity, but transforming complex data structures would be impossible. By contrast, what I've suggested would work for both cases. It also preserves the existing ~~ /b/ functionality that we have today, and it's not entirely clear to me that the proposal that you linked to does. So, to re-cap: :data turns on data mode which treats the input as an iterator and matches each atom in the regex against objects returned by the iterator (must be rewindable? or do we cache when it's not?) Then, inside the regex we use <^...> to match complex data such as: <^~ ... > - match digits in a single element (equiv to <,> \d+ <,> in the proposal you linked), with :data turned off <^{ ... 
}> - smart match the return value of the closure against current element <^::identifier> - Smart match a type against the current element <^[...]> - Descend the current element, which should be iterable itself and match with :data turned on <^ name> - Same as <^[ <name> ]> This should be powerful enough to match any arbitrarily nested set of iterable objects. I think it will be particularly useful against parse trees (and similar structures such as XML/HTML DOMs) and scanner productions, though users will probably find nearly infinite uses for it, much like original regular expressions.
Parsing data
One of the first things that's becoming obvious to me in playing with Rakudo's rules is that parsing strings isn't always what I'm going to want to do. The most common example of wanting to parse data that's not in string form is the YACC scenario where you want to have a function produce a stream of tokenized data that is then parsed into a more complex representation. In similar fashion there's transformers like TGE that take syntax trees and transform them into alternative representations. To that end, I'd like to suggest (for 6.1 or whatever comes after initial stability) an extension to rules: [ 'orange', 'apple', 'apple', 'pear', 'banana' ] ~~ rx :data { 'apple'+ 'pear' } Adding :data forces the match to proceed against the elements of the supplied array as a sequence, rather than as individual matches the way it behaves now. Each element of the array is matched by each atom in the expression. To support complex data (instead of matching all elements as fixed strings), a new matching syntax is proposed. The "current object" in what follows is the object in the input data which is currently being treated as an atom (e.g. an array element). It might be any kind of data such as a sub-array, number or string. <^...> matches against embedded, complex data. There are several forms depending on what comes after the ^: Forms that work on the current element of the input: ^{...} smart-matches current object against return value of closure ^~exp parses exp as a regex and matches as a string against the current object (disabling :data locally) ^::exp exp is an identifier and smart-matches on type Note that the second two forms can be implemented (though possibly not optimally) using the first. 
These forms treat the current element of the input as a sub-array and attempt to traverse it, leaving :data enabled: ^[exp] parses exp as a regex and matches against an array object ^ name (note space) identical to <^[ <name> ]> Example: This parses a binary operator tree: token undef { <^{undef}> } token op { < + - * / > } # works because the whole object is a one-character string token term { <^::Num> | <^~ \d+ > | <undef> } # number, string with digits or undef rule binoptree { <op> $<left> = [ <term> | <^ binoptree> ] $<right> = [ <term> | <^ binoptree> ] } [ '+', 5, [ '*', 6, 7 ] ] ~~ rx :data /<binoptree>/ Some notes: perhaps this should simply refer to iterable objects and not arrays? Is there a better way to unify the handling of matching against the current object vs matching against embedded structures? What about matching nested hashes? What I find phenomenal is that this requires so little change to the existing spec for rules. It's a really simple approach, but gives us the ability to start applying rules in all sorts of ways we never dreamed of before. I might even tackle trying to implement this instead of the parser library I was working on if there's some agreement that it makes sense and looks like the correct way to go about it
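Independent of the proposed <^...> syntax, the binoptree example can be mimicked in Python to show what :data matching over nested iterables amounts to - a recursive structural check, with plain predicates standing in for the <^::Num>, <^~ \d+ > and <^ binoptree> forms:

```python
import re

def is_term(x):
    # Stand-ins for <undef>, <^::Num>, and <^~ \d+ > respectively.
    return (x is None or isinstance(x, (int, float))
            or (isinstance(x, str) and bool(re.fullmatch(r'\d+', x))))

def is_binoptree(node):
    # Stand-in for: rule binoptree { <op> $<left>=[...] $<right>=[...] }
    if is_term(node):
        return True
    return (isinstance(node, list) and len(node) == 3
            and node[0] in ('+', '-', '*', '/')
            and is_binoptree(node[1]) and is_binoptree(node[2]))

print(is_binoptree(['+', 5, ['*', 6, 7]]))  # True
print(is_binoptree(['+', 5, ['%', 6]]))     # False
```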
Re: [perl #69194] rakudo 2009-08 and when with lists
On Sat, Sep 19, 2009 at 9:45 PM, David Green wrote: > On 2009-Sep-19, at 5:53 am, Solomon Foster wrote: > > The one thing that worries me about this is how :by fits into it all. >>rakudo: given 1.5 { when 1..2 { say 'between one and two' }; say >> 'not'; }; >>rakudo: given 1.5 { when Range.new(from => 1, to => 2, by => 1/3) { >> makes me very leery. I know :by isn't actually implemented yet, but what >> should it do here when it is? >> > > Exactly: 1.5 is "between" 1 and 2, but if you're counting by thirds, 1.5 is > not in the list(1..2 :by(1/3)). Sure, we can stipulate that this is simply > how the rules work, but people are still going to get confused. On the > other hand, we can get rid of the list/series/:by part of Ranges without > losing any power (move that capability to ... instead), and cut down on the > confusion. 1..2 is used abstractly to indicate a range, even though it's actually an iterator that will return two values. However, once you apply an explicit :by, I think you've gone past that abstraction, and it's no longer reasonable to expect that values that fall between your iterations will match.
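The 1.5 case can be made concrete outside Perl: an interval test and a membership test against the stepped-by-1/3 list give opposite answers, which is exactly the confusion at issue. A Python sketch using exact rationals so the step has no floating-point fuzz:

```python
from fractions import Fraction

def stepped(lo, hi, by):
    # Enumerate lo, lo+by, lo+2*by, ... up to hi, exactly.
    points, x = [], Fraction(lo)
    while x <= hi:
        points.append(x)
        x += by
    return points

in_interval = 1 <= 1.5 <= 2
in_stepped_list = Fraction(3, 2) in stepped(1, 2, Fraction(1, 3))
print(in_interval, in_stepped_list)  # True False
```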
Re: [perl #69194] rakudo 2009-08 and when with lists
Redirecting thread to language because I do agree that this is no longer a matter of a bug. On Fri, Sep 18, 2009 at 9:28 AM, Moritz Lenz via RT < perl6-bugs-follo...@perl.org> wrote: > On Thu Sep 17 08:53:59 2009, ajs wrote: > > This code behaves as expected, matching 2 or 3 in only one out of the > three > > cases: > > You say yourself that it behaves as expected, I don't see any bug. > > Yeah, I dropped a comma. That should have been "This code behaves as expected, matching 2 or 3, in only one out of the three cases." > The case 2..3 (Range) is pretty clear. S03/Smart Matching/ says about > Arrays: > > Any Array lists are comparable @$_ «===» X > > so it tries to interpret the LHS as a List and checks for element-wise > identity => False > I think I see where you're going, here: that ranges are explicitly called out in the spec for given (I haven't double-checked that, but I seem to recall that that's right). The problem is that we now have this rule (which is what caught me, here and made me think this was a bug): 2,3 constructs a list. 2..3 also constructs a list, unless it's in a given/when condition in which case it's just a range. That seems confusing. Is there any value at all in comparing a scalar to a list for identity by default? Why wouldn't we apply any() implicitly and get free consistency with the behavior of lists constructed with ..? Original code: my $a = 2; print "Test 1 (anon array): "; given $a { when [2,3] { say "Good" } default { say "Bad" } } print "Test 2 (..): "; given $a { when 2..3 { say "Good" } default { say "Bad" } } print "Test 3 (bare ,): "; given $a { when 2,3 { say "Good" } default { say "Bad" } }
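The implicit-any() suggestion is easy to state in any language: comparing a scalar against a whole list for identity is almost never what's wanted, while "does any element match" is. In Python terms:

```python
a = 2
identity_match = (a == [2, 3])             # scalar vs. whole list: False
any_match = any(a == x for x in [2, 3])    # implicit any() semantics: True
membership = a in [2, 3]                   # the idiom people reach for
print(identity_match, any_match, membership)  # False True True
```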
POD classes -- a suggestion
I'd really like to be able to assign a class to POD documentation. Here's an example of why: class Widget is Bauble #= A widget is a kind of bauble that can do stuff { has $.things; #= a collection of other stuff #==XXX{ This variable needs to be replaced for political reasons } } When extracting the documentation for this class, it should appear as such: class Widget base class: Bauble A widget is a kind of bauble that can do stuff Attributes: $.things (simple scalar) -- a collection of other stuff But when extracted with a flag requesting class XXX documentation, it should include the additional line: This variable needs to be replaced for political reasons This has many uses: - Keeping customer-visible and internal documentation in the same file - Allowing easy access to just the documentation bits that you might be interested in - Could be extended to allow for injecting documentation into other modules that are being extended, but in a way that allows access to the original documentation on its own - This might expose the implementation of features used to control debugging, warnings (e.g. the equivalent of "no strict", but with documentation as to why) and lint-like facilities - One of my usual gripes about doc systems is that they document elements of a program or library and not its function. Given this feature it would be easy to distinguish between the two and perhaps even require either or both depending on what's being parsed (e.g. a program might require only functional documentation where a library might require both functional and element-level docs).
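A toy extractor shows the mechanics of class-tagged documentation: plain #= docs are always kept, while class-tagged docs are kept only when that class is requested. Everything here - the simplified #=CLASS tag syntax, the function, the sample source - is illustrative Python, not proposed Perl 6 API:

```python
import re

SOURCE = '''\
class Widget:
    things = None  #= a collection of other stuff
    count = 0      #=XXX internal; replace for political reasons
'''

def extract_docs(source, include=()):
    # Keep untagged "#=" docs unconditionally; keep "#=CLASS" docs
    # only when CLASS is in the requested set.
    docs = []
    for m in re.finditer(r'#=(\w*)\s*(.+)', source):
        tag, text = m.group(1), m.group(2)
        if not tag or tag in include:
            docs.append(text)
    return docs

print(extract_docs(SOURCE))                   # customer-visible docs only
print(extract_docs(SOURCE, include={'XXX'}))  # internal docs included too
```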
Re: S26 - The Next Generation
I'm jumping in on an old conversation because I only just had time to catch up last night. I have a few questions that I think are probably still pertinent. On Sun, Aug 16, 2009 at 4:26 PM, Damian Conway wrote: > > Executive summary: > > * Pod is now just a set of specialized forms of Perl 6 comment > > * Hence it must always parsed using full Perl 6 grammar: perl6 -doc > > This one seems relatively obvious, so it's probably been proposed. I skimmed a few of the responses and didn't see it, but that means little, I'm sure. This makes me wonder about other languages. I've been known to use POD in strings within other languages that lack a facility for documenting a program as a facility rather than as a collection of elements (which is the javadoc et al. trap). Should there be an explicit way to step this down to just parsing the bits that are called out as pod? For example: #!/bin/sh #=notperl :leading<#> :trailing<\n> cd $1 #=head1 ... # ... #=cut Obviously causing leading #s to be stripped when evaluating the podishness of a section of the program, up to the next newline. Similarly a CDATA block in XML might specify (on its own line) #=notperl :lang :leading<< >> as the begin and end tokens of potentially valid POD sections. The evaluation of each identified section then being gated on the presence of a following = token. I can't think of a language that can't support POD in this way, but I'm sure someone will provide an example ;) Actually, in retrospect vanilla C89 might be problematic. I seem to remember that C9X introduces // so it could pull this off. I can imagine a messy solution in C using #define, but it would produce compile-time warnings in some compilers. Interestingly, this would have the side-benefit of making any program in any language into valid Perl code, given the appropriate notation at the start of the program... Kind of nifty if not strictly a practical benefit. [...] 
> * In addition to delimited, paragraphed, and abbreviated Pod blocks, > documentation can now be specified in a fourth form: > > my $declared_thing; #= Pod here until end of line > > sub declared_thing () { #=[ Pod here > until matching > closing bracket > ] >... > } > There is no explicit mention in the document as to what happens at the Perl level when the closing bracket is reached at a point that is not the end of a line (i.e. it is not followed by whitespace that contains a newline character). Here's an example: my $a #=[stop the presses!] = 4; I'm not sure that I even think this is a good idea (nor a bad one, for that matter), but the documentation does not make this clear. It seems likely that the expected behavior is for Perl to treat the # as the start of a comment, even though it encounters parsable pod thereafter, and to continue to process the remaining part of the line as a comment, however this brings multi-line bracketed POD into question: sub fly($like, $to, $spirit) #=[ time keeps on slippin' ] { # error - this brace is not considered code? ... } fly(:like('eagle'), :to('sea'), :spirit('carry me')) > > * This new declarator block form is special. During source code parsing, > normal Pod blocks are simply appended into an array attribute of > surrounding Comment object in the AST (which is just $=POD, at the > top level). However declarator blocks are *also* appended to the > .WHY attribute of the declarator at the start of the most recent > (or immediately following) line. > I'd very much like to establish that at default optimization levels for execution, this information is not guaranteed to be maintained past the creation of the AST. This allows optimizations which might place declared elements into types which cannot maintain additional data (e.g. a Parrot I-register). Perhaps in some cases we would want to provide such guarantees. I wouldn't be opposed to an explicit way to request such a guarantee. 
For example: sub junk($things) is documented #= junk happens { ... } Now, even if junk is inlined and optimized away, we guarantee that its documentation will continue to be stored in some way that can be retrieved. This might even prevent certain classes of optimizations, but that's implementation specific. > * Hence, they can be used to document code directly, which >documentation can then be pulled out by introspecting either >$=POD or the declarators themselves. Documented declarators >look like this: > Although it's something that could be added on after-the-fact, I think it's worth calling for this up-front: All of your comments about .WHY seem to indicate that it behaves recursively, walking the tree and pulling out any documentation for child nodes. That's fine, but there really should be a user-accessible and well defined way to limit that
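The "is documented" guarantee - documentation that survives even if the declarator itself is optimized away - can be modeled by recording the doc in a side table keyed by name, in addition to hanging it on the object. A hypothetical Python sketch (WHY and documented are invented names mirroring the proposal, not any real API):

```python
WHY = {}

def documented(doc):
    # Hypothetical "is documented": store the doc in a side table so
    # it remains retrievable even if the function object is discarded.
    def wrap(fn):
        fn.WHY = doc               # analogue of the .WHY attribute
        WHY[fn.__name__] = doc     # survives independently of fn
        return fn
    return wrap

@documented("junk happens")
def junk(things):
    ...

print(junk.WHY)      # junk happens
print(WHY['junk'])   # junk happens
```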
Re: Testing Perl 6 analog to Perl 5's tie.
On Sun, Aug 2, 2009 at 1:10 PM, Moritz Lenz wrote:
> Let's pick up this old mail before it gets completely warnocked ;-)
>
> For the record, this discussion only applies to scalar implementation
> types. For example for Arrays I expect things to work by overriding the
> method postcircumfix:<[ ]>.

Really? What about:

    my ImplementationType @foo;
    @foo = 1..Inf;
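[Editor's sketch: the overriding Moritz describes can be illustrated in modern Rakudo terms, where positional indexing is routed through the AT-POS hook rather than a postcircumfix:<[ ]> method; this is an analogue of the 2009 spec's mechanism, not the mechanism itself.]

```raku
# An array-like type that computes elements on demand, so even
# a conceptually infinite sequence needs no storage:
class Squares {
    method AT-POS($i) { $i * $i }   # indexing hook called by $obj[$i]
}

my $s = Squares.new;
say $s[7];   # 49
```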
Fwd: Re-thinking file test operations
Sorry, I sent this just to Mark. Wasn't my intention.

-- Forwarded message --
From: Aaron Sherman
Date: Fri, Jul 10, 2009 at 6:58 PM
Subject: Re: Re-thinking file test operations
To: "Mark J. Reed"

On Fri, Jul 10, 2009 at 5:31 PM, Mark J. Reed wrote:
> I'd rather see all those key types be separate classes, maybe
> subclasses of a generic KeyStr class.

In re-thinking it, we don't need to do either. It's already built in:

    $str does Path;
    if $str.e {
        say("$str exists");
    }

Nice and simple. All someone has to do is write the appropriate Path that knows what to do when it's added to a Str. I wonder if this works:

    if ($str but Path).e { ... }

> The question is how to specify them.
> I think stringish classes are common and useful enough to have
> special literal support available without having to customize the
> grammar. Maybe there's a registry of prefixes that can be put in
> front of the opening quote, like p'' for a pathname, or maybe you
> have to use a q operator with a modifier.

You've made contradictory statements there. Either you want to change the grammar to add new quoting styles (and then the argument ensues: is YOUR key type common enough to deserve a quoting semantic?) or you think that you shouldn't have to customize the grammar. I'm in favor of NOT customizing the grammar, but at the same time I readily admit that strings aren't always strings, and might have much more semantic baggage that it would be good to be able to associate with them easily.

> > On 7/10/09, Aaron Sherman wrote:
> > On Thu, Jul 9, 2009 at 6:22 PM, Moritz Lenz wrote:
> >
> >> $str.File.e # same, different names
> >
> > Brainstorming a bit here
> >
> > Str is a class that describes collections of characters (among some
> > other typographical constructs, yadda, yadda, Unicode, yadda).
> >
> > There is a commonly used special case, however, where my Str is not
> > just a Str. It is, instead, a key for an external datasource. Such
> > cases include:
> >
> > * URIs.
> > * Pathnames
> > * Usernames
> > * Variable names
> > * etc.
> >
> > It makes sense to handle these cases in some regular way, and to
> > provide those hooks via Str because it is relatively uniquely Str's
> > job to hold these things (counter-examples include UIDs).
> >
> > OK, so we have a need for some hookable interface on Str for
> > accessing external datasources by key. Let's call it "key".
> >
> >     $str = "/etc/aliases"
> >     $str.e;            # error, no such method
> >     $str.key(::File);
> >     $str.e;            # works fine
> >
> > There should probably be a few standard methods that are imported in
> > this way such as e, s, z and any other test operators that are
> > universal, so that this makes sense:
> >
> >     $user = get_user_name();
> >     $user.key(::Getpw);
> >     $user.e;           # user exists?
> >
> >     $url = "http://www.example.com/";
> >     $url.key(::URI);
> >     $url.s > 0;        # might do a HEAD request and return the size
> >
> > The rest might be more domain specific:
> >
> >     $mailbox = "Trash";
> >     $mailbox.key(::Mail::IMAP, $account_info);
> >     $mailbox.msgs > 1000;
> >
> > In this way you have not enforced the assumption that all strings are
> > pathnames, but rather that all strings might be used as keys.
> >
> > I suppose this even makes sense, though it's convoluted:
> >
> >     $hashkey = "Aaron";
> >     $hashkey.key(::Hash, %names);
> >     $hashkey.e
> >
> > The real beauty of this is that it can all be implemented without any
> > language syntax/grammar changes.
>
> --
> Sent from my mobile device
>
> Mark J. Reed
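[Editor's sketch: the `$str does Path` idea above can be shown concretely in modern Rakudo, where Str already carries an .IO coercer; the Path role and its .e method here are illustrative names standing in for the hypothetical mixin, not an existing API.]

```raku
# Hypothetical Path role delegating existence tests to the IO layer
role Path {
    method e { self.IO.e }
}

my $str = "/etc/aliases" but Path;   # mixin applied to this one value only
say "$str exists" if $str.e;
```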
Re: Re-thinking file test operations
On Thu, Jul 9, 2009 at 6:22 PM, Moritz Lenz wrote:
> > $str.File.e # same, different names

Brainstorming a bit here

Str is a class that describes collections of characters (among some other typographical constructs, yadda, yadda, Unicode, yadda).

There is a commonly used special case, however, where my Str is not just a Str. It is, instead, a key for an external datasource. Such cases include:

* URIs
* Pathnames
* Usernames
* Variable names
* etc.

It makes sense to handle these cases in some regular way, and to provide those hooks via Str because it is relatively uniquely Str's job to hold these things (counter-examples include UIDs).

OK, so we have a need for some hookable interface on Str for accessing external datasources by key. Let's call it "key".

    $str = "/etc/aliases"
    $str.e;            # error, no such method
    $str.key(::File);
    $str.e;            # works fine

There should probably be a few standard methods that are imported in this way such as e, s, z and any other test operators that are universal, so that this makes sense:

    $user = get_user_name();
    $user.key(::Getpw);
    $user.e;           # user exists?

    $url = "http://www.example.com/";
    $url.key(::URI);
    $url.s > 0;        # might do a HEAD request and return the size

The rest might be more domain specific:

    $mailbox = "Trash";
    $mailbox.key(::Mail::IMAP, $account_info);
    $mailbox.msgs > 1000;

In this way you have not enforced the assumption that all strings are pathnames, but rather that all strings might be used as keys.

I suppose this even makes sense, though it's convoluted:

    $hashkey = "Aaron";
    $hashkey.key(::Hash, %names);
    $hashkey.e

The real beauty of this is that it can all be implemented without any language syntax/grammar changes.
Re: Huffman's Log: svndate r27485
On Fri, Jul 10, 2009 at 3:32 PM, Mark J. Reed wrote:
> The clash between 'log' for 'logarithm' and 'log' for 'write to log
> file' is unfortunate, but since you have to define logging parameters
> somewhere anyway, I'm OK with having to call that sort of log as a
> method on a logger object instead of as a top-level sub.

It nonetheless leaves ambiguity. I think the right way to attack it is to have log() be an un-imported-by-default alias for the automatically imported function in both logging and math modules. Thus someone who is just too attached to log() can have it, but everyone else can get by with the imported-by-default name. So, for example:

    logbase($x, $base)  # log in the given base, no default
    log10($x)           # logbase($x, 10), log in base 10
    logn($x)            # logbase($x, $Math::e), log in base e
    log($x)             # unexported logbase($x, $base = $Math::e),
                        # log in base e by default

Most people will likely use logn and log10 most of the time, and these names are not unique to Perl.
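[Editor's sketch: the proposed family falls out of the change-of-base identity. `logbase` is the hypothetical name from the proposal; note that current Rakudo instead gives the built-in log an optional base parameter, log($x, $base).]

```raku
sub logbase($x, $base) { log($x) / log($base) }   # change-of-base identity
sub log10($x)          { logbase($x, 10) }
sub logn($x)           { log($x) }                # natural log, base e

say log10(1000).round;   # 3
```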
Re: Runtime role issues
Ovid wrote:
> The "intermediate class" solves the problem but it instantly suggests
> that we have a new "design pattern" we have to remember. Basically, if
> I can't lexically scope the additional behavior a role offers, I
> potentially need to remove the role or use the "intermediate class"
> pattern.

    my Dog $dog .= new;
    my $junkyard = $dog but Guard;

You probably don't need to touch the class, but a particular object. You can lexically scope changes to an object using but and my quite easily. If you really need a modified class, then I think this would do it, but I'm not sure if it works:

    my $junkyarddog = class is Dog does Guard {};
    my ::($junkyarddog) $spot .= new;
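[Editor's sketch: the object-level mixin being recommended, shown end-to-end; Dog and Guard are stand-ins defined here only to make the example self-contained.]

```raku
class Dog   { method bark  { "woof"    } }
role  Guard { method guard { "on duty" } }

my Dog $dog .= new;
my $junkyard = $dog but Guard;   # only this one object gains Guard

say $junkyard.guard;   # on duty
say $dog ~~ Guard;     # False - the original object is untouched
```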
P5's s[pat][repl] syntax is dead
@larry[0] wrote:
> Log: P5's s[pat][repl] syntax is dead, now use s[pat] = "repl"

Wow, I really missed this one! That's a pretty big thing to get my head around. Are embedded closures in the string handled correctly so that:

    s:g[\W] = qq{\\{$/}};

will do what I seem to be expecting it will? How will that be defined in the Perl6-based parser? Will macros be able to act as an LVALUE and modify their RVALUE in this way, or is this just some unholy magic in the parser?

    + s[pattern] = doit()
    + s[pattern] = eval doit()
    [...]
    +There is no syntactic sugar here, so in order to get deferred
    +evaluation of the replacement you must put it into a closure. The
    +syntactic sugar is provided only by the quotelike forms.
    [...]
    +This is not a normal assignment, since the right side is evaluated each
    +time the substitution matches (much like the pseudo-assignment to declarators
    +can happen at strange times). It is therefore treated as a "thunk", that is,
    +as if it has implicit curlies around it. In fact, it makes no sense at
    +all to say
    +
    +s[pattern] = { doit }

Please clarify "quotelike forms", since to my untrained eye, the above appeared to be contradictory at first (I read "quotelike forms" as s/// not s{...}). Very interesting.
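[Editor's sketch: the thunking behavior quoted above survives into modern Rakudo. The right-hand side of s[...] = ... is re-evaluated for each match, which a small example makes concrete.]

```raku
my $i = 0;
my $text = 'x x x';

# RHS is a thunk: ++$i runs once per match, not once up front
$text ~~ s:g[x] = ++$i;

say $text;   # 1 2 3
```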
Capturing subexpression numbering example
The example in S05 under "Subpattern numbering" isn't quite complex enough to give the reader a full understanding of the ramifications of the re-numbering that occurs with alternations, especially with respect to the combination of capturing and non-capturing subpatterns. I've written a small example and explanation to address this (attached as diff) based on an IRC conversation with fglock. If it's deemed correct, could this be included, please?

--- Rule.pod.orig	2006-10-10 17:26:39.0 -0400
+++ Rule.pod	2006-10-10 17:37:17.0 -0400
@@ -1956,6 +1956,21 @@
 C<(undef, undef, undef, undef, undef, undef,
 'every', 'green', 'BEM', 'devours', 'faces')>
 (as the same regex would in Perl 5).
 
+If non-capturing brackets are used, this can become more complex.
+Every time C<|> appears inside of non-capturing brackets, the subpattern
+index is returned to the index that it had when entering the
+brackets. When exiting the brackets, the next capturing subpattern
+will have an index one higher than the highest subpattern inside
+the non-capturing brackets. Here is such an example:
+
+    #            $0    $1  $2   $1    $3
+    $match = rx/ (a) [ (b) (c) | (d) ] (e) /;
+
+Notice that it is not the most recent C<$1> that determines
+the index of the C<(e)> subpattern, but the C<(c)> subpattern that
+incremented the index to C<$2>. Therefore C<(e)> has an index
+of C<$3>.
+
 =item *
 
 Note that it is still possible to mimic the monotonic Perl 5 capture
Updated: Re: Hash composers and code blocks
Aaron Sherman wrote: (updated based on followup conversations)

Proposal: A sigil followed by [...] is always a composer for that type.

    %[...] - Hash.
    @[...] - Array.
    &[...] - Code.
    |[...] - Capture. Identical to \(...).
    $[...] - Scalar. Like item(...), but forces copying even in argument lists

Added after: ::[Type: ...]

fglock pointed out that @(...), %(...) actually already do this. I was going to modify this proposal around that, but then I looked at the variations. I now contend that (...) has more of a "cast" semantic than a "compose" semantic. If we wish to combine the two, then we could, but that would require that &(...) take a block body, not an expression (again, not what &(...) does now, at all).

In the end, I still think the bracket forms are a wonderfully simple solution to the ambiguity problem, and I'm more convinced every time I look at this that the ambiguity needs a fix. What's more, having one syntax for composition of container and non-container types in free-form data structures is tremendously appealing. We've thrown out ${...}, so we could use that instead of brackets, but that's just one more shift key, and it doesn't seem to buy much. Am I wrong?

fglock also suggested that this might not be seen by the community as "looking like perl." I'm not so sure that's the case, since we already have @(...) and @{...}, but even if some do feel that way, I AM NOT proposing that we eliminate any currently correct behavior. I am only suggesting that we add "one more way to do it" for those of us who want to dodge the ambiguity. Surely, that's not a big request?

To sum up:

    (...) - cast expression ... to type implied by sigil
    [...] - composition of type implied by sigil

Nice and uniform, no?
Re: Hash composers and code blocks
Mark J. Reed wrote:
> On 10/5/06, Aaron Sherman <[EMAIL PROTECTED]> wrote:
>> Proposal: A sigil followed by [...] is always a composer for that type.
>>
>>     %[...] - Hash. Unicode: ⦃...⦄
>>     @[...] - Array. Unicode: [...]
>>     ? - Seq. Unicode: ⎣...⎤
>>     &[...] - Code. Unicode: ⦕...⦖
>>     |[...] - Capture. Identical to \(...). Unicode: ⦇...⦈
>>     $[...] - Scalar. Identical to item(value). Unicode: ⦋...⦌
>>     #[...] - A comment. Just seeing if you're paying attention ;)
>
> Are those supposed to be question marks up there (meaning "up for
> discussion"), or did something go awry in the email encoding (possibly
> on my end)?

There is one occurrence of ? in there (Seq has no sigil, and thus no [...] form). The rest are Unicode characters, and my headers did include:

    Content-Type: text/plain; charset=UTF-8; format=flowed

so I don't think there's a problem there... still, what Unicode characters are chosen (if any) is rather moot. The real issue is: do we want to have a disambiguated composer form, and if so, is [...] the right choice?
Re: Hash composers and code blocks
Aaron Sherman wrote:
> Proposal: A sigil followed by [...] is always a composer for that type.
>
>     %[...] - Hash. Unicode: ⦃...⦄
>     @[...] - Array. Unicode: [...]
>     ...

I left out ::, which is probably a mistake. Part of the elegance of this, IMHO, is that it behaves the same for all sigils. The body of :: should probably be a capture whose invocant (required) is a type name:

    ::[Foo: 1, 2, :x<3>, :y<4>]

Which is identical to:

    Foo.new(1, 2, :x<3>, :y<4>)

Unicode for that seems like overkill, but if we needed it, ⦗...⦘ would suffice. Thus:

    ⦗Foo: 1, 2, :x<3>, :y<4>⦘

That gives me the visual sense that something big and heavy is being created ;-)
Re: import collisions
Jonathan Lang wrote:
> What if I import two modules, both of which export a 'foo' method?

That's always fine unless they have exactly the same signature. In general, that's not going to happen, because the first parameter is created from the invocant. Thus:

    use HTML4;
    use Math::BigFloat;

should give me a:

    # A divide-by-string form. Who knows why.
    our Math::BigFloat multi method div(Math::BigFloat $self: Str $divisor)
        is export {...}

    # Generate an HTML block that contains self + new content
    our HTML4 multi method div(HTML4 $self: Str $more) is export {...}

When exported these do not conflict. Only plain multis would conflict:

    module A;
    our Str multi foo(Str $x) is export {...}

    module B;
    our Str multi foo(Str $x) is export {...}

> IMHO, it would be nice if this sort of situation was resolved in a
> manner similar to how role composition occurs: call such a conflict a
> fatal error, and provide an easy technique for eliminating such
> conflicts.

I think that should probably be a warning under "use warnings", and otherwise ignored. You asked for the modules in a particular order, and that's the way they were merged. Overlaps are bound to happen sometimes.

    use Foo (foo => 'bar');
    use Bar (:foo);

That's an advisory, of course, requesting that aliases be created, but how much work that does depends on how many signatures are defined for each name. You could still call Foo::foo, of course, but yeah, this makes fine sense.
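[Editor's sketch: how distinct signatures let two exported multis coexist can be shown with inline modules; the module names and bodies here are illustrative, not from the thread.]

```raku
module A { our multi sub foo(Str $x) is export { "A: $x" } }
module B { our multi sub foo(Int $x) is export { "B: $x" } }

import A;
import B;

# The two exported candidates merge into one multi,
# dispatched by signature rather than colliding:
say foo("hi");   # A: hi
say foo(42);     # B: 42
```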
Hash composers and code blocks
S04 now reads:

==
However, a hash composer may never occur at the end of a line. If the
parser sees anything that looks like a hash composer at the end of the
line, it fails with "closing hash curly may not terminate line" or some
such.

    my $hash = {
        1 => { 2 => 3, 4 => 5 },  # OK
        2 => { 6 => 7, 8 => 9 }   # ERROR
    };
==

I think this is a bit of a problem, since it leads to a number of "looks fine to the uninitiated" errors, and is likely output from code generators. In general, there's just no particularly good answer to "why can't I do that?"

So, why not provide an unambiguous form for all of the sigiled types, leaving {...} as the ambiguous half-cousin? (Note: I don't think I'm the first to suggest this.)

Proposal: A sigil followed by [...] is always a composer for that type.

    %[...] - Hash. Unicode: ⦃...⦄
    @[...] - Array. Unicode: [...]
    ? - Seq. Unicode: ⎣...⎤
    &[...] - Code. Unicode: ⦕...⦖
    |[...] - Capture. Identical to \(...). Unicode: ⦇...⦈
    $[...] - Scalar. Identical to item(value). Unicode: ⦋...⦌
    #[...] - A comment. Just seeing if you're paying attention ;)

So, construction of an anonymous data structure might now look like:

    my $hash = %[
        1 => %[ 2 => 3, 4 => 5 ],
        2 => @[ 6 => 7, 8 => 9 ],
        3 => &[ %[ 10 => 11, 12 => 13 ] ]
    ];

Which is also:

    my $hash = ⦃
        1 => ⦃ 2 => 3, 4 => 5 ⦄,
        2 => ⟦ 6 => 7, 8 => 9 ⟧,
        3 => ⦕ ⦃ a => 1, b => 2 ⦄ ⦖
    ⦄;

And there is exactly no ambiguity. You can always use the old {...} if you like:

    my $hash = {
        1 => { 2 => 3, 4 => 5 },
        2 => [ 6 => 7, 8 => 9 ],
        4 => { { a => 1, b => 2 }; }
    };

And you get whatever confusion you deserve (I don't even know if that would do what I think it should, to be honest). As always, I tend to prefer any solution that involves placing the disambiguating bits in the FRONT, rather than at the end. I would expect that compiler-writers feel similarly.
Re: "Don't tell me what I can't do!"
Trey Harris wrote:
> I read it as "yes, you *can* put strictures on the using code into a
> library, though I wouldn't do it and would say that any module that
> does so shouldn't be released on CPAN for general use. ..."

Hey, I have an idea. Let's write a module that enforces that!

Seriously, I think you're all getting way too wound up about this. No one is going to force you to eat your peas. ;)
Mailing list archive and index
I'm noodling around with the idea of creating an archive and index of all of the messages to the mailing list over the years, for purposes of quickly finding all of the messages that have definitive information on a given topic. Simply searching on Google or through my mail spool just doesn't cut it, since there's too much discussion and too little decision (I'm not calling it signal-to-noise, since that's somewhat pejorative, and I'm not trying to say the discussion is useless, just not usually what I'm looking for).

To that end, I've got a mockup of what I'm thinking of with a handful of Larry's messages in it (in MediaWiki):

http://www.ajs.com/perl6index/index.php/Perl.perl6.language

If people like it, then I'll write a tool that automatically populates the database, and the site will probably get its own hostname for future flexibility. Ultimately the categorization (which is the important part) will have to be a manual task, but it's not quite as daunting as one might think, given a MediaWiki that contains all of the messages.

Any thoughts? Here are some other starting points if you like:

Everything by Larry (currently everything):
http://www.ajs.com/perl6index/index.php/Category:Larry_Wall

Brainstorming:
http://www.ajs.com/perl6index/index.php/Category:Brainstorming

Last month:
http://www.ajs.com/perl6index/index.php/Category:September_2006

One other way to go would be to take all of the summaries and start with those. Then, each message could be a link from a summary. Then again, that could always be put in later, and finding the mapping between summaries and threads might be a pain, programmatically.
Re: Exceptions on hypers
Aaron Sherman wrote:
> Damian Conway wrote:
>> @bar».foo if $baz;
>
> That brought to mind the question that I've had for some time: how are
> exceptions going to work on hyper-operators? Will they kill the
> hyperoperation in-progress? e.g. what will $i be:

Corrected example follows (there were supposed to be 10 elements):

    my $i = 0;
    class A { method inci() { die if $i++ > 5 } }
    my @aoa = map {A.new}, 1..10;
    try {
        @aoa>>.inci;
    }
    say $i;

We now return you to your regularly scheduled question, already in progress: Is it even possible to know, or is it entirely dependent on the implementation? And what do multiple, successive dies within the same try block do?
Exceptions on hypers
Damian Conway wrote:
> @bar».foo if $baz;

That brought to mind the question that I've had for some time: how are exceptions going to work on hyper-operators? Will they kill the hyperoperation in-progress? e.g. what will $i be:

    my $i = 0;
    class A { method inci() { die if $i++ > 5 } }
    my Array of A @aoa;
    try {
        @aoa>>.inci;
    }
    say $i;

Is it even possible to know, or is it entirely dependent on the implementation? And what do multiple, successive dies within the same try block do?
Re: Nested statement modifiers.
Paul Seamons wrote:
>> It relates to some old problems in the early part of the
>> RFC/Apocalypse process, and the fact that:
>>
>>     say $_ for 1..10 for 1..10
>>
>> was ambiguous. The bottom line was that you needed to define your
>> parameter name for that to work, and defining a parameter name on a
>> modifier means that you have to parse the expression without knowing
>> what the parameters are, which is ugly in a very non-stylistic sense.
>
> I don't think that is ambiguous though.

It really is, and the very first question that everyone asks is: how do I get access to the outer loop variable? Which, of course, you cannot, for the reasons stated above. Let's get P6 out the door, and then discuss whether tiny details like this do or don't make sense.
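[Editor's sketch: the ambiguity disappears once each loop names its own parameter, which the block form makes mandatory; the inner loop then has well-defined access to the outer loop's variable.]

```raku
# Nested loops with explicit parameters - unambiguous, unlike
# the doubled statement-modifier form discussed above:
for 1..3 -> $outer {
    for 1..3 -> $inner {
        say "$outer,$inner";
    }
}
```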