Re: \x{123a 123b 123c}
HaloO,

Patrick R. Michaud wrote:
There's also sp, unless someone redefines the sp subrule. And in the general case that's a slightly more expensive mechanism to get a space (it involves at least a subrule lookup). Perhaps we could also create a visible meta sequence for it, in the same way that we have visible metas for \e, \f, \r, \t. But I have no idea what letter we might use there.

How about \x and \X respectively? Note the *space* after it :) I mean that much more seriously than it might sound, err, read. I hope the concept of unwritten things in the source being interesting values of void/undef always applies.

OTOH, I'm usually not saying anything in the area of the grammar subsystem, but I still try to wrap my brain around the underlying unified conceptual level where rules and methods or subs and macros are indistinguishable. So, please consider this a well-meaning question. And please forgive the syntax errors.

With something like

    # or token? perhaps even sub?
    macro x ( HexLiteral *[$char = 32, [EMAIL PROTECTED] ) is parsed( HexLiteral* ) {...}

and \ in match strings escaping out to the macro level when the circumfix match creator is invoked, I would expect

    m/ \x /;            # single space is required
    m/ \x20 /;          # same
    m/ {x} /;           # same?
    m/ \X /;            # any single char except space
    m/ \x\x\x /;        # exactly three spaces
    m/ \x[20,20,20] /;  # same, as proposed by Larry
    m/ \xy /;           # parse error 'y not a hex digit'
    m/ \x y /;          # one space then y

to insert verbatim, machine level chars into the match definition. In particular *no* lookup is compiled in. I would call \x the single character *exact* matcher and \X the *excluder*. BTW, the definition of the latter could just be

    X ::= !x;  # or automagically defined by up-casing and outer negation

if ? and ! play in the meta operator league.

I don't think I like this, but perhaps C becomes ?null and Cbecomes ' '? Seems like not enough visual distinction there...

I strongly agree. I would ask the moot question *how* the single space in / / is removed ---as leading, trailing or separating space---when the parser goes over it. But I would never expect the source space to make it into the compiled match code!
--
Re: Multidimensional argument list binding (*@;foo)
Hi, Luke Palmer wrote: On 11/20/05, Ingo Blechschmidt [EMAIL PROTECTED] wrote: sub foo (*@;AoA) { @;AoA } my @array1 = a b c; my @array2 = d e f; my @AoA = foo @array1, @array2; say [EMAIL PROTECTED]; # 2? 1 say [EMAIL PROTECTED]; # a b c? a b c d e f However, my @AoA = foo(@array1; @array2); # all of Ingo's predictions are now correct Hm. How is (*@;AoA) different from (Array [EMAIL PROTECTED]) then? (Assuming that foo(@a; @b) desugars to foo([EMAIL PROTECTED], [EMAIL PROTECTED]).) --Ingo
Re: statement_controlfoo() (was Re: lvalue reverse and array views)
Hi,

Rob Kinyon wrote:
On 11/20/05, Ingo Blechschmidt [EMAIL PROTECTED] wrote:
Yep. Also note that for is not a special magical construct in Perl 6, it's a simple subroutine (statement_control:for, with the signature ([EMAIL PROTECTED], Code *code)). (Of course, it'll usually be optimized.)

Example:

    {
        my sub statement_control:for ([EMAIL PROTECTED], Code *code) {
            map code, reverse @array;
        }
        for a b c - $item { say $item }  # c\nb\na\n
    }
    # for restored, as the modified for went out of scope:
    for a b c - $item { say $item }      # a\nb\nc\n

Is there a list of the statement control items that are implemented as such vs. implemented in another way?

statement_control:if, statement_control:unless, statement_control:for, statement_control:while, statement_control:until, and statement_control:loop come to my mind.

??!! is probably defined as

    sub ternary:?? !! ($cond, $then is lazy, $else is lazy) {
        if $cond { $then } else { $else }
    }

(Assuming that ternary is the correct grammatical category and is lazy DWIMs.)

Of course, the compiler is free to optimize these things if it can prove that runtime's statement_control:if is the same as the internal optimized statement_control:if.

--Ingo
Re: till (the flipflop operator, formerly ..)
Hi,

Larry Wall wrote:
On Sun, Nov 20, 2005 at 08:51:03PM +0100, Ingo Blechschmidt wrote:
: according to the new S03, till is the new name for the flipflop
: operator.

Presuming we can make it work out as an infix macro.

Ah, it's a macro. This clarifies things.

: Do the flipflop operators of subroutines maintain own
: per-invocation-of-the-sub states? I.e.:
:
:     sub foo (x) { x() till 0 }
:
:     foo { 0 };  # evaluates to a false value, of course
:
:     foo { 1 };  # evaluates to a true value, of course
:     foo { 0 };
:     # still true?
:     # (Argumentation: The flipflop is in the true state,
:     # so the LHS is not evaluated.)
:     # Or is it false?
:     # (Argumentation: The flipflop operator of the previous
:     # invocation is not the flipflop operator of the current
:     # invocation, so the return value is false.)

It's still true. Ignoring the E0 issue, the desugar of A till B is something like: [...]

Thanks very much, this code is very clear. :)

: Also, all operators can be called using the subroutine form (which
: is a very good thing), e.g.:
:
:     say infix:-(42, 19);  # 23
:
: Is this true for till as well?
:
:     say infix:till(LHS, RHS);

Probably not. Calling macros as functions is a bit of a problem.

Yep. (I assumed infix:till would be an ordinary subroutine.)

: But how would infix:till maintain the state then, as no explicit
: ID is passed to it? Does infix:till access an internal %states
: hash, using $CALLER::POSITION as keys?

That feels like a hack to me. I'd rather find a way of poking a real state variable into the caller's scope if we have to support that.

Agreed. The desugar you provided feels far more sane.

: Perl 5's flipflop operator appends E0 to the final sequence number
: in a range, allowing searches for /E/. My guess is that this is
: superseded by $sequence_number but
: this_is_the_endpoint_of_the_range (you get the idea). Correct?

I was just thinking that you'd use till^ if you wanted to exclude the endpoint. And ^till to exclude the beginning, and ^till^ to exclude both, just as with ..^, ^.., and ^..^.

Ok.

In fact, that's really my main motivation for wanting it to be infix. Otherwise it might as well be an ordinary flipflop() macro, or fromto().

Makes sense.

--Ingo
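The answer above ("It's still true") amounts to keeping one piece of state per call site rather than per invocation of the enclosing sub. The desugared code itself is elided in the message, so what follows is only a rough sketch of that behaviour, written in Python because the Perl 6 syntax under discussion was not yet runnable. The class name FlipFlop and the choice to test the right-hand side on the same call that turns the flipflop on (as Perl 5's two-dot form does) are assumptions for illustration, not the actual proposed desugar.

```python
class FlipFlop:
    """One instance stands in for one 'A till B' call site (hypothetical model)."""

    def __init__(self):
        # Persists across calls, like a state variable poked into the caller's scope.
        self.active = False

    def __call__(self, lhs, rhs):
        if not self.active:
            if not lhs():
                return False
            self.active = True
        # While in the true state, the left-hand side is not consulted at all.
        if rhs():
            # Endpoint reached: still true this time, then reset for the next call.
            self.active = False
        return True


_site = FlipFlop()  # shared by every invocation of foo


def foo(x):
    # Models: sub foo (x) { x() till 0 }
    return _site(x, lambda: False)


results = [foo(lambda: False), foo(lambda: True), foo(lambda: False)]
print(results)
```

The third call returns a true value even though its own block returns false, which is the first of the two readings offered in the question.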
Re: Multidimensional argument list binding (*@;foo)
On 11/21/05, Ingo Blechschmidt [EMAIL PROTECTED] wrote: Hm. How is (*@;AoA) different from (Array [EMAIL PROTECTED]) then? (Assuming that foo(@a; @b) desugars to foo([EMAIL PROTECTED], [EMAIL PROTECTED]).) Well, it's not at all, under that assumption. But that assumption is wrong. I think foo(@a; @b) doesn't have a sugar-free form (that is to say, it is the sugar-free form). Among things that desugar to it: @a == foo() == @b foo(@a) == @b @a == @b == foo() # maybe; don't remember To illustrate: sub foo ([EMAIL PROTECTED]) { say [EMAIL PROTECTED]; } sub bar (*@;a) { say +@;a; } foo(1,2,3; 4,5,6); # 6 bar(1,2,3; 4,5,6); # 2 That is, the regular [EMAIL PROTECTED] has concat semantics. However, I'd like to argue that it should have die semantics, for obvious reasons. Luke
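For readers who want to see the two counts side by side, here is a loose model in Python, used only because the Perl 6 binding rules were still in flux. It assumes each semicolon-separated group can be represented as a plain list, which is a simplification: elsewhere in the thread it is pointed out that the dimensions are specifically not an array of arrays. The function names mirror the example above; everything else is hypothetical.

```python
def foo(*dims):
    # Concat semantics: the dimension boundaries are thrown away
    # and the elements are counted as one flat list.
    flat = [item for dim in dims for item in dim]
    return len(flat)


def bar(*dims):
    # Multidimensional semantics: the boundaries are kept,
    # so the count is the number of dimensions.
    return len(dims)


print(foo([1, 2, 3], [4, 5, 6]))
print(bar([1, 2, 3], [4, 5, 6]))
```

Under "die semantics" foo would instead refuse any call that supplies more than one dimension.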
Re: statement_controlfoo() (was Re: lvalue reverse and array views)
On 11/21/05, Ingo Blechschmidt [EMAIL PROTECTED] wrote: Of course, the compiler is free to optimize these things if it can prove that runtime's statement_control:if is the same as the internal optimized statement_control:if. Which it definitely can't without some pragma. I wonder if they should be macros. (Macros that would by default expand to things that aren't expressible in Perl 6) Luke
Re: \x{123a 123b 123c}
On Mon, Nov 21, 2005 at 03:23:35PM +0100, TSa wrote: Patrick R. Michaud wrote: There's also sp, unless someone redefines the sp subrule. And in the general case that's a slightly more expensive mechanism to get a space (it involves at least a subrule lookup). Perhaps we could also create a visible meta sequence for it, in the same way that we have visible metas for \e, \f, \r, \t. But I have no idea what letter we might use there. How about \x and \X respectively? Note the *space* after it :) ... If we're going to do that, I'd think it would be \c and \C instead of \x and \X . I'm not really advocating this, I'm just commenting that in this case \c seems more natural than \x. Pm
apo5 (was: Re: \x{123a 123b 123c})
Larry Wall: Juerd: Ruud: Maybe \x{123a 123b 123c} is a nice alternative of \x{123a} \x{123b} \x{123c}. Hmm, very cute and friendly! Can we keep it, please? Please? Thanks for the support. We already have, from A5, \x[0a;0d], so you can supposedly say \x[123a;123b;123c] rereading apo5 / Found it in the old/new table on page 7. For me the semicolon is fine. I am using character names more and more, and between those, semicolons are less cluttery. Character names can contain spaces, but semicolons too? If not then \c[BEL; EXTENDED ARABIC-INDIC DIGIT ZERO] would be possible, but maybe better not, or more like \c['BEL'; 'EXTENDED ARABIC-INDIC DIGIT ZERO'] or even \c('BEL', 'EXTENDED ARABIC-INDIC DIGIT ZERO'). Something else: The '^' could be used for both the ultimate start- and end-of-string. This frees the '$'. There is still the '$$' that matches before embedded newlines, and since '^^' matches after those newlines, the '^^' and '$$' can only be unified to '^^' if it is one-width inside a string, so is like '[$$\n^^]' (or just '\n') there. At start- and end-of-string the '^^' can still be a zero-width match. I am not sure about greedy (meaning to try one-width first) or non-greedy. Example: '^[(\N*)^^]*^' to capture all lines, clean of newlines. Not a lot clearer than '^[(\N*)\n*]*$', but freeing the '$' and '$$' might be worth it. mess about '^^+', '^+^' and '^*^' (bats!) removed -- Affijn, Ruud Gewoon is een tijger.
Re: \x{123a 123b 123c}
On Sun, Nov 20, 2005 at 10:27:17AM -0600, Patrick R. Michaud wrote: : On Sat, Nov 19, 2005 at 06:32:17PM -0800, Larry Wall wrote: : On Sun, Nov 20, 2005 at 01:26:21AM +0100, Juerd wrote: : : Ruud H.G. van Tol skribis 2005-11-20 1:19 (+0100): : : Maybe : : \x{123a 123b 123c} : : is a nice alternative of : : \x{123a} \x{123b} \x{123c}. : : We already have, from A5, \x[0a;0d], so you can supposedly say : \x[123a;123b;123c] : : Hmm, I hadn't caught that particular syntax in A05. AFAIK it's not : in S05, so I should probably add it, or whatever syntax we end up : adopting. Yes. : (BTW, we haven't announced it on p6l yet, but there's a new version of : S05 available.) Indeed, there are new versions of most of the S's. People who want the latest should use svn.perl.org, which also makes it easy to do diff listings with svn or svk. : [...] : But I see that the semicolon is rather cluttery, mainly because it's : too tall. I'm not sure going all the way to space is good, but we : might have : \x[123a,123b,123c] : just to get a little visual space along with the separator. : : Just to verify, with this syntax would we expect : : \x[123a,123b,123c]+ : : to be the same as : : [\x123a \x123b \x123c]+ : : and not \x123a \x123b \x123c+ ? Yes. I think the rule interpretation of \x is that it is a sequence to be considered a single character regardless of its context. Certainly the square brackets we've mandated would tend to read as grouping anyway. Of course, the main point of the \x[a,b,c] notation is to allow interpolation of sequences of hex characters into ordinary strings, and those don't care about abstract character boundaries. : It occurs to me that we didn't spec whether character classes ignore : whitespace. They probably should, just so you can chunk things: : : / [ a..z A..Z 0..9 _ ] / : : Then the question arises about whether [ \ ] is an escaped space : or a backslash, or illegal : : I vote that it's an escaped space. 
A backslash is nearly always \\ : (or should be imho). : : But if we make it match a backslash : or illegal, then the minimal space matcher becomes \x20, I think, : unless you graduate to \s. On the other hand, if we make it match : a space, people aren't going to read that way unless they're pretty : sophisticated... : : There's also sp, unless someone redefines the sp subrule. But you can't use sp in a character class. Well, that is, unless you write it: +[ a..z ]+sp or some such. Maybe that's good enough. : And in the general case that's a slightly more expensive mechanism : to get a space (it involves at least a subrule lookup). Perhaps : we could also create a visible meta sequence for it, in the same : way that we have visible metas for \e, \f, \r, \t. But I have : no idea what letter we might use there. Something to be said for \_ in that regard. : I don't think I like this, but perhaps C becomes ?null : and Cbecomes ' '? Seems like not enough visual distinction : there... _ maybe. I'm good with being ?null, and , being element boundary when matching lists. But I'd like to reserve for delimiting what is returned by $, the string officially matched: foo bar baz ~~ /:w foo \w+ baz/ say $/; # foo bar baz say $;# bar Or possibly foo bar baz ~~ /:w foo \w+ baz/ but that should probably mean whatever foo bar baz ~~ /:w foo « \w+ » baz/ eventually means. Which I haven't the foggiest. But we should probably reserve the brackets on general principle's sake, just because brackets are so scarce. I dunno. If «...» in ordinary code does shell quoting, maybe «...» in rules does filename globbing or some such. I can see some issues with anchoring semantics. Makes more sense on a string as a whole, but maybe can anchor on element boundaries if used on a list of filenames. I suppose one could even go as far as rule jpeg :i « *.jp{e,}g » or whatever the right glob syntax is. Larry
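As a rough illustration of what interpolating \x[123a,123b,123c] into an ordinary string would involve, here is a small Python sketch (Python only because the proposed syntax had no implementation to test against). The function name expand_hex_lists is made up, and accepting both "," and ";" as separators is an assumption, since the thread had not settled on one.

```python
import re

# \x[ hex (sep hex)* ] where sep is a comma or a semicolon (assumed).
_HEX_LIST = re.compile(r'\\x\[([0-9A-Fa-f]+(?:\s*[,;]\s*[0-9A-Fa-f]+)*)\]')


def expand_hex_lists(text):
    """Replace each \\x[a,b,c] group with the characters it names."""

    def repl(match):
        codes = re.split(r'\s*[,;]\s*', match.group(1))
        return ''.join(chr(int(code, 16)) for code in codes)

    return _HEX_LIST.sub(repl, text)


print(repr(expand_hex_lists(r'a\x[41,42;43]z')))
print(repr(expand_hex_lists(r'\x[0a;0d]')))
```

This covers only the string-interpolation side; treating the whole group as a single quantifiable unit inside a rule, as in \x[123a,123b,123c]+, is a separate matter for the rule parser.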
Re: apo5 (was: Re: \x{123a 123b 123c})
On Mon, Nov 21, 2005 at 05:49:59PM +0100, Ruud H.G. van Tol wrote:
: Larry Wall:
: Juerd:
: Ruud:
:
: Maybe
:     \x{123a 123b 123c}
: is a nice alternative of
:     \x{123a} \x{123b} \x{123c}.
:
: Hmm, very cute and friendly! Can we keep it, please? Please?
:
: Thanks for the support.

Hey, this ain't exactly a popularity contest here... :-)

: We already have, from A5, \x[0a;0d], so you can supposedly say
: \x[123a;123b;123c]
:
: rereading apo5 /
: Found it in the old/new table on page 7. For me the semicolon is fine.

The fact that you say page 7 leads me to guess that you're reading it from perl.com. That's going to be the most out-of-date version. Better would be

    dev.perl.org    one day latency but html-ified
    svn.perl.org    up to the minute but only in pod

In particular, the Apocalypses have little [Update:] sections that are supposed to alert you to things that have changed since the Apo was written. (Though some of those are a little out of date right now too--I'm just working my way through A12 again.)

: I am using character names more and more, and between those, semicolons
: are less cluttery. Character names can contain spaces, but semicolons
: too? If not then
: \c[BEL; EXTENDED ARABIC-INDIC DIGIT ZERO] would be possible, but maybe
: better not, or more like
: \c['BEL'; 'EXTENDED ARABIC-INDIC DIGIT ZERO'] or even
: \c('BEL', 'EXTENDED ARABIC-INDIC DIGIT ZERO').

None of the current names contain either semicolon or comma, so I expect they're avoiding them by policy.

: Something else:
: The '^' could be used for both the ultimate start- and end-of-string.
: This frees the '$'.

I think this is one of those aspects of regex culture that is too entrenched to remove. Besides, you have to be able to distinguish s/^/foo/ from s/$/foo/.

: There is still the '$$' that matches before embedded newlines, and since
: '^^' matches after those newlines, the '^^' and '$$' can only be unified
: to '^^' if it is one-width inside a string, so is like '[$$\n^^]' (or
: just '\n') there.

But then if you use it within a capture, you get an extra newline you probably don't want.

: At start- and end-of-string the '^^' can still be a zero-width match.
: I am not sure about greedy (meaning to try one-width first) or
: non-greedy.
:
: Example: '^[(\N*)^^]*^' to capture all lines, clean of newlines.
: Not a lot clearer than '^[(\N*)\n*]*$', but freeing the '$' and '$$'
: might be worth it.

I don't think it's any clearer. In fact, I find all the ^'s there are a little too visually confusing and contextual.

Larry
Re: \x{123a 123b 123c}
On Mon, Nov 21, 2005 at 09:02:57AM -0800, Larry Wall wrote: : But I'd like to reserve for delimiting what is returned by $, : the string officially matched: : : foo bar baz ~~ /:w foo \w+ baz/ : say $/; # foo bar baz : say $; # bar Though it occurs to me that there's another possible interpretation, culturally speaking. The overloading of \b has always bothered me, plus the fact that \b can't distinguish which kind of word boundary without additional context. In regex culture, we have the \...\ word matcher, and maybe that devolves to isolated ... in rules. We could still use ... to capture $, which I was leaning toward anyway just for visibility reasons, since the two ends could be quite far apart. And file globbing could just be :glob or some such if we really need to embed it in rules. Larry
Re: statement_controlfoo() (was Re: lvalue reverse and array views)
HaloO,

Luke Palmer wrote:
On 11/21/05, Ingo Blechschmidt [EMAIL PROTECTED] wrote:
Of course, the compiler is free to optimize these things if it can prove that runtime's statement_control:if is the same as the internal optimized statement_control:if.

Which it definitely can't without some pragma.

Isn't the question just 'when'? I think at the latest it could be optimized JIT before the first execution, or so. The relevant AST branch stays for later eval calls which in turn branch off the surrounding module's version from within the running system such that the scope calling the eval sees the new version. And this in turn might be optimized and found unchanged in its optimized form. Sort of code morphing of really first class code. Everything else makes closures second class ;)
--
Re: statement_controlfoo() (was Re: lvalue reverse and array views)
On Mon, Nov 21, 2005 at 03:51:19PM +, Luke Palmer wrote: : On 11/21/05, Ingo Blechschmidt [EMAIL PROTECTED] wrote: : Of course, the compiler is free to optimize these things if it can prove : that runtime's statement_control:if is the same as the internal : optimized statement_control:if. : : Which it definitely can't without some pragma. But remember that on some level or other, all declarations function as pragmas. So the absence of a redeclaration of if could be taken as a kind of pragma, if we require control redefinition to be lexically scoped, which we probably should. : I wonder if they should be macros. (Macros that would by default : expand to things that aren't expressible in Perl 6) Which is another way of saying that control redefinitions should be lexically scoped, since macros are required to do lexically scoped syntax modification unless they're Preluditudinous. Another issue in if optimization is whether the blocks in fact do anything blockish that have to be scoped to the block. This is a determination that Perl 5 makes when it's compiling blocks. It's basically an attribute that migrates up the tree from the leaves, which are mostly true, but anyone in the block can falsify the attribute for the block as a whole. Arguably, when you use ??!! and friends, it should also be doing such analysis on the lazy bits and telling you that your my is badly scoped if it's in conditional code. That also catches my $x = 0 if rand 2; Larry
Re: statement_controlfoo() (was Re: lvalue reverse and array views)
On Mon, Nov 21, 2005 at 10:45:56AM -0800, Larry Wall wrote: : Another issue in if optimization is whether the blocks in fact do : anything blockish that have to be scoped to the block. This is a : determination that Perl 5 makes when it's compiling blocks. It's : basically an attribute that migrates up the tree from the leaves, which : are mostly true, but anyone in the block can falsify the attribute : for the block as a whole. Actually, I said that backwards. It starts out false and gets truified if anyone says Yes, we gotta have a block around us. Larry
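The corrected description (the attribute starts out false and is made true if anything inside asks for a block) can be pictured as a bottom-up walk over the syntax tree. The Python sketch below is a toy model, not Perl 5's actual op-tree code: the Node class, the node kinds, and the set SCOPED_KINDS are all hypothetical, and the rule that a nested block absorbs its own declarations is an assumption about how such an analysis would normally be bounded.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str
    children: list = field(default_factory=list)


# Hypothetical set of constructs that force a real scope around them.
SCOPED_KINDS = {"my"}


def needs_block(block):
    """True if anything directly inside this block requires a real scope."""

    def scan(node):
        if node.kind == "block":
            # A nested block owns whatever is declared inside it (assumed).
            return False
        if node.kind in SCOPED_KINDS:
            return True
        return any(scan(child) for child in node.children)

    # Starts out false; any() flips it to true on the first demand.
    return any(scan(child) for child in block.children)


plain = Node("block", [Node("say", [Node("const")])])
scoped = Node("block", [Node("assign", [Node("my"), Node("const")])])
nested = Node("block", [Node("if", [Node("block", [Node("my")])])])

print(needs_block(plain), needs_block(scoped), needs_block(nested))
```

A block for which the answer stays false is a candidate for having its scope entry and exit optimized away.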
Re: Multidimensional argument list binding (*@;foo)
Hi, Luke Palmer wrote: On 11/21/05, Ingo Blechschmidt [EMAIL PROTECTED] wrote: Hm. How is (*@;AoA) different from (Array [EMAIL PROTECTED]) then? (Assuming that foo(@a; @b) desugars to foo([EMAIL PROTECTED], [EMAIL PROTECTED]).) Well, it's not at all, under that assumption. But that assumption is wrong. Aha! FYI, I got that interpretation from r6628 of S09 [1]: The following two constructs are structurally indistinguishable: (0..10; 1,2,4; 3) ([0..10], [1,2,3,4], [3]) I think foo(@a; @b) doesn't have a sugar-free form (that is to say, it is the sugar-free form). Among things that desugar to it: @a == foo() == @b foo(@a) == @b @a == @b == foo() # maybe; don't remember To illustrate: sub foo ([EMAIL PROTECTED]) { say [EMAIL PROTECTED]; } sub bar (*@;a) { say +@;a; } foo(1,2,3; 4,5,6); # 6 bar(1,2,3; 4,5,6); # 2 That is, the regular [EMAIL PROTECTED] has concat semantics. However, I'd like to argue that it should have die semantics, for obvious reasons. Just to clarify -- only ; with *@;a should have die semantics, , with *@;a should continue to work, right? (If so, I agree.) Could you provide some more examples with ;, please? In particular, what are the results of the following expressions? (42; 23) (@a; @b) (@a; @b)[0] (@a; @b)[0][0] ((42;23); (17;19)) ((@a;@b); (@c;@d)) *(42; 23) *(@a; @b) ( (42; 23), 19) (*(42; 23), 19) [42; 23] [EMAIL PROTECTED]; @b] Thanks very much, --Ingo [1] http://svn.perl.org/perl6/doc/trunk/design/syn/S09.pod /The semicolon operator
Re: apo5
Larry Wall: Ruud H.G. van Tol: dev.perl.org one day latency but html-ified svn.perl.org up to the minute but only in pod Thanks, much better. Can't say that I haven't been there before. There is a [[:alpha:][:digit:] and a [[:alpha:][:digit]] on the A5-page. The '^' could be used for both the ultimate start- and end-of-string. This frees the '$'. I think this is one of those aspects of regex culture that is too entrenched to remove. Yes, I have experienced that with some of my procmail-recipes that use '^' to match embedded newlines. In procmail the '^^' matches begin- or end-of-string. Both a '^' and a '$' can be used to match a real or putative newline. Some people replaced my '^'s with '$'s. OK, everybody can stop reading here, no serious attempts below. Within C++, there is a much smaller and cleaner language struggling to get out, which would ... have been an unimportant cult language. (Bjarne Stroustrup, The Design and Evolution of C++). Besides, you have to be able to distinguish s/^/foo/ from s/$/foo/. 's/$/foo/' becomes 's/after .*/foo/' g There is still the '$$' that matches before embedded newlines, and since '^^' matches after those newlines, the '^^' and '$$' can only be unified to '^^' if it is one-width inside a string, so is like '[$$\n^^]' (or just '\n') there. But then if you use it within a capture, you get an extra newline you probably don't want. Place the ^^ outside the (). I wasn't sure about the default for the greediness of '^^' at begin- or end-of-string, I guess non-greediness can be arranged with a trailing '?'. At start- and end-of-string the '^^' can still be a zero-width match. I am not sure about greedy (meaning to try one-width first) or non-greedy. Example: '^[(\N*)^^]*^' to capture all lines, clean of newlines. Not a lot clearer than '^[(\N*)\n*]*$', but freeing the '$' and '$$' might be worth it. I don't think it's any clearer. Pardon my Dutch, I didn't find it clearer either (but, might be worth it). 
In fact, I find all the ^'s there are a little too visually confusing and contextual.

    /^           # BoS
      [          # start of non-capturing group
        (\N*)    # capture a substring of non-newlines
        ^^       # newline or EoS
      ]*         # end of non-capturing group, repeat
    ^/x          # EoS

As I just said, I am used to '^^' as start- and end-of-buffer, and '^' as matching a real or putative newline, because of procmail.

--
Grtz, Ruud
Re: statement_controlfoo() (was Re: lvalue reverse and array views)
On 11/21/05, TSa [EMAIL PROTECTED] wrote:
HaloO, Luke Palmer wrote:
On 11/21/05, Ingo Blechschmidt [EMAIL PROTECTED] wrote:
Of course, the compiler is free to optimize these things if it can prove that runtime's statement_control:if is the same as the internal optimized statement_control:if.

Which it definitely can't without some pragma.

Isn't the question just 'when'? I think at the latest it could be optimized JIT before the first execution, or so. The relevant AST branch stays for later eval calls which in turn branch off the surrounding module's version from within the running system such that the scope calling the eval sees the new version. And this in turn might be optimized and found unchanged in its optimized form. Sort of code morphing of really first class code. Everything else makes closures second class ;)

This is very close to a proposal I made to the ruby-dev mailing list (which was Warnocked). I proposed a very basic engine that would work with the parser/lexer to determine what action to take instead of using the huge case statements that are the heart of both P5 and Ruby. It would look something like:

    TOKEN:
    while ( my $token = get_next_token(params) ) {
        # Try the longest prefix first.
        for my $length ( reverse 1 .. length($token) ) {
            if ( my $actions = find_actions( substr( $token, 0, $length ) ) ) {
                # The innermost scope's action is on top of the stack.
                $actions->[-1]->( params, if necessary );
                next TOKEN;
            }
        }
        throw SyntaxError;
    }

The for-loop + substr() would be to handle longest-token-first rules. So, '...' is correctly recognized instead of being handled as '..' and '.'. The key would be that the $actions arrayref would get push'ed/pop'ed as you enter/leave a given lexical scope.

Obviously, this could be optimized to an extremely large degree, but it -should- work.

Rob
Re: Multidimensional argument list binding (*@;foo)
On Sun, Nov 20, 2005 at 09:11:33PM +0100, Ingo Blechschmidt wrote:
: Also, is specifying other, non-slurpy arguments prior to a slurpy
: @;multidim_arglist legal?

Yes, though we have to be careful about what happens when we bind the entire first dimension and then get a == boundary. That's probably not intended to produce an empty first dimension. On the other hand, maybe it just falls out of the policy that .[] is a null dimension unless you actually put something there, and maybe that extends to .[;stuff] and even .[stuff].

: E.g.:
:
:     sub bar ($normal_var, @;AoA) {...}
:     bar 42, @array1, @array2;
:     # $normal_var is 42,
:     # @AoA is ([EMAIL PROTECTED], [EMAIL PROTECTED])
:     # Correct?

No, these are specifically not AoA.

: The existence of a @array variable does not imply the existence of a
: @;array variable, right?

I think it probably does, or should. @;array is sugar for something like [;[EMAIL PROTECTED], presuming that .specs clumps its slices into single iterators (which it doesn't), and also presuming that you could use [;] in a declarative context (which you can't). So it's more like [;[EMAIL PROTECTED], which is an array of slice generators each of which is a sublist of iterators. It's probably not an array of arrays internally, but just a list of specs with some of them marked as starting a new dimension.

We originally were modeling the multidimension stuff on AoA, but we kept getting tangled up in intentional vs unintentional brackets. We need to be able to support the userland flat view of an array and still be able to get at its specs anyway, so this is basically trying to handle multislices/multidims/multipipes with the same underlying mechanism.

Larry
Re: Multidimensional argument list binding (*@;foo)
On Mon, Nov 21, 2005 at 03:48:30PM +, Luke Palmer wrote: : To illustrate: : : sub foo ([EMAIL PROTECTED]) { : say [EMAIL PROTECTED]; : } : sub bar (*@;a) { : say +@;a; : } : foo(1,2,3; 4,5,6); # 6 : bar(1,2,3; 4,5,6); # 2 : : That is, the regular [EMAIL PROTECTED] has concat semantics. However, I'd like to : argue that it should have die semantics, for obvious reasons. Well, that can be argued both ways. The Unix shells get along very well with default concat semantics, thank you: (echo foo; echo bar; echo baz) | grep a And it's rather Perlish to give you a level of flattening for free when it comes to lists. And I'd like to be able to distinguish: my @foo := gather { for @whatever { take .generate(); } } from my @;foo := gather { for @whatever { take .generate(); } } though I think maybe I'm arguing that the ; there is just documentation if @;foo and @foo are really the same variable, and it's the differing usage in rvalue context that desugars @;foo to [;]foo.dims. Larry
Re: Multidimensional argument list binding (*@;foo)
On Mon, Nov 21, 2005 at 07:49:16PM +0100, Ingo Blechschmidt wrote: : Aha! FYI, I got that interpretation from r6628 of S09 [1]: : The following two constructs are structurally indistinguishable: : : (0..10; 1,2,4; 3) : ([0..10], [1,2,3,4], [3]) Sorry, started revising that one a couple days ago and got sidetracked... Larry
Re: statement_controlfoo() (was Re: lvalue reverse and array views)
On Mon, Nov 21, 2005 at 02:05:31PM -0500, Rob Kinyon wrote:
: This is very close to a proposal I made to the ruby-dev mailing list
: (which was Warnocked). I proposed a very basic engine that would work
: with the parser/lexer to determine what action to take instead of
: using the huge case statements that are the heart of both P5 and Ruby.
: It would look something like:
:
:     TOKEN:
:     while ( my $token = get_next_token(params) ) {
:         for my $length ( reverse 1 .. length($token) ) {
:             if ( my $actions = find_actions( substr( $token, 0, $length ) ) ) {
:                 $actions->[-1]->( params, if necessary );
:                 next TOKEN;
:             }
:         }
:         throw SyntaxError;
:     }
:
: The for-loop + substr() would be to handle longest-token-first rules.
: So, '...' is correctly recognized instead of being handled as '..' and '.'.
: The key would be that the $actions arrayref would get push'ed/pop'ed
: as you enter/leave a given lexical scope.
:
: Obviously, this could be optimized to an extremely large degree, but
: it -should- work.

Let's see, where did I put my stash of generic quotes? Ah, there it is.

    Those who do not understand XXX are doomed to reinvent it, poorly.
    ~~ s/XXX/the Perl 6 grammar engine/;

In particular, you've just reinvented the magic hash semantics mandated by P6 rules, except that P6 has some hope of optimizing the lookups to a cached trie or whatever.

Larry
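For anyone following along, the idea being compared here (longest-token-first lookup, with the handler for each token kept on a per-scope stack) can be sketched in a few lines. The version below is in Python rather than Perl so that it can be run as-is; ScopedActions, tokenize, and the handler labels are invented names, and it uses a plain dictionary probed with decreasing prefix lengths instead of the cached trie mentioned above, so it should be read as a naive model of the proposal rather than of the Perl 6 grammar engine.

```python
class ScopedActions:
    """Token table where each token keeps a stack of handlers (hypothetical)."""

    def __init__(self):
        self._stacks = {}  # token text -> handlers, innermost scope last

    def push(self, token, handler):
        # Called when a lexical scope (re)defines a token.
        self._stacks.setdefault(token, []).append(handler)

    def pop(self, token):
        # Called when that lexical scope is left.
        stack = self._stacks[token]
        stack.pop()
        if not stack:
            del self._stacks[token]

    def longest_match(self, text, pos=0):
        longest = max((len(tok) for tok in self._stacks), default=0)
        # Try the longest candidate prefix first, then shorter ones.
        for length in range(min(longest, len(text) - pos), 0, -1):
            candidate = text[pos:pos + length]
            if candidate in self._stacks:
                return candidate, self._stacks[candidate][-1]
        raise SyntaxError(f"no token matches at position {pos}")


def tokenize(table, text):
    out, pos = [], 0
    while pos < len(text):
        token, handler = table.longest_match(text, pos)
        out.append(handler(token))
        pos += len(token)
    return out


table = ScopedActions()
table.push(".", lambda t: ("dot", t))
table.push("..", lambda t: ("range", t))
table.push("...", lambda t: ("yada", t))

print(tokenize(table, "....."))

table.push("..", lambda t: ("custom", t))  # entering a scope that redefines '..'
print(tokenize(table, ".."))
table.pop("..")                            # leaving that scope restores the old meaning
print(tokenize(table, ".."))
```

A trie would avoid re-slicing the input for every candidate length, which is presumably the kind of optimization the reply has in mind.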
Re: apo5
On Mon, Nov 21, 2005 at 07:57:59PM +0100, Ruud H.G. van Tol wrote: : There is a [[:alpha:][:digit:] and a [[:alpha:][:digit]] on the : A5-page. Hmm, well, thanks--I went to fix it and I see Patrick beat me to the fix. But in one of the updates, it says: +[Update: Actually, that's now written C +alpha+digit , avoiding +the mistaken impression entirely.] And it occurs to me that we could probably allow alpha+digit there since there's no ambiguity what alpha means, and we're already claiming the next character after the opening word to decide how to process the rest of the text inside angles. Even if someone writes alpha + digit that would fail under the current policy of treating + digit as rule, since you can't start a rule with +. Unfortunately, though, identchar - digit would be ambiguous, and/or wrong. Could allow whitespace there if we picked an explicit this is rule character. Did we remove this is string? If so, we could swipe the colon: after: --help Could put back this is string with explicit quotes: after '--help' but that doesn't save much over after('--help') which is partly why we removed this is string in the first place. Larry
Ponie Inquiry
All: Back in the summer of 2003, Fotango offered financial support for Ponie development for 2 years. Nicholas took up the development hat after Arthur, but things are awfully quiet. Since summer 2005 has come and gone, I wonder if funding has been extended. I know that Nicholas opened up the repository to the public http://use.perl.org/~nicholas/journal/24649 but is anyone else working on the project? With the excitement of Perl6, Parrot, and Pugs I wonder if Ponie is being neglected. Inquiring minds want to know. Cheers, Joshua Gatcomb a.k.a. Limbic~Region
Re: apo5
Larry Wall: in one of the updates, it says: +[Update: Actually, that's now written C +alpha+digit , avoiding +the mistaken impression entirely.] In dev's A05.html I only found: [Update: That must now be written +alpha+digit, or it will be mistaken for «alphadigit», which doesn't work too well.]. I see those character classes as infinite sort-of-binary masks, so alpha|digit looks right to me. Idem [_] | alpha | digit !Swedish, with left-to-right application. (I don't oversee the consequences.) -- Grtz, Ruud (sober.u is on the loose)
Re: statement_controlfoo() (was Re: lvalue reverse and array views)
On Mon, Nov 21, 2005 at 11:43:21AM -0800, Larry Wall wrote: : Let's see, where did I put my stash of generic quotes? I would like to publicly apologize for my remarks, which were far too harsh for the circumstances. I can only plead that I was trying to be far too clever, and not thinking about how it would come across. No, to be perfectly honest, it was more culpable than that. I had a niggling feeling I was being naughty, and I ignored it. Shame on me. I will try to pay better attention to my conscience in the future. Larry
Re: apo5
Patrick R. Michaud:
>> 's/$/foo/' becomes 's/<after .*>/foo/'
> Uh, no, because <after> is still a zero width assertion. :-)

That's why I chose it. It is not at the end-of-string?

    perl5 -e '$_="abc"; s/(?<=...)/x/; print'
    perl5 -e '$_="abc"; s/(?!.)/x/; print'

    's/<!before .>/foo/'

--
Grtz, Ruud
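For readers who want to try zero-width assertions like the ones in the one-liners above, here is a small Python sketch using the standard `re` module. It is an analogy rather than a transcription: Python only supports fixed-width lookbehind, and the variable names are chosen just for this illustration.

```python
import re

subject = "abc"

# Lookahead (?=...): true where three characters follow, which is only the start.
at_start = re.sub(r"(?=...)", "x", subject, count=1)

# Negative lookahead (?!.): true where no character follows, which is only the end.
at_end = re.sub(r"(?!.)", "x", subject, count=1)

# Fixed-width lookbehind (?<=...): true where three characters precede,
# which for a three-character subject is also only the end.
after_three = re.sub(r"(?<=...)", "x", subject, count=1)

print(at_start)     # xabc
print(at_end)       # abcx
print(after_three)  # abcx
```

None of the three patterns consumes any text; each one only decides at which position the replacement is inserted.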
Re: apo5
Larry Wall skribis 2005-11-21 12:08 (-0800): Unfortunately, though, identchar - digit would be ambiguous, and/or wrong. Well, we could of course change - to mean -1 or fewer, as + means +1 or more... :D Juerd -- http://convolution.nl/maak_juerd_blij.html http://convolution.nl/make_juerd_happy.html http://convolution.nl/gajigu_juerd_n.html
Re: apo5
Patrick R. Michaud:
> Ruud H.G. van Tol:
>> Patrick R. Michaud:
>>> Ruud H.G. van Tol:
>>>> 's/$/foo/' becomes 's/<after .*>/foo/'
>>> Uh, no, because <after> is still a zero width assertion. :-)
>> That's why I chose it. It is not at the end-of-string?
> Because .* matches "", /<after .*>/ would be true at every position in the string, including the beginning, and this is where "foo" would be substituted.

I expected greediness, also because <after .*?> could behave non-greedy. Just like:

    s/(.*)/$1foo/
    s/(.*?)/$1foo/

OK, so 's/<!before .>/foo/' it must be. But why does <after .*> behave non-greedy?

--
Grtz, Ruud
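The contrast between the two capturing substitutions mentioned above can be reproduced with Python's `re` module. This is a sketch of the Perl 5 style behaviour only, not of Perl 6 rules; `count=1` mimics a substitution without the global flag.

```python
import re

subject = "abc"

# Greedy (.*) consumes the whole string first, so "foo" lands at the end.
greedy = re.sub(r"(.*)", r"\1foo", subject, count=1)

# Non-greedy (.*?) is satisfied by the empty string at position 0,
# so "foo" lands at the beginning.
lazy = re.sub(r"(.*?)", r"\1foo", subject, count=1)

print(greedy)  # abcfoo
print(lazy)    # fooabc
```

Here greediness matters because the capture consumes text and so determines where the match ends. A zero-width assertion consumes nothing, which is the distinction the thread is circling around.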
Perl 6 Summary for 2005-11-14 through 2005-11-21
Perl 6 Summary for 2005-11-14 through 2005-11-21

All~

Welcome to another Perl 6 Summary. The attentive among you may notice that this one is on time. I am not sure how that happened, but we will try and keep it up. On a complete side note, I think there should be a Perl guild of some sort on World of Warcraft. It should probably be horde if there is, both because I hate the alliance and because it fits better.

Perl 6 Language

As usual for Pugs, most development continued off list.

http://xrl.us/iipt

Too Lazy?

Luke Palmer posted a problem he was having with Pugs. Warnock applies (which likely means it was made into a test and fixed).

http://xrl.us/iipu

Assigning to Named Subrules

Jerry Gay had a question about the semantics of assigning to named subrules in PGE. Patrick explained that it created an array of capture objects.

http://xrl.us/iipv

Keyed Access to Match Objects

Jerry Gay was having trouble with keyed access to match objects. After some discussion he implemented the keyed routine he needed and threatened to implement a few more.

http://xrl.us/iipw

PGE Now compregs

Patrick announced that PGE was now a better citizen in the Parrot world, using compreg to locate the compiler instead of find_global.

http://xrl.us/iipx

Parrot

I am going to get an English muffin. More in a moment... much better. Peanut butter is a wonderful thing. Where was I?

Character Classes Done

Jerry Gay wondered if the TODO about strings and character classes was still open. Patrick said it was resolved and should be closed.

http://xrl.us/iipy

rx_grammar.pl Progress?

Jerry Gay wondered if rx_grammar.pl had seen any work lately. Warnock applies.

http://xrl.us/iipz

N Registers No Longer Get Whacked

Leo, thanks to his new calling scheme, closed an RT ticket from Dec 2004.

http://xrl.us/iip2

Report SVN Revision in parrotbug?

Jerry Gay resurrected an old ticket wondering whether to add a revision field to RT tickets.

http://xrl.us/iip3

Making Parrot Potable

Florian Ragwitz was having trouble drinking Parrot so he wants to expend some effort to make it more potable. Apparently it does not get drunk so well by many machines in Debian's build farms and he would like to fix it. When he asked how best to do his work (so as not to upset too many), Chip suggested a local SVK mirror. Hopefully after he is done even more people will be able to enjoy drinking the Parrot kool-aid.

http://xrl.us/iip4

pbc_merge Requires LINK_DYNAMIC

Nick Glencross provided a patch fixing pbc_merge on HP-UX. François Perrad noted that it was also a problem on Win32. Jonathan Worthington explained that he was aware of the problem and that the dependency on the dynamic libraries would soon be removed.

http://xrl.us/iip5

Compilable Option

Will Coleda wants a -c option which will only tell you if the code is compilable for Parrot.

http://xrl.us/iip6

Clerihewsiwhatsit?

Inspired by Piers's inspiration from his name, Roger Browne wrote a Clerihew. Piers and Roger scare me.

http://xrl.us/iip7

Debug Segments

There was much discussion about what sort of interface to expose to HLL for debug segments. It looks like something good will come out of it all.

http://xrl.us/iip8

Amber for Parrot version 0.3.1

Announced, Roger Browne displaying Amber 0.3.1 aroun': this latest version, magic cookie, is more than just a rookie.

http://xrl.us/iip9

t/library/streams.t Failing

Patrick is still having trouble with t/library/streams.t. It sounds like he would appreciate help, but Warnock applies.

http://xrl.us/iiqa

PGE::Glob Issues

Will Coleda spotted a problem with PGE::Glob. Patrick fixed it.

http://xrl.us/iiqb

find_word_boundary Unneeded

Patrick posted his explanation of why find_word_boundary was an unneeded opcode. To that end he posted a patch updating t/op/string_cs.t. Warnock applies to both thoughts.

http://xrl.us/iiqc
http://xrl.us/iiqd

Coroutines Trample Scratchpads

Nick Glencross noted that coroutine_3.pasm was trampling some memory. Leo said that scratchpads were on their way out. Nick wondered if the ticket should be closed now, or when this is fixed. I vote that we not close tickets until the problem is gone, but Warnock applies.

http://xrl.us/iiqe

MD5 Broken

Chip noticed that MD5 was horribly broken recently. He decided that Parrot should avoid it in favor of the SHA-2 family and maybe Whirlpool. If you are a crypto dork, you have your job cut out for you.

http://xrl.us/iiqf

Joshua Hoblitt joyously closed an RT ticket about removing $(MAKE_C).

http://xrl.us/iiqg

inconsistent dll linkage

Jerry Gay announced that the last MSVC 7.1
Re: apo5
Patrick R. Michaud:
> Ruud H.G. van Tol:
>>>>>> 's/$/foo/' becomes 's/<after .*>/foo/'
>>>>> Uh, no, because <after> is still a zero width assertion. :-)
>>>> That's why I chose it. It is not at the end-of-string?
>>> Because .* matches "", /<after .*>/ would be true at every position in the string, including the beginning, and this is where "foo" would be substituted.
>> I expected greediness, also because <after .*?> could behave non-greedy.
>> ...
>> But why does <after .*> behave non-greedy?
> I think you may be misreading what <after .*> does -- it's a lookbehind assertion.

No, I was no longer misreading it, I was questioning its rationale. I wondered what would be lost if the construct would behave more like 's/(.*)/$1foo/'. Sorry for not making that more explicit. I was still getting rid of the '$'. And monitoring the outbreak of sober.u.

> The greediness of the .* subpattern in <after .*> doesn't affect things at all -- <after .*> is still a zero-width assertion.

There is a zero-width 'slot' before (and after) each character in the pattern string. As a zero-width assertion, '<after .*>' has no sense, no 'self', since it can't move the match position to another slot.

In '<after ab*>', the 'b*' means nothing.
In '<after ab+>', the '+' means nothing.
In '<after .*a>', the '.*' means nothing.

Unless the meaning of '<after .*a>' would be changed to: try the last 'a' first.

--
Grtz, Ruud
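The "slot" picture above can be checked with a mirror-image experiment in Python. Python's `re` module only allows fixed-width lookbehind, so this sketch uses lookahead instead and assumes the argument carries over symmetrically. It shows that a quantifier inside a zero-width assertion does not change which slots the assertion accepts.

```python
import re

# A lookahead wrapping .* succeeds in every slot, greedy or not,
# because .* can always match the empty string.
slots_greedy = [m.start() for m in re.finditer(r"(?=.*)", "abc")]
slots_lazy = [m.start() for m in re.finditer(r"(?=.*?)", "abc")]

# A trailing .* inside the assertion adds nothing: both patterns accept
# exactly the slots that are immediately followed by "a".
with_tail = [m.start() for m in re.finditer(r"(?=a.*)", "banana")]
without_tail = [m.start() for m in re.finditer(r"(?=a)", "banana")]

print(slots_greedy)  # [0, 1, 2, 3]
print(slots_lazy)    # [0, 1, 2, 3]
print(with_tail)     # [1, 3, 5]
print(without_tail)  # [1, 3, 5]
```

Which slot a substitution actually uses is then decided by the engine's left-to-right scan, not by anything inside the assertion.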
dis-junctive patterns
In pugs, r7961:

    my @pats = /1/, /2/;
    say "MATCH" if 1 ~~ any @pats; # MATCH
    say "MATCH" if 0 ~~ any @pats; # no match

So far so good. But:

    my $junc = any @pats;
    say "MATCH" if 1 ~~ $junc; # no match
    say "MATCH" if 0 ~~ $junc; # no match

Bug? Feature?

--
Gaal Yahas [EMAIL PROTECTED] http://gaal.livejournal.com/
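One reading of the report is that both forms are expected to give the same answer. A minimal Python sketch of that expectation follows, assuming a hypothetical `AnyOf` class as a stand-in for an `any` junction of patterns. It says nothing about how Pugs actually implements junctions or smartmatch.

```python
import re
from typing import Iterable, Pattern


class AnyOf:
    """Hypothetical stand-in for an any() junction over regex patterns."""

    def __init__(self, patterns: Iterable[Pattern[str]]) -> None:
        self.patterns = tuple(patterns)

    def smartmatch(self, value: object) -> bool:
        """True if the stringified value matches at least one pattern."""
        text = str(value)
        return any(p.search(text) is not None for p in self.patterns)


pats = [re.compile("1"), re.compile("2")]

# Junction built inline at the point of matching.
inline_hit = AnyOf(pats).smartmatch(1)
inline_miss = AnyOf(pats).smartmatch(0)

# Junction stored in a variable first, then matched.
junc = AnyOf(pats)
stored_hit = junc.smartmatch(1)
stored_miss = junc.smartmatch(0)
```

In this sketch storing the junction in a variable changes nothing, so `inline_hit` and `stored_hit` agree, as do the two misses.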