Re: \x{123a 123b 123c}

2005-11-21 Thread TSa

HaloO,

Patrick R. Michaud wrote:

There's also <sp>, unless someone redefines the <sp> subrule.
And in the general case that's a slightly more expensive mechanism 
to get a space (it involves at least a subrule lookup).  Perhaps 
we could also create a visible meta sequence for it, in the same 
way that we have visible metas for \e, \f, \r, \t.  But I have 
no idea what letter we might use there.


How about \x and \X respectively? Note the *space* after it :)
I mean that much more seriously than it might sound, err, read.
I hope the concept of unwritten things in the source being
interesting values of void/undef applies always.

OTOH, I'm usually not saying anything in the area of the grammar
subsystem, but I still try to wrap my brain around the underlying
unified conceptual level where rules and methods or subs and macros
are indistinguishable. So, please consider this a well-meaning
question. And please forgive the syntax errors.

With something like

   # or token? perhaps even sub?
   macro   x ( HexLiteral *[$char = 32, [EMAIL PROTECTED] )
   is parsed( HexLiteral* )
   {...}

and \ in match strings escaping out to the macro level when
the circumfix match creator is invoked, I would expect

   m/  \x   /;  # single space is required
   m/  \x20 /;  # same
   m/ {x} /;  # same?
   m/  \X   /;  # any single char except space
   m/  \x\x\x   /;  # exactly three spaces
   m/  \x[20,20,20] /;  # same, as proposed by Larry
   m/  \xy  /;  # parse error 'y not a hex digit'
   m/  \x y /;  # one space then y

to insert verbatim, machine level chars into the match definition.
In particular *no* lookup is compiled in.

I would call \x the single character *exact* matcher and \X
the *excluder*. BTW, the definition of the latter could just be

   X ::= !x; # or automagically defined by up-casing and outer negation

if ? and ! play in the meta operator league.


I don't think I like this, but perhaps  C   becomes ?null 
and Cbecomes ' '?  Seems like not enough visual distinction
there...


I strongly agree. I would ask the moot question *how* the single space
in / / is removed ---as leading, trailing or separating space---when the
parser goes over it. But I would never expect the source space to make it
into the compiled match code!
--


Re: Multidimensional argument list binding (*@;foo)

2005-11-21 Thread Ingo Blechschmidt
Hi,

Luke Palmer wrote:
 On 11/20/05, Ingo Blechschmidt [EMAIL PROTECTED] wrote:
 sub foo (*@;AoA) { @;AoA }

 my @array1 = <a b c>;
 my @array2 = <d e f>;

 my @AoA = foo @array1, @array2;
 say [EMAIL PROTECTED]; # 2?
 
 1
 
 say [EMAIL PROTECTED];  # a b c?
 
 a b c d e f
 
 However,
 
 my @AoA = foo(@array1; @array2);
 # all of Ingo's predictions are now correct

Hm. How is (*@;AoA) different from (Array [EMAIL PROTECTED]) then? (Assuming that
foo(@a; @b) desugars to foo([EMAIL PROTECTED], [EMAIL PROTECTED]).)


--Ingo



Re: statement_control<foo>() (was Re: lvalue reverse and array views)

2005-11-21 Thread Ingo Blechschmidt
Hi,

Rob Kinyon wrote:
 On 11/20/05, Ingo Blechschmidt [EMAIL PROTECTED] wrote:
 Yep. Also note that "for" is not a special magical construct in Perl
 6, it's a simple subroutine (statement_control:<for>, with the
 signature ([EMAIL PROTECTED], Code *code)). (Of course, it'll usually be
 optimized.)

 Example:

 {
     my sub statement_control:<for> ([EMAIL PROTECTED], Code *code) {
         map code, reverse @array;
     }

     for <a b c> -> $item { say $item }
     # c\nb\na\n
 }

 # for restored, as the modified for went out of scope:
 for <a b c> -> $item { say $item }
 # a\nb\nc\n
 
 Is there a list of the statement control items that are implemented as
 such vs. implemented in another way?

statement_control:<if>,
statement_control:<unless>,
statement_control:<for>,
statement_control:<while>,
statement_control:<until>, and
statement_control:<loop>

come to my mind.

??!! is probably defined as

sub ternary:<?? !!> ($cond, $then is lazy, $else is lazy) {
    if $cond { $then } else { $else }
}

(Assuming that ternary is the correct grammatical category and is
lazy DWIMs.)

Of course, the compiler is free to optimize these things if it can prove
that runtime's statement_control:<if> is the same as the internal
optimized statement_control:<if>.


--Ingo



Re: till (the flipflop operator, formerly ..)

2005-11-21 Thread Ingo Blechschmidt
Hi,

Larry Wall wrote:
 On Sun, Nov 20, 2005 at 08:51:03PM +0100, Ingo Blechschmidt wrote:
 : according to the new S03, till is the new name for the flipflop
 : operator.
 
 Presuming we can make it work out as an infix macro.

Ah, it's a macro. This clarifies things.

 : Do the flipflop operators of subroutines maintain own
 : per-invocation-of-the-sub states? I.e.:
 : 
 : sub foo (x) { x() till 0 }
 : 
 : foo { 0 };  # evaluates to a false value, of course
 : 
 : foo { 1 };  # evaluates to a true value, of course
 : foo { 0 };
 : # still true?
 : #   (Argumentation: The flipflop is in the true state,
 : #   so the LHS is not evaluated.)
 : # Or is it false?
 : #   (Argumentation: The flipflop operator of the previous
 : #   invocation is not the flipflop operator of the current
 : #   invocation, so the return value is false.)
 
 It's still true.  Ignoring the E0 issue, the desugar of A till B
 is something like:
[...]

Thanks very much, this code is very clear. :)

 : Also, all operators can be called using the subroutine form (which
 : is a very good thing), e.g.:
 : 
 : say infix:<->(42, 19);  # 23
 : 
 : Is this true for till as well?
 : 
 : say infix:<till>(LHS, RHS);
 
 Probably not.  Calling macros as functions is a bit of a problem.

Yep. (I assumed infix:<till> would be an ordinary subroutine.)

 : But how would infix:<till> maintain the state then, as no explicit
 : ID is passed to it? Does infix:<till> access an internal %states
 : hash, using $CALLER::POSITION as keys?
 
 That feels like a hack to me.  I'd rather find a way of poking a real
 state variable into the caller's scope if we have to support that.

Agreed. The desugar you provided feels far more sane.

 : Perl 5's flipflop operator appends E0 to the final sequence number
 : in a range, allowing searches for /E/. My guess is that this is
 : superseded by $sequence_number but
 : this_is_the_endpoint_of_the_range (you get the idea). Correct?
 
 I was just thinking that you'd use till^ if you wanted to exclude the
 endpoint.  And ^till to exclude the beginning, and ^till^ to exclude
 both, just as with ..^, ^.., and ^..^.

Ok.

 In fact, that's really my main motivation for wanting it to be infix.
 Otherwise it might as well be an ordinary flipflop() macro, or
 fromto().

Makes sense.


--Ingo



Re: Multidimensional argument list binding (*@;foo)

2005-11-21 Thread Luke Palmer
On 11/21/05, Ingo Blechschmidt [EMAIL PROTECTED] wrote:
 Hm. How is (*@;AoA) different from (Array [EMAIL PROTECTED]) then? (Assuming that
 foo(@a; @b) desugars to foo([EMAIL PROTECTED], [EMAIL PROTECTED]).)

Well, it's not at all, under that assumption.  But that assumption is
wrong.  I think foo(@a; @b) doesn't have a sugar-free form (that is to
say, it is the sugar-free form).  Among things that desugar to it:

@a ==> foo() <== @b
foo(@a) <== @b
@a ==> @b ==> foo()   # maybe; don't remember

To illustrate:

sub foo ([EMAIL PROTECTED]) {
say [EMAIL PROTECTED];
}
sub bar (*@;a) {
say +@;a;
}
foo(1,2,3; 4,5,6);   # 6
bar(1,2,3; 4,5,6);   # 2

That is, the regular [EMAIL PROTECTED] has concat semantics.  However, I'd like to
argue that it should have die semantics, for obvious reasons.
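
The two bindings illustrated above can also be sketched outside Perl 6.
The Python below is only an illustration, assuming each ;-separated
group arrives as its own sequence; bind_flat and bind_multidim are
hypothetical names and not part of any Perl 6 API.

```python
def bind_flat(*dimensions):
    """Concat semantics: the ';' boundaries are thrown away."""
    return [item for dim in dimensions for item in dim]


def bind_multidim(*dimensions):
    """Multidimensional semantics: one entry per ';'-separated group."""
    return [list(dim) for dim in dimensions]


flat = bind_flat((1, 2, 3), (4, 5, 6))       # like foo(1,2,3; 4,5,6)
multi = bind_multidim((1, 2, 3), (4, 5, 6))  # like bar(1,2,3; 4,5,6)
print(len(flat))   # 6
print(len(multi))  # 2
```

"Die semantics" would correspond to bind_flat raising an error whenever
it is handed more than one group.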

Luke


Re: statement_control<foo>() (was Re: lvalue reverse and array views)

2005-11-21 Thread Luke Palmer
On 11/21/05, Ingo Blechschmidt [EMAIL PROTECTED] wrote:
 Of course, the compiler is free to optimize these things if it can prove
 that runtime's statement_control:<if> is the same as the internal
 optimized statement_control:<if>.

Which it definitely can't without some pragma.

I wonder if they should be macros.  (Macros that would by default
expand to things that aren't expressible in Perl 6)

Luke


Re: \x{123a 123b 123c}

2005-11-21 Thread Patrick R. Michaud
On Mon, Nov 21, 2005 at 03:23:35PM +0100, TSa wrote:
 Patrick R. Michaud wrote:
 There's also <sp>, unless someone redefines the <sp> subrule.
 And in the general case that's a slightly more expensive mechanism 
 to get a space (it involves at least a subrule lookup).  Perhaps 
 we could also create a visible meta sequence for it, in the same 
 way that we have visible metas for \e, \f, \r, \t.  But I have 
 no idea what letter we might use there.
 
 How about \x and \X respectively? Note the *space* after it :)
 ...

If we're going to do that, I'd think it would be \c  and \C  
instead of \x  and \X .  I'm not really advocating this,
I'm just commenting that in this case \c seems more natural 
than \x.

Pm


apo5 (was: Re: \x{123a 123b 123c})

2005-11-21 Thread Ruud H.G. van Tol
Larry Wall:
 Juerd:
 Ruud:

 Maybe
 \x{123a 123b 123c}
 is a nice alternative of
 \x{123a} \x{123b} \x{123c}.

 Hmm, very cute and friendly! Can we keep it, please? Please?

Thanks for the support.


 We already have, from A5, \x[0a;0d], so you can supposedly say
 \x[123a;123b;123c]

rereading apo5 /
Found it in the old/new table on page 7. For me the semicolon is fine.

I am using character names more and more, and between those, semicolons
are less cluttery. Character names can contain spaces, but semicolons
too? If not then
\c[BEL; EXTENDED ARABIC-INDIC DIGIT ZERO] would be possible, but maybe
better not, or more like
\c['BEL'; 'EXTENDED ARABIC-INDIC DIGIT ZERO'] or even
\c('BEL', 'EXTENDED ARABIC-INDIC DIGIT ZERO').



Something else:
The '^' could be used for both the ultimate start- and end-of-string.
This frees the '$'.

There is still the '$$' that matches before embedded newlines, and since
'^^' matches after those newlines, the '^^' and '$$' can only be unified
to '^^' if it is one-width inside a string, so is like '[$$\n^^]' (or
just '\n') there.
At start- and end-of-string the '^^' can still be a zero-width match.
I am not sure about greedy (meaning to try one-width first) or
non-greedy.

Example: '^[(\N*)^^]*^' to capture all lines, clean of newlines.
Not a lot clearer than '^[(\N*)\n*]*$', but freeing the '$' and '$$'
might be worth it.

mess about '^^+', '^+^' and '^*^' (bats!) removed

-- 
Affijn, Ruud

Gewoon is een tijger.



Re: \x{123a 123b 123c}

2005-11-21 Thread Larry Wall
On Sun, Nov 20, 2005 at 10:27:17AM -0600, Patrick R. Michaud wrote:
: On Sat, Nov 19, 2005 at 06:32:17PM -0800, Larry Wall wrote:
:  On Sun, Nov 20, 2005 at 01:26:21AM +0100, Juerd wrote:
:  : Ruud H.G. van Tol skribis 2005-11-20  1:19 (+0100):
:  :  Maybe 
:  :  \x{123a 123b 123c} 
:  :  is a nice alternative of 
:  :  \x{123a} \x{123b} \x{123c}. 
:  
:  We already have, from A5, \x[0a;0d], so you can supposedly say 
:  \x[123a;123b;123c] 
: 
: Hmm, I hadn't caught that particular syntax in A05.  AFAIK it's not 
: in S05, so I should probably add it, or whatever syntax we end up 
: adopting.

Yes.

: (BTW, we haven't announced it on p6l yet, but there's a new version of
: S05 available.)

Indeed, there are new versions of most of the S's.  People who want the
latest should use svn.perl.org, which also makes it easy to do diff listings
with svn or svk.

:  [...]
:  But I see that the semicolon is rather cluttery, mainly because it's
:  too tall.  I'm not sure going all the way to space is good, but we
:  might have
:  \x[123a,123b,123c] 
:  just to get a little visual space along with the separator.  
: 
: Just to verify, with this syntax would we expect
: 
: \x[123a,123b,123c]+
: 
: to be the same as
: 
: [\x123a \x123b \x123c]+
: 
: and not \x123a \x123b \x123c+ ?

Yes.  I think the rule interpretation of \x is that it is a sequence to
be considered a single character regardless of its context.  Certainly
the square brackets we've mandated would tend to read as grouping anyway.

Of course, the main point of the \x[a,b,c] notation is to allow
interpolation of sequences of hex characters into ordinary strings,
and those don't care about abstract character boundaries.
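
The grouping question has a rough analogue in Python's re module (an
illustration only, not Perl 6 rule syntax). Wrapping the sequence in a
non-capturing group makes the quantifier apply to all three characters,
which is the reading confirmed above; without the group it applies to
the last character only.

```python
import re

seq = "\u123a\u123b\u123c"

# Analogue of [\x123a \x123b \x123c]+ : the sequence repeats as a unit.
grouped = re.compile("(?:\u123a\u123b\u123c)+")

# Analogue of \x123a \x123b \x123c+ : only the last character repeats.
last_only = re.compile("\u123a\u123b\u123c+")

twice = seq * 2
trailing = seq + "\u123c"

print(grouped.fullmatch(twice) is not None)
print(last_only.fullmatch(twice) is not None)
print(grouped.fullmatch(trailing) is not None)
print(last_only.fullmatch(trailing) is not None)
```

The two patterns agree on a single occurrence of the sequence and
disagree on both test strings.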

:  It occurs to me that we didn't spec whether character classes ignore
:  whitespace.  They probably should, just so you can chunk things:
:  
:  / <[ a..z A..Z 0..9 _ ]> /
:  
:  Then the question arises about whether <[ \ ]> is an escaped space
:  or a backslash, or illegal  
: 
: I vote that it's an escaped space.  A backslash is nearly always \\
: (or should be imho).
: 
:  But if we make it match a backslash
:  or illegal, then the minimal space matcher becomes \x20, I think,
:  unless you graduate to \s.  On the other hand, if we make it match
:  a space, people aren't going to read that way unless they're pretty
:  sophisticated...
: 
: There's also <sp>, unless someone redefines the <sp> subrule.

But you can't use <sp> in a character class.  Well, that is, unless
you write it:

+[ a..z ]+sp

or some such.  Maybe that's good enough.

: And in the general case that's a slightly more expensive mechanism 
: to get a space (it involves at least a subrule lookup).  Perhaps 
: we could also create a visible meta sequence for it, in the same 
: way that we have visible metas for \e, \f, \r, \t.  But I have 
: no idea what letter we might use there.

Something to be said for \_ in that regard.

: I don't think I like this, but perhaps  C   becomes ?null 
: and Cbecomes ' '?  Seems like not enough visual distinction
: there...

_ maybe.  I'm good with  being ?null, and , being element boundary
when matching lists.  But I'd like to reserve   for delimiting what
is returned by $, the string officially matched:

foo bar baz ~~ /:w foo  \w+  baz/
say $/; # foo bar baz
say $;# bar

Or possibly

foo bar baz ~~ /:w foo  \w+  baz/

but that should probably mean whatever

foo bar baz ~~ /:w foo « \w+ » baz/

eventually means.  Which I haven't the foggiest.  But we should probably
reserve the brackets on general principle's sake, just because brackets
are so scarce.

I dunno.  If «...» in ordinary code does shell quoting, maybe «...» in
rules does filename globbing or some such.  I can see some issues with
anchoring semantics.  Makes more sense on a string as a whole, but maybe
can anchor on element boundaries if used on a list of filenames.
I suppose one could even go as far as

rule jpeg :i « *.jp{e,}g »

or whatever the right glob syntax is.

Larry


Re: apo5 (was: Re: \x{123a 123b 123c})

2005-11-21 Thread Larry Wall
On Mon, Nov 21, 2005 at 05:49:59PM +0100, Ruud H.G. van Tol wrote:
: Larry Wall:
:  Juerd:
:  Ruud:
: 
:  Maybe
:  \x{123a 123b 123c}
:  is a nice alternative of
:  \x{123a} \x{123b} \x{123c}.
: 
:  Hmm, very cute and friendly! Can we keep it, please? Please?
: 
: Thanks for the support.

Hey, this ain't exactly a popularity contest here...  :-)

:  We already have, from A5, \x[0a;0d], so you can supposedly say
:  \x[123a;123b;123c]
: 
: rereading apo5 /
: Found it in the old/new table on page 7. For me the semicolon is fine.

The fact that you say page 7 leads me to guess that you're reading
it from perl.com.  That's going to be the most out-of-date version.
Better would be

dev.perl.org    one day latency but html-ified
svn.perl.org    up to the minute but only in pod

In particular, the Apocalypses have little [Update:] sections that are
supposed to alert you to things that have changed since the Apo
was written.  (Though some of those are a little out of date right now
too--I'm just working my way through A12 again.)

: I am using character names more and more, and between those, semicolons
: are less cluttery. Character names can contain spaces, but semicolons
: too? If not then
: \c[BEL; EXTENDED ARABIC-INDIC DIGIT ZERO] would be possible, but maybe
: better not, or more like
: \c['BEL'; 'EXTENDED ARABIC-INDIC DIGIT ZERO'] or even
: \c('BEL', 'EXTENDED ARABIC-INDIC DIGIT ZERO').

None of the current names contain either semicolon or comma, so I expect
they're avoiding them by policy.

: Something else:
: The '^' could be used for both the ultimate start- and end-of-string.
: This frees the '$'.

I think this is one of those aspects of regex culture that is too
entrenched to remove.  Besides, you have to be able to distinguish
s/^/foo/ from s/$/foo/.

: There is still the '$$' that matches before embedded newlines, and since
: '^^' matches after those newlines, the '^^' and '$$' can only be unified
: to '^^' if it is one-width inside a string, so is like '[$$\n^^]' (or
: just '\n') there.

But then if you use it within a capture, you get an extra newline you
probably don't want.

: At start- and end-of-string the '^^' can still be a zero-width match.
: I am not sure about greedy (meaning to try one-width first) or
: non-greedy.
: 
: Example: '^[(\N*)^^]*^' to capture all lines, clean of newlines.
: Not a lot clearer than '^[(\N*)\n*]*$', but freeing the '$' and '$$'
: might be worth it.

I don't think it's any clearer.  In fact, I find all the ^'s there
are a little too visually confusing and contextual.

Larry


Re: \x{123a 123b 123c}

2005-11-21 Thread Larry Wall
On Mon, Nov 21, 2005 at 09:02:57AM -0800, Larry Wall wrote:
: But I'd like to reserve   for delimiting what is returned by $,
: the string officially matched:
: 
: foo bar baz ~~ /:w foo  \w+  baz/
: say $/;   # foo bar baz
: say $;  # bar

Though it occurs to me that there's another possible interpretation,
culturally speaking.  The overloading of \b has always bothered me,
plus the fact that \b can't distinguish which kind of word boundary
without additional context.  In regex culture, we have the \<...\>
word matcher, and maybe that devolves to isolated  ...  in rules.

We could still use  ...  to capture $, which I was leaning toward
anyway just for visibility reasons, since the two ends could be quite
far apart.

And file globbing could just be :glob or some such if we really need
to embed it in rules.

Larry


Re: statement_control<foo>() (was Re: lvalue reverse and array views)

2005-11-21 Thread TSa

HaloO,

Luke Palmer wrote:

On 11/21/05, Ingo Blechschmidt [EMAIL PROTECTED] wrote:


Of course, the compiler is free to optimize these things if it can prove
that runtime's statement_control:<if> is the same as the internal
optimized statement_control:<if>.



Which it definitely can't without some pragma.


Isn't the question just 'when'? I think at the latest it could be
optimized JIT before the first execution, or so. The relevant AST
branch stays for later eval calls which in turn branch off the
surrounding module's version from within the running system such
that the scope calling the eval sees the new version. And this in
turn might be optimized and found unchanged in its optimized form.

Sort of code morphing of really first class code. Everything else
makes closures second class ;)
--


Re: statement_control<foo>() (was Re: lvalue reverse and array views)

2005-11-21 Thread Larry Wall
On Mon, Nov 21, 2005 at 03:51:19PM +0000, Luke Palmer wrote:
: On 11/21/05, Ingo Blechschmidt [EMAIL PROTECTED] wrote:
:  Of course, the compiler is free to optimize these things if it can prove
:  that runtime's statement_control:<if> is the same as the internal
:  optimized statement_control:<if>.
: 
: Which it definitely can't without some pragma.

But remember that on some level or other, all declarations function as
pragmas.  So the absence of a redeclaration of if could be taken as
a kind of pragma, if we require control redefinition to be lexically
scoped, which we probably should.

: I wonder if they should be macros.  (Macros that would by default
: expand to things that aren't expressible in Perl 6)

Which is another way of saying that control redefinitions should be
lexically scoped, since macros are required to do lexically scoped
syntax modification unless they're Preluditudinous.

Another issue in if optimization is whether the blocks in fact do
anything blockish that have to be scoped to the block.  This is a
determination that Perl 5 makes when it's compiling blocks.  It's
basically an attribute that migrates up the tree from the leaves, which
are mostly true, but anyone in the block can falsify the attribute
for the block as a whole.

Arguably, when you use ??!! and friends, it should also be doing such
analysis on the lazy bits and telling you that your my is badly
scoped if it's in conditional code.  That also catches

my $x = 0 if rand 2;

Larry


Re: statement_control<foo>() (was Re: lvalue reverse and array views)

2005-11-21 Thread Larry Wall
On Mon, Nov 21, 2005 at 10:45:56AM -0800, Larry Wall wrote:
: Another issue in if optimization is whether the blocks in fact do
: anything blockish that have to be scoped to the block.  This is a
: determination that Perl 5 makes when it's compiling blocks.  It's
: basically an attribute that migrates up the tree from the leaves, which
: are mostly true, but anyone in the block can falsify the attribute
: for the block as a whole.

Actually, I said that backwards.  It starts out false and gets truified
if anyone says "Yes, we gotta have a block around us."
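
As a rough Python sketch of that corrected description (not Perl 5's
actual implementation): the attribute starts out false for a block and
becomes true as soon as anything inside asks for a real scope. The Node
class, the node kinds and the SCOPE_FORCING set are all assumptions made
up for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str
    children: list = field(default_factory=list)


# Hypothetical set of constructs that insist on a real block around them.
SCOPE_FORCING = {"my_declaration"}


def needs_block(node):
    """Starts out false; turns true if anything below asks for a scope."""
    if node.kind in SCOPE_FORCING:
        return True
    # A nested block gets its own answer and does not affect its parent.
    return any(needs_block(child)
               for child in node.children
               if child.kind != "block")


plain = Node("block", [Node("call"), Node("assign")])
scoped = Node("block", [Node("call"),
                        Node("expr", [Node("my_declaration")])])
nested = Node("block", [Node("block", [Node("my_declaration")])])

print(needs_block(plain), needs_block(scoped), needs_block(nested))
```

A block for which the answer stays false is the one an optimizer could
run without setting up a scope.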

Larry


Re: Multidimensional argument list binding (*@;foo)

2005-11-21 Thread Ingo Blechschmidt
Hi,

Luke Palmer wrote:
 On 11/21/05, Ingo Blechschmidt [EMAIL PROTECTED] wrote:
 Hm. How is (*@;AoA) different from (Array [EMAIL PROTECTED]) then? (Assuming that
 foo(@a; @b) desugars to foo([EMAIL PROTECTED], [EMAIL PROTECTED]).)
 
 Well, it's not at all, under that assumption.  But that assumption is
 wrong.

Aha! FYI, I got that interpretation from r6628 of S09 [1]:
 The following two constructs are structurally indistinguishable:
 
 (0..10; 1,2,4; 3)
 ([0..10], [1,2,3,4], [3])

 I think foo(@a; @b) doesn't have a sugar-free form (that is to
 say, it is the sugar-free form).  Among things that desugar to it:
 
 @a ==> foo() <== @b
 foo(@a) <== @b
 @a ==> @b ==> foo()   # maybe; don't remember
 
 To illustrate:
 
 sub foo ([EMAIL PROTECTED]) {
 say [EMAIL PROTECTED];
 }
 sub bar (*@;a) {
 say +@;a;
 }
 foo(1,2,3; 4,5,6);   # 6
 bar(1,2,3; 4,5,6);   # 2
 
 That is, the regular [EMAIL PROTECTED] has concat semantics.  However, I'd like to
 argue that it should have die semantics, for obvious reasons.

Just to clarify -- only ; with *@;a should have die semantics, ,
with *@;a should continue to work, right? (If so, I agree.)

Could you provide some more examples with ;, please? In particular, what
are the results of the following expressions?

(42; 23)
(@a; @b)
(@a; @b)[0]
(@a; @b)[0][0]

((42;23); (17;19))
((@a;@b); (@c;@d))

*(42; 23)
*(@a; @b)

( (42; 23), 19)
(*(42; 23), 19)

[42; 23]
[EMAIL PROTECTED]; @b]


Thanks very much,

--Ingo

[1] http://svn.perl.org/perl6/doc/trunk/design/syn/S09.pod
/The semicolon operator



Re: apo5

2005-11-21 Thread Ruud H.G. van Tol
Larry Wall:
 Ruud H.G. van Tol:


 dev.perl.org one day latency but html-ified
 svn.perl.org up to the minute but only in pod

Thanks, much better. Can't say that I haven't been there before.

There is a [[:alpha:][:digit:] and a [[:alpha:][:digit]] on the
A5-page.


 The '^' could be used for both the ultimate start- and end-of-string.
 This frees the '$'.

 I think this is one of those aspects of regex culture that is too
 entrenched to remove.

Yes, I have experienced that with some of my procmail-recipes that use
'^' to match embedded newlines.
In procmail the '^^' matches begin- or end-of-string. Both a '^' and a
'$' can be used to match a real or putative newline. Some people
replaced my '^'s with '$'s.

OK, everybody can stop reading here, no serious attempts below.

Within C++, there is a much smaller and cleaner language struggling to
get out, which would ... have been an unimportant cult language.
(Bjarne Stroustrup, The Design and Evolution of C++).


 Besides, you have to be able to distinguish
 s/^/foo/ from s/$/foo/.

's/$/foo/' becomes 's/<after .*>/foo/'
<g>


 There is still the '$$' that matches before embedded newlines, and
 since '^^' matches after those newlines, the '^^' and '$$' can only
 be unified to '^^' if it is one-width inside a string, so is like
 '[$$\n^^]' (or just '\n') there.

 But then if you use it within a capture, you get an extra newline you
 probably don't want.

Place the ^^ outside the ().

I wasn't sure about the default for the greediness of '^^' at begin- or
end-of-string, I guess non-greediness can be arranged with a trailing
'?'.


 At start- and end-of-string the '^^' can still be a zero-width match.
 I am not sure about greedy (meaning to try one-width first) or
 non-greedy.

 Example: '^[(\N*)^^]*^' to capture all lines, clean of newlines.
 Not a lot clearer than '^[(\N*)\n*]*$', but freeing the '$' and '$$'
 might be worth it.

 I don't think it's any clearer.

Pardon my Dutch, I didn't find it clearer either (but, might be worth
it).


 In fact, I find all the ^'s there
 are a little too visually confusing and contextual.

/^          # BoS
   [        # start of non-capturing group
     (\N*)  # capture a substring of non-newlines
     ^^     # newline or EoS
   ]*       # end of non-capturing group, repeat
 ^/x        # EoS

As I just said, I am used to '^^' as start- and end-of-buffer, and '^'
as matching a real or putative newline, because of procmail.

-- 
Grtz, Ruud



Re: statement_control<foo>() (was Re: lvalue reverse and array views)

2005-11-21 Thread Rob Kinyon
On 11/21/05, TSa [EMAIL PROTECTED] wrote:
 HaloO,

 Luke Palmer wrote:
  On 11/21/05, Ingo Blechschmidt [EMAIL PROTECTED] wrote:
 
 Of course, the compiler is free to optimize these things if it can prove
 that runtime's statement_control:<if> is the same as the internal
 optimized statement_control:<if>.
 
 
  Which it definitely can't without some pragma.

 Isn't the question just 'when'? I think at the latest it could be
 optimized JIT before the first execution, or so. The relevant AST
 branch stays for later eval calls which in turn branch off the
 surrounding module's version from within the running system such
 that the scope calling the eval sees the new version. And this in
 turn might be optimized and found unchanged in its optimized form.

 Sort of code morphing of really first class code. Everything else
 makes closures second class ;)

This is very close to a proposal I made to the ruby-dev mailing list
(which was Warnocked). I proposed a very basic engine that would work
with the parser/lexer to determine what action to take instead of
using the huge case statements that are the heart of both P5 and Ruby.
It would look something like:

TOKEN:
while ( my $token = get_next_token(params) ) {
    # Try the longest prefix first, then progressively shorter ones.
    for my $length ( reverse 1 .. length($token) ) {
        if ( my $actions = find_actions( substr( $token, 0, $length ) ) ) {
            $actions->[-1]->( <params, if necessary> );
            next TOKEN;
        }
    }
    throw SyntaxError;
}

The for-loop + substr() would be to handle longest-token-first rules.
So, "..." is correctly recognized instead of handled as ".." and ".".
The key would be that the $actions arrayref would get push'ed/pop'ed
as you enter/leave a given lexical scope.

Obviously, this could be optimized to an extremely large degree, but
it -should- work.
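
A self-contained Python sketch of the same idea, for illustration only:
try the longest prefix first, and keep one stack of actions per token
that is pushed on entering a lexical scope and popped on leaving it.
longest_known_prefix and ScopedActions are hypothetical names, and the
"actions" are plain strings standing in for code.

```python
def longest_known_prefix(text, known_tokens):
    """Return the longest prefix of text that is a known token."""
    for length in range(len(text), 0, -1):
        candidate = text[:length]
        if candidate in known_tokens:
            return candidate
    raise SyntaxError("no known token at start of %r" % text)


class ScopedActions:
    """One stack of actions per token; the top of the stack is in effect."""

    def __init__(self):
        self._stacks = {}

    def push(self, token, action):   # entering a scope that redefines token
        self._stacks.setdefault(token, []).append(action)

    def pop(self, token):            # leaving that scope
        self._stacks[token].pop()

    def current(self, token):
        return self._stacks[token][-1]


tokens = {".", "..", "..."}
three = longest_known_prefix("...", tokens)
two = longest_known_prefix("..5", tokens)

actions = ScopedActions()
actions.push("for", "builtin for")
actions.push("for", "reversed for")  # inner scope overrides "for"
inner = actions.current("for")
actions.pop("for")                   # inner scope ends
outer = actions.current("for")
print(three, two, inner, outer)
```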

Rob


Re: Multidimensional argument list binding (*@;foo)

2005-11-21 Thread Larry Wall
On Sun, Nov 20, 2005 at 09:11:33PM +0100, Ingo Blechschmidt wrote:
: Also, is specifying other, non-slurpy arguments prior to a slurpy
: @;multidim_arglist legal?

Yes, though we have to be careful about what happens when we bind the
entire first dimension and then get a == boundary.  That's probably
not intended to produce an empty first dimension.  On the other hand,
maybe it just falls out of the policy that .[] is a null dimension
unless you actually put something there, and maybe that extends to
.[;stuff] and even .[stuff].

: E.g.:
: 
: sub bar ($normal_var, @;AoA) {...}
: bar 42, @array1, @array2;
: # $normal_var is 42,
: # @AoA is ([EMAIL PROTECTED], [EMAIL PROTECTED])
: # Correct?

No, these are specifically not AoA.

: The existence of a @array variable does not imply the existence of a
: @;array variable, right?

I think it probably does, or should.  @;array is sugar for something
like [;[EMAIL PROTECTED], presuming that .specs clumps its slices into
single iterators (which it doesn't), and also presuming that you could
use [;] in a declarative context (which you can't).  So it's more like
[;[EMAIL PROTECTED], which is an array of slice generators each of which 
is a sublist of iterators.  It's probably not an array of arrays
internally, but just a list of specs with some of them marked as starting
a new dimension.

We originally were modeling the multidimension stuff on AoA, but we
kept getting tangled up in intentional vs unintentional brackets.
We need to be able to support the userland flat view of an array and
still be able to get at its specs anyway, so this is basically trying
to handle multislices/multidims/multipipes with the same underlying
mechanism.

Larry


Re: Multidimensional argument list binding (*@;foo)

2005-11-21 Thread Larry Wall
On Mon, Nov 21, 2005 at 03:48:30PM +0000, Luke Palmer wrote:
: To illustrate:
: 
: sub foo ([EMAIL PROTECTED]) {
: say [EMAIL PROTECTED];
: }
: sub bar (*@;a) {
: say +@;a;
: }
: foo(1,2,3; 4,5,6);   # 6
: bar(1,2,3; 4,5,6);   # 2
: 
: That is, the regular [EMAIL PROTECTED] has concat semantics.  However, I'd like to
: argue that it should have die semantics, for obvious reasons.

Well, that can be argued both ways.  The Unix shells get along very well
with default concat semantics, thank you:

(echo foo; echo bar; echo baz) | grep a

And it's rather Perlish to give you a level of flattening for free when
it comes to lists.  And I'd like to be able to distinguish:

my @foo := gather {
for @whatever {
take .generate();
}
}

from

my @;foo := gather {
for @whatever {
take .generate();
}
}

though I think maybe I'm arguing that the ; there is just documentation
if @;foo and @foo are really the same variable, and it's the differing
usage in rvalue context that desugars @;foo to [;]foo.dims.

Larry


Re: Multidimensional argument list binding (*@;foo)

2005-11-21 Thread Larry Wall
On Mon, Nov 21, 2005 at 07:49:16PM +0100, Ingo Blechschmidt wrote:
: Aha! FYI, I got that interpretation from r6628 of S09 [1]:
:  The following two constructs are structurally indistinguishable:
:  
:  (0..10; 1,2,4; 3)
:  ([0..10], [1,2,3,4], [3])

Sorry, started revising that one a couple days ago and got sidetracked...

Larry


Re: statement_control<foo>() (was Re: lvalue reverse and array views)

2005-11-21 Thread Larry Wall
On Mon, Nov 21, 2005 at 02:05:31PM -0500, Rob Kinyon wrote:
: This is very close to a proposal I made to the ruby-dev mailing list
: (which was Warnocked). I proposed a very basic engine that would work
: with the parser/lexer to determine what action to take instead of
: using the huge case statements that are the heart of both P5 and Ruby.
: It would look something like:
: 
: TOKEN:
: while ( my $token = get_next_token(params) ) {
:     for my $length ( reverse 1 .. length($token) ) {
:         if ( my $actions = find_actions( substr( $token, 0, $length ) ) ) {
:             $actions->[-1]->( <params, if necessary> );
:             next TOKEN;
:         }
:     }
:     throw SyntaxError;
: }
: 
: The for-loop + substr() would be to handle longest-token-first rules.
: So, "..." is correctly recognized instead of handled as ".." and ".".
: The key would be that the $actions arrayref would get push'ed/pop'ed
: as you enter/leave a given lexical scope.
: 
: Obviously, this could be optimized to an extremely large degree, but
: it -should- work.

Let's see, where did I put my stash of generic quotes?  Ah, there it is.

Those who do not understand XXX are doomed to reinvent it, poorly.
~~ s/XXX/the Perl 6 grammar engine/;

In particular, you've just reinvented the magic hash semantics mandated
by P6 rules, except that P6 has some hope of optimizing the lookups
to a cached trie or whatever.

Larry


Re: apo5

2005-11-21 Thread Larry Wall
On Mon, Nov 21, 2005 at 07:57:59PM +0100, Ruud H.G. van Tol wrote:
: There is a [[:alpha:][:digit:] and a [[:alpha:][:digit]] on the
: A5-page.

Hmm, well, thanks--I went to fix it and I see Patrick beat me to
the fix.  But in one of the updates, it says:

+[Update: Actually, that's now written C<< <+alpha+digit> >>, avoiding
+the mistaken impression entirely.]

And it occurs to me that we could probably allow <alpha+digit> there
since there's no ambiguity what alpha means, and we're already claiming
the next character after the opening word to decide how to process the
rest of the text inside angles.  Even if someone writes

<alpha + digit>

that would fail under the current policy of treating " + digit" as rule,
since you can't start a rule with +.

Unfortunately, though,

<identchar - digit>

would be ambiguous, and/or wrong.  Could allow whitespace there if we
picked an explicit "this is rule" character.  Did we remove "this is
string"?  If so, we could swipe the colon:

<after: --help>

Could put back "this is string" with explicit quotes:

<after '--help'>

but that doesn't save much over

<after('--help')>

which is partly why we removed "this is string" in the first place.

Larry


Ponie Inquiry

2005-11-21 Thread Joshua Gatcomb
All:
Back in the summer of 2003, Fotango offered financial support for Ponie
development for 2 years. Nicholas took up the development hat after Arthur,
but things are awfully quiet. Since summer 2005 has come and gone, I wonder
if funding has been extended. I know that Nicholas opened up the repository
to the public http://use.perl.org/~nicholas/journal/24649 but is anyone else
working on the project? With the excitement of Perl6, Parrot, and Pugs I
wonder if Ponie is being neglected.

Inquiring minds want to know.

Cheers,
Joshua Gatcomb
a.k.a. Limbic~Region


Re: apo5

2005-11-21 Thread Ruud H.G. van Tol
Larry Wall:

 in one of the updates, it says:

 +[Update: Actually, that's now written C<< <+alpha+digit> >>,
 avoiding +the mistaken impression entirely.]

In dev's A05.html I only found:
[Update: That must now be written +alpha+digit, or it will be
mistaken for «alphadigit», which doesn't work too well.].

I see those character classes as infinite sort-of-binary masks, so
alpha|digit looks right to me.
Idem [_] | alpha | digit  !Swedish, with left-to-right application.
(I don't oversee the consequences.)

-- 
Grtz, Ruud (sober.u is on the loose)
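
The "classes as masks" view above can be illustrated with ordinary set operations. The following Python sketch is an assumption-laden toy: it is restricted to ASCII (the real classes are Unicode-wide), and the definition of `identchar` as letters, digits and underscore is hypothetical, chosen only to make the union and difference visible.

```python
import string

# ASCII-only stand-ins for the named character classes (an assumption;
# the classes discussed in the thread cover all of Unicode).
alpha = set(string.ascii_letters)
digit = set(string.digits)
identchar = alpha | digit | {"_"}  # hypothetical definition

# Union: what the thread writes as +alpha+digit (or alpha|digit).
alnum = alpha | digit

# Difference: what the thread writes as identchar - digit.
ident_start = identchar - digit
```

Here `"7" in alnum` is true while `"7" in ident_start` is false, and `"_"` survives the subtraction. Whether the union is spelled with `+` or `|` is exactly the surface-syntax question under discussion; the set semantics are the same either way.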



Re: statement_control<foo>() (was Re: lvalue reverse and array views)

2005-11-21 Thread Larry Wall
On Mon, Nov 21, 2005 at 11:43:21AM -0800, Larry Wall wrote:
: Let's see, where did I put my stash of generic quotes?

I would like to publicly apologize for my remarks, which were far too
harsh for the circumstances.  I can only plead that I was trying to
be far too clever, and not thinking about how it would come across.
No, to be perfectly honest, it was more culpable than that.  I had
a niggling feeling I was being naughty, and I ignored it.  Shame on me.
I will try to pay better attention to my conscience in the future.

Larry


Re: apo5

2005-11-21 Thread Ruud H.G. van Tol
Patrick R. Michaud:

 's/$/foo/' becomes 's/<after .*>/foo/'
 <g>
 
 Uh, no, because <after> is still a zero width assertion.  :-)


That's why I chose it. It is not at the end-of-string?

  perl5 -e '$_="abc"; s/(?=...)/x/; print'

  perl5 -e '$_="abc"; s/(?!.)/x/; print'

  's/<!before .>/foo/'

-- 
Grtz, Ruud


Re: apo5

2005-11-21 Thread Juerd
Larry Wall skribis 2005-11-21 12:08 (-0800):
 Unfortunately, though,
 <identchar - digit>
 would be ambiguous, and/or wrong. 

Well, we could of course change - to mean -1 or fewer, as + means
+1 or more... :D


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: apo5

2005-11-21 Thread Ruud H.G. van Tol
Patrick R. Michaud:
 Ruud H.G. van Tol:
 Patrick R. Michaud:
 Ruud H.G. van Tol:

 's/$/foo/' becomes 's/<after .*>/foo/'

 Uh, no, because <after> is still a zero width assertion.  :-)

 That's why I chose it. It is not at the end-of-string?

 Because .* matches "", /<after .*>/ would be true at
 every position in the string, including the beginning,
 and this is where foo would be substituted.

I expected greediness, also because <after .*?> could behave non-greedy.

Just like:
   s/(.*)/$1foo/
   s/(.*?)/$1foo/

OK, so 's/<!before .>/foo/' it must be.

But why does <after .*> behave non-greedy?

-- 
Grtz, Ruud
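
The point that a zero-width assertion containing .* holds at every position can be checked with a small Python sketch. Python's `re` module only accepts fixed-width lookbehind, so this uses lookahead, `(?=.*)`, as a stand-in for the lookbehind under discussion; that substitution is an assumption of the sketch, and the helper name `positions` is hypothetical.

```python
import re


def positions(pattern, s):
    """Return every offset in `s` where `pattern` matches."""
    return [m.start() for m in re.finditer(pattern, s)]


text = "abc"

# .* can match the empty string, so the assertion succeeds everywhere.
everywhere = positions(r"(?=.*)", text)  # → [0, 1, 2, 3]

# "Not followed by any character" holds only at the end of the string.
at_end = positions(r"(?!.)", text)  # → [3]

# A substitution takes the first position where the assertion holds.
first_hit = re.sub(r"(?=.*)", "foo", text, count=1)  # → 'fooabc'
end_hit = re.sub(r"(?!.)", "foo", text)  # → 'abcfoo'
```

Greediness never enters into it: the assertion consumes nothing, so the engine simply stops at the leftmost offset where it is true, which is offset 0.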



Perl 6 Summary for 2005-11-14 through 2005-11-21

2005-11-21 Thread Matt Fowles
Perl 6 Summary for 2005-11-14 through 2005-11-21
All~

Welcome to another Perl 6 Summary. The attentive among you may notice
that this one is on time. I am not sure how that happened, but we will
try and keep it up. On a complete side note, I think there should be a
Perl guild of some sort on World of Warcraft. It should probably be
horde if there is, both because I hate the alliance and because it fits
better.

  Perl 6 Language
As usual for Pugs, most development continued off list.

http://xrl.us/iipt

   Too Lazy?
Luke Palmer posted a problem he was having with pugs. Warnock applies
(which likely means it was made into a test and fixed).

http://xrl.us/iipu

   Assigning to Named Subrules
Jerry Gay had a question about the semantics of assigning to named
subrules in PGE. Patrick explained that it created an array of capture
objects.

http://xrl.us/iipv

   Keyed Access to Match Objects
Jerry Gay was having trouble with keyed access to match objects. After
some discussion he implemented the keyed routine he needed and
threatened to implement a few more.

http://xrl.us/iipw

   PGE Now "compreg"s
Patrick announced that PGE was now a better citizen in the parrot world,
using compreg to locate the compiler instead of find_global.

http://xrl.us/iipx

  Parrot
I am going to get an English muffin. More in a moment... much better.
Peanut butter is a wonderful thing. Where was I?

   Character Classes Done
Jerry Gay wondered if the TODO about strings and character classes was
still open. Patrick said it was resolved and should be closed.

http://xrl.us/iipy

   rx_grammar.pl Progress?
Jerry Gay wondered if rx_grammar.pl had seen any work lately. Warnock
applies.

http://xrl.us/iipz

   N Registers No Longer Get Whacked
Leo, thanks to his new calling scheme, closed an RT ticket from Dec
2004.

http://xrl.us/iip2

   Report SVN Revision in parrotbug?
Jerry Gay resurrected an old ticket wondering whether to add a revision
field to RT tickets.

http://xrl.us/iip3

   Making Parrot Potable
Florian Ragwitz was having trouble drinking Parrot so he wants to expend
some effort to make it more potable. Apparently it does not get drunk so
well by many machines in debian's build farms and he would like to fix
it. When he asked how best to do his work (so as not to upset to many),
Chip suggested a local SVK mirror. Hopefully after he is done even more
people will be able to enjoy drinking the Parrot kool-aid.

http://xrl.us/iip4

   pbc_merge Requires LINK_DYNAMIC
Nick Glencross provided a patch fixing pbc_merge on HP-UX. François
Perrad noted that it was also a problem on Win32. Jonathan Worthington
explained that he was aware of the problem and that the dependency on
the dynamic libraries would soon be removed.

http://xrl.us/iip5

   Compilable Option
Will Coleda wants a -c option which will only tell you if the code is
compilable for Parrot.

http://xrl.us/iip6

   Clerihewsiwhatsit?
Inspired by Piers's inspiration from his name, Roger Browne wrote a
Clerihew. Piers and Roger scare me.

http://xrl.us/iip7

   Debug Segments
There was much discussion about what sort of interface to expose to HLL
for debug segments. It looks like something good will come out of it
all.

http://xrl.us/iip8

   Amber for Parrot version 0.3.1
Announced, Roger Browne displaying Amber 0.3.1 aroun': this latest
version, magic cookie, is more than just a rookie.

http://xrl.us/iip9

   t/library/streams.t Failing
Patrick is still having trouble with t/library/streams.t. It sounds like
he would appreciate help, but Warnock applies.

http://xrl.us/iiqa

   PGE::Glob Issues
Will Coleda spotted a problem with PGE::Glob. Patrick fixed it.

http://xrl.us/iiqb

   "find_word_boundary" Unneeded
Patrick posted his explanation of why find_word_boundary was an unneeded
opcode. To that end he posted a patch updating t/op/string_cs.t.
Warnock applies to both thoughts.

http://xrl.us/iiqc

http://xrl.us/iiqd

   Coroutines Trample Scratchpads
Nick Glencross noted that coroutine_3.pasm was trampling some memory.
Leo said that scratchpads were on their way out. Nick wondered if the
ticket should be closed now, or when this is fixed. I vote that we not
close tickets until the problem is gone, but Warnock applies.

http://xrl.us/iiqe

   MD5 Broken
Chip noticed that MD5 was horribly broken recently. He decided that
parrot should avoid it in favor of the SHA-2 family and maybe Whirlpool.
If you are a crypto dork, you have your job cut out for you.

http://xrl.us/iiqf

Joshua Hoblitt joyously closed an RT ticket about removing $(MAKE_C).

http://xrl.us/iiqg

   inconsistent dll linkage
Jerry Gay announced that the last MSVC 7.1

Re: apo5

2005-11-21 Thread Ruud H.G. van Tol
Patrick R. Michaud:
 Ruud H.G. van Tol:

 's/$/foo/' becomes 's/<after .*>/foo/'

 Uh, no, because <after> is still a zero width assertion.  :-)

 That's why I chose it. It is not at the end-of-string?

 Because .* matches "", /<after .*>/ would be true at
 every position in the string, including the beginning,
 and this is where foo would be substituted.

 I expected greediness, also because <after .*?> could behave
 non-greedy. ...
 But why does <after .*> behave non-greedy?

 I think you may be misreading what <after .*> does -- it's a
 lookbehind assertion.

No, I was no longer misreading it, I was questioning its rationale. I
wondered what would be lost if the construct would behave more like
's/(.*)/$1foo/'. Sorry for not making that more explicit. I was still
getting rid of the '$'. And monitoring the outbreak of sober.u.


 The greediness of the .* subpattern in <after .*> doesn't affect
 things at all -- <after .*> is still a zero-width assertion.

There is a zero-width 'slot' before (and after) each character in the
pattern string. As a zero-width assertion, '<after .*>' has no sense, no
'self', since it can't move the match position to another slot.

In '<after ab*>', the 'b*' means nothing.
In '<after ab+>', the '+' means nothing.
In '<after .*a>', the '.*' means nothing.

Unless the meaning of '<after .*a>' would be changed to: try the last
'a' first.

-- 
Grtz, Ruud
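
The observation that a quantifier at the far end of a zero-width assertion changes nothing can be tried out in Python, with one caveat: Python's `re` module only supports fixed-width lookbehind, so this sketch uses lookahead, where the far end is the trailing end. Whether the same holds for each of the lookbehind forms quoted above is not checked here. The helper name `positions` and the sample string are hypothetical.

```python
import re


def positions(pattern, s):
    """Return every offset in `s` where the zero-width `pattern` holds."""
    return [m.start() for m in re.finditer(pattern, s)]


sample = "xabbay"

# A trailing b* adds nothing: the assertion holds wherever 'a' follows.
with_star = positions(r"(?=ab*)", sample)
bare_a = positions(r"(?=a)", sample)

# A trailing b+ only requires one 'b'; further repetition adds nothing.
with_plus = positions(r"(?=ab+)", sample)
one_b = positions(r"(?=ab)", sample)
```

In this sample `with_star` equals `bare_a` and `with_plus` equals `one_b`: the assertion reports only whether some match exists at a slot, never how long that match was, so there is nothing for greediness to influence.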



dis-junctive patterns

2005-11-21 Thread Gaal Yahas
In pugs, r7961:

 my @pats = /1/, /2/;
 say "MATCH" if 1 ~~ any @pats; # MATCH
 say "MATCH" if 0 ~~ any @pats; # no match

So far so good. But:

 my $junc = any @pats;
 say "MATCH" if 1 ~~ $junc; # no match
 say "MATCH" if 0 ~~ $junc; # no match

Bug? Feature?

-- 
Gaal Yahas [EMAIL PROTECTED]
http://gaal.livejournal.com/
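
As a point of comparison, here is a minimal Python sketch of the behaviour one might expect if storing the junction in a variable made no difference. It does not model Pugs or the Perl 6 specification; the class `AnyJunction` and its `matches` method are hypothetical stand-ins for an any-junction of patterns under smartmatch.

```python
import re


class AnyJunction:
    """Toy any-junction: matches when at least one pattern matches."""

    def __init__(self, patterns):
        self.patterns = [re.compile(p) for p in patterns]

    def matches(self, value):
        # Stringify the value, then succeed on the first pattern that hits.
        text = str(value)
        return any(p.search(text) is not None for p in self.patterns)


pats = ["1", "2"]

# Junction built inline, as in the first pair of statements.
inline_hit = AnyJunction(pats).matches(1)

# Junction stored first, as in the second pair of statements.
junc = AnyJunction(pats)
stored_hit = junc.matches(1)
stored_miss = junc.matches(0)
```

In this model `inline_hit` and `stored_hit` agree, and only `stored_miss` is false. If that is the intended semantics, the second result in the report above would be the surprising one.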