dis-junctive patterns

2005-11-21 Thread Gaal Yahas
In pugs, r7961:

 my @pats = /1/, /2/;
 say "MATCH" if 1 ~~ any @pats; # MATCH
 say "MATCH" if 0 ~~ any @pats; # no match

So far so good. But:

 my $junc = any @pats;
 say "MATCH" if 1 ~~ $junc; # no match
 say "MATCH" if 0 ~~ $junc; # no match

Bug? Feature?

-- 
Gaal Yahas <[EMAIL PROTECTED]>
http://gaal.livejournal.com/
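
For comparison, the behaviour of the first snippet (a value matches if any one pattern matches) can be emulated in Perl 5 without junctions. This is only a sketch under that assumption: matches_any is a hypothetical helper, not something Pugs or Perl 6 provides, and it does not settle which of the two Pugs results above is the intended one.

```perl
use strict;
use warnings;

# Hypothetical helper: returns how many of the patterns $value matches,
# which is true if at least one of them matches.
sub matches_any {
    my ($value, @patterns) = @_;
    return scalar grep { $value =~ $_ } @patterns;
}

my @pats = (qr/1/, qr/2/);

# A stored copy behaves like the inline list here, because a Perl 5
# array has no junction semantics to lose when it is copied.
my @stored = @pats;

print "MATCH\n"    if matches_any(1, @pats);
print "MATCH\n"    if matches_any(1, @stored);
print "no match\n" unless matches_any(0, @pats);
```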


Re: apo5

2005-11-21 Thread Ruud H.G. van Tol
Patrick R. Michaud:
> Ruud H.G. van Tol:

>> 's/$/foo/' becomes 's//foo/'
>
> Uh, no, because  is still a zero width assertion.  :-)

 That's why I chose it. It is not at the end-of-string?
>>>
>>> Because ".*" matches "", // would be true at
>>> every position in the string, including the beginning,
>>> and this is where "foo" would be substituted.
>>
>> I expected greediness, also because  could behave
>> non-greedy. ...
>> But why does  behave non-greedy?
>
> I think you may be misreading what  does -- it's a
> lookbehind assertion.

No, I was no longer misreading it, I was questioning its rationale. I
wondered what would be lost if the construct would behave more like
's/(.*)/$1foo/'. Sorry for not making that more explicit. I was still
getting rid of the '$'. And monitoring the outbreak of sober.u.


> The greediness of the .* subpattern in  doesn't affect
> things at all --  is still a zero-width assertion.

There is a zero-width 'slot' before (and after) each character in the
pattern string. As a zero-width assertion, '' has no sense, no
'self', since it can't move the match position to another slot.

In '', the 'b*' means nothing.
In '', the '+' means nothing.
In '', the '.*' means nothing.

Unless the meaning of '' would be changed to: try the last
'a' first.

-- 
Grtz, Ruud
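
The "slots" can be made visible in Perl 5 with a zero-width pattern and /g. The sketch below uses a lookahead, because Perl 5 does not accept an unbounded .* inside a lookbehind; it is meant as an analogy to the assertion discussed above, not as a translation of it.

```perl
use strict;
use warnings;

my $str = "abc";

# Mark every position at which the zero-width assertion succeeds.
# Greedy and non-greedy quantifiers inside the assertion give the same
# result, because the assertion never moves the match position.
(my $greedy = $str) =~ s/(?=.*)/|/g;
(my $lazy   = $str) =~ s/(?=.*?)/|/g;

print "$greedy\n";   # |a|b|c|
print "$lazy\n";     # |a|b|c|
```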



[svn ci] Perl 5 tests for PGE::P5Regexp

2005-11-21 Thread jerry gay
i've checked in a subset of Perl 5.9.2's regexp tests for PGE to chew
on. for now, i modified the stolen harness to emit PIR. the harness is
currently very ugly... that won't be for long, however, as i'll
refactor it soon.

currently, only 130 of 960 tests are running, as the PIR-producing
harness can't cope with some of the test file syntax that the
Perl-producing harness used. this will be corrected as well. i noticed
when i don't cut off the tests at 130, i get some runaway parrot
processes which eat all memory and cpu. i'll attempt to isolate the
tests that produce this behavior so i can determine the cause.

the files can be found at t/compilers/pge/p5regexp/ in r10133, and are
run as part of 'make test'. you can run them in isolation by typing
prove t/compilers/pge/p5regexp/

as always, bug and smoke reports are most welcome. enjoy.
~jerry


Perl 6 Summary for 2005-11-14 through 2005-11-21

2005-11-21 Thread Matt Fowles
All~

Welcome to another Perl 6 Summary. The attentive among you may notice
that this one is on time. I am not sure how that happened, but we will
try and keep it up. On a complete side note, I think there should be a
Perl guild of some sort on World of Warcraft. It should probably be
horde if there is, both because I hate the alliance and because it fits
better.

  Perl 6 Language
As usual for Pugs, most development continued off list.



   Too Lazy?
Luke Palmer posted a problem he was having with pugs. Warnock applies
(which likely means it was made into a test and fixed).



   Assigning to Named Subrules
Jerry Gay had a question about the semantics of assigning to named
subrules in PGE. Patrick explained that it created an array of capture
objects.



   Keyed Access to Match Objects
Jerry Gay was having trouble with keyed access to match objects. After
some discussion he implemented the keyed routine he needed and
threatened to implement a few more.



   PGE Now " compreg "s
Patrick announced that PGE was now a better citizen in the parrot world,
using compreg to locate the compiler instead of find_global.



  Parrot
I am going to get an English muffin. More in a moment... much better.
Peanut butter is a wonderful thing. Where was I?

   Character Classes Done
Jerry Gay wondered if the TODO about strings and character classes was
still open. Patrick said it was resolved and should be closed.



   rx_grammar.pl Progress?
Jerry Gay wondered if rx_grammar.pl had seen any work lately. Warnock
applies.



   N Registers No Longer Get Whacked
Leo, thanks to his new calling scheme, closed an RT ticket from Dec
2004.



   Report SVN Revision in parrotbug?
Jerry Gay resurrected an old ticket wondering whether to add a revision
field to RT tickets.



   Making Parrot Potable
Florian Ragwitz was having trouble drinking Parrot so he wants to expend
some effort to make it more potable. Apparently it does not get drunk so
well by many machines in Debian's build farms and he would like to fix
it. When he asked how best to do his work (so as not to upset too many),
Chip suggested a local SVK mirror. Hopefully after he is done even more
people will be able to enjoy drinking the Parrot kool-aid.



   pbc_merge Requires LINK_DYNAMIC
Nick Glencross provided a patch fixing pbc_merge on HP-UX. François
Perrad noted that it was also a problem on Win32. Jonathan Worthington
explained that he was aware of the problem and that the dependency on
the dynamic libraries would soon be removed.



   Compilable Option
Will Coleda wants a -c option which will only tell you if the code is
compilable for Parrot.



   Clerihewsiwhatsit?
Inspired by Piers's inspiration from his name, Roger Browne wrote a
Clerihew. Piers and Roger scare me.



   Debug Segments
There was much discussion about what sort of interface to expose to HLL
for debug segments. It looks like something good will come out of it
all.



   "Amber for Parrot" version 0.3.1
Announced, Roger Browne displaying Amber 0.3.1 aroun': this latest
version, magic cookie, is more than just a rookie.



   t/library/streams.t Failing
Patrick is still having trouble with t/library/streams.t. It sounds like
he would appreciate help, but Warnock applies.



   PGE::Glob Issues
Will Coleda spotted a problem with PGE::Glob. Patrick fixed it.



   " find_word_boundary " Unneeded
Patrick posted his explanation of why find_word_boundary was an unneeded
opcode. To that end he posted a patch updating t/op/string_cs.t.
Warnock applies to both thoughts.





   Coroutines Trample Scratchpads
Nick Glencross noted that coroutine_3.pasm was trampling some memory.
Leo said that scratchpads were on their way out. Nick wondered if the
ticket should be closed now, or when this is fixed. I vote that we not
close tickets until the problem is gone, but Warnock applies.



   MD5 Broken
Chip noticed that MD5 was horribly broken recently. He decided that
parrot should avoid it in favor of the SHA-2 family and maybe Whirlpool.
If you are a crypto dork, you have your job cut out for you.



Joshua Hoblitt joyously closed an RT ticket about removing $(MAKE_C).



   inconsistent dll linkage

Re: apo5

2005-11-21 Thread Ruud H.G. van Tol
Patrick R. Michaud:
> Ruud H.G. van Tol:
>> Patrick R. Michaud:
>>> Ruud H.G. van Tol:

 's/$/foo/' becomes 's//foo/'
>>>
>>> Uh, no, because  is still a zero width assertion.  :-)
>>
>> That's why I chose it. It is not at the end-of-string?
>
> Because ".*" matches "", // would be true at
> every position in the string, including the beginning,
> and this is where "foo" would be substituted.

I expected greediness, also because  could behave non-greedy.

Just like:
   s/(.*)/$1foo/
   s/(.*?)/$1foo/

OK, so 's//foo/' it must be.

But why does  behave non-greedy?

-- 
Grtz, Ruud
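
The two capturing substitutions compared above behave as follows in Perl 5. This is a small sketch to make the greedy versus non-greedy difference concrete; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $str = "abc";

# Greedy: (.*) grabs "abc" at position 0, so "foo" lands at the end.
(my $greedy = $str) =~ s/(.*)/$1foo/;

# Non-greedy: (.*?) is satisfied with "" at position 0, so "foo" lands
# at the start.
(my $lazy = $str) =~ s/(.*?)/$1foo/;

print "$greedy\n";   # abcfoo
print "$lazy\n";     # fooabc
```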



Re: apo5

2005-11-21 Thread Juerd
Larry Wall skribis 2005-11-21 12:08 (-0800):
> Unfortunately, though,
> 
> would be ambiguous, and/or wrong. 

Well, we could of course change "-" to mean "-1 or fewer", as "+" means
"+1 or more"... :D


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: Hashing: avoid MD5 and SHA-1; use SHA-2 or Whirlpool

2005-11-21 Thread Chip Salzenberg
On Wed, Nov 16, 2005 at 05:26:05PM -0800, Brent 'Dax' Royal-Gordon wrote:
> My understanding is that the best attack on SHA-1 can find two
> plaintexts with the same hash value in 2^63 operations.
> [...]
> Basically, SHA-1 isn't a problem for us yet, but it's looking weak.

OK.  Thanks for quantifying that, I'd missed the numbers.

> > I'm getting the feeling that the real lesson is that any hash header
> > system we build will require pluggable hash algorithms [...]
> 
> From what I've read, that was one of the conclusions of NIST's recent
> hash workshop.

I think we've been reading the same blog.  :-)
-- 
Chip Salzenberg <[EMAIL PROTECTED]>
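
One way to picture "pluggable hash algorithms" is a header that names the algorithm next to the digest. The sketch below is an assumption-laden illustration, not Parrot's design: the "name:hexdigest" header format, the registry, and the sub names are all hypothetical. It uses Digest::SHA, which is bundled with Perl 5.10 and later; Whirlpool would need a separate module.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex sha512_hex);

# Hypothetical registry: algorithm name => digest function.
my %digest_for = (
    sha256 => \&sha256_hex,
    sha512 => \&sha512_hex,
);

# Hypothetical header format "name:hexdigest", so a reader can tell which
# algorithm produced the value and new algorithms can be added later.
sub make_header {
    my ($algorithm, $data) = @_;
    my $digest = $digest_for{$algorithm}
        or die "Unknown hash algorithm: $algorithm\n";
    return $algorithm . ':' . $digest->($data);
}

# Recompute the digest with the algorithm named in the header.
sub check_header {
    my ($header, $data) = @_;
    my ($algorithm) = split /:/, $header, 2;
    return $header eq make_header($algorithm, $data);
}

my $header = make_header('sha256', 'abc');
print "$header\n";
print check_header($header, 'abc') ? "ok\n" : "mismatch\n";
```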


Re: apo5

2005-11-21 Thread Ruud H.G. van Tol
Patrick R. Michaud:

>> 's/$/foo/' becomes 's//foo/'
>> 
> 
> Uh, no, because  is still a zero width assertion.  :-)


That's why I chose it. It is not at the end-of-string?

  perl5 -e '$_="abc"; s/(?<=...)/x/; print'

  perl5 -e '$_="abc"; s/(?!.)/x/; print'

  's//foo/'

-- 
Grtz, Ruud
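
Run under Perl 5, both one-liners above print "abcx": a three-character lookbehind first succeeds after the third character, and (?!.) first succeeds where no character follows. The sketch below repeats them in script form next to plain s/$/foo/; it assumes a string without a trailing newline.

```perl
use strict;
use warnings;

my $str = "abc";

(my $lookbehind = $str) =~ s/(?<=...)/x/;   # needs three characters behind it
(my $lookahead  = $str) =~ s/(?!.)/x/;      # needs no character ahead of it
(my $dollar     = $str) =~ s/$/foo/;        # end of string

print "$lookbehind\n";   # abcx
print "$lookahead\n";    # abcx
print "$dollar\n";       # abcfoo
```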


Re: statement_control() (was Re: lvalue reverse and array views)

2005-11-21 Thread Larry Wall
On Mon, Nov 21, 2005 at 11:43:21AM -0800, Larry Wall wrote:
: Let's see, where did I put my stash of generic quotes?

I would like to publicly apologize for my remarks, which were far too
harsh for the circumstances.  I can only plead that I was trying to
be far too clever, and not thinking about how it would come across.
No, to be perfectly honest, it was more culpable than that.  I had
a niggling feeling I was being naughty, and I ignored it.  Shame on me.
I will try to pay better attention to my conscience in the future.

Larry


Re: apo5

2005-11-21 Thread Ruud H.G. van Tol
Larry Wall:

> in one of the updates, it says:
>
> +[Update: Actually, that's now written C<< <+alpha+digit> >>,
> avoiding +the mistaken impression entirely.]

In dev's A05.html I only found:
"[Update: That must now be written <++>, or it will be
mistaken for «alpha> looks right to me.
Idem <[_] | alpha | digit & !Swedish>, with left-to-right application.
(I don't oversee the consequences.)

-- 
Grtz, Ruud (sober.u is on the loose)



Ponie Inquiry

2005-11-21 Thread Joshua Gatcomb
All:
Back in the summer of 2003, Fotango offered financial support for Ponie
development for 2 years. Nicholas took up the development hat after Arthur,
but things are awfully quiet. Since summer 2005 has come and gone, I wonder
if funding has been extended. I know that Nicholas opened up the repository
to the public http://use.perl.org/~nicholas/journal/24649 but is anyone else
working on the project? With the excitement of Perl6, Parrot, and Pugs I
wonder if Ponie is being neglected.

Inquiring minds want to know.

Cheers,
Joshua Gatcomb
a.k.a. Limbic~Region


Re: apo5

2005-11-21 Thread Larry Wall
On Mon, Nov 21, 2005 at 07:57:59PM +0100, Ruud H.G. van Tol wrote:
: There is a "[[:alpha:][:digit:]" and a "[[:alpha:][:digit]]" on the
: A5-page.

Hmm, well, thanks--I went to fix it and I see Patrick beat me to
the fix.  But in one of the updates, it says:

+[Update: Actually, that's now written C<< <+alpha+digit> >>, avoiding
+the mistaken impression entirely.]

And it occurs to me that we could probably allow  there
since there's no ambiguity what 

that would fail under the current policy of treating "+ digit" as rule,
since you can't start a rule with +.

Unfortunately, though,



would be ambiguous, and/or wrong.  Could allow whitespace there if we
picked an explicit "this is rule" character.  Did we remove "this is
string"?  If so, we could swipe the colon:



Could put back "this is string" with explicit quotes:



but that doesn't save much over



which is partly why we removed "this is string" in the first place.

Larry
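
For reference, the Perl 5 spelling of the combined class puts both POSIX classes inside one bracketed character class, which is the well-formed version of the two A5 typos quoted at the top of this message. A minimal check, with illustrative test strings:

```perl
use strict;
use warnings;

# One bracketed class containing two POSIX classes.
my $alnum_only = qr/^[[:alpha:][:digit:]]+$/;

for my $candidate ('abc123', 'abc-123') {
    my $verdict = $candidate =~ $alnum_only ? 'matches' : 'does not match';
    print "$candidate $verdict\n";
}
```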


Re: RESPONSIBLE_PARTIES

2005-11-21 Thread Joshua Hoblitt
On Mon, Nov 21, 2005 at 11:51:49AM +0100, Leopold Toetsch wrote:
> 
> On Nov 20, 2005, at 22:09, Joshua Hoblitt wrote:
> 
> >I'd like to nominate Jerry for an entry in RESPONSIBLE_PARTIES as the
> >test suite maintainer.  Thanks for all your work Jerry.
> >
> >Any objections?
> 
> Not at all - more the opposite ;-)

Committed as r10128. ;)

-J



Re: statement_control() (was Re: lvalue reverse and array views)

2005-11-21 Thread Larry Wall
On Mon, Nov 21, 2005 at 02:05:31PM -0500, Rob Kinyon wrote:
: This is very close to a proposal I made to the ruby-dev mailing list
: (which was Warnocked). I proposed a very basic engine that would work
: with the parser/lexer to determine what action to take instead of
: using the huge case statements that are the heart of both P5 and Ruby.
: It would look something like:
: 
: TOKEN:
: while ( my $token = get_next_token() ) {
:     # Try the longest prefix first, so "..." wins over ".." and ".".
:     for my $length ( reverse 1 .. length($token) ) {
:         if ( my $actions = find_actions( substr( $token, 0, $length ) ) ) {
:             $actions->[-1]->( < params, if necessary > );
:             next TOKEN;
:         }
:     }
:     throw SyntaxError;
: }
: 
: The for-loop + substr() would be to handle longest-token-first rules.
: So, "..." is correctly recognized instead of handled as ".." and ".".
: The key would be that the $actions arrayref would get push'ed/pop'ed
: as you enter/leave a given lexical scope.
: 
: Obviously, this could be optimized to an extremely large degree, but
: it -should- work.

Let's see, where did I put my stash of generic quotes?  Ah, there it is.

"Those who do not understand XXX are doomed to reinvent it, poorly."
~~ s/XXX/the Perl 6 grammar engine/;

In particular, you've just reinvented the magic hash semantics mandated
by P6 rules, except that P6 has some hope of optimizing the lookups
to a cached trie or whatever.

Larry
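
Perl 5 has no construct that matches the longest key of a hash directly, but the effect can be approximated by building an alternation with the longest keys first. This is a sketch of that emulation only; the token table and variable names are hypothetical, and it says nothing about how the Perl 6 grammar engine implements or caches the lookup.

```perl
use strict;
use warnings;

# Hypothetical token table; the values stand in for whatever action or
# data the hash would carry.
my %token = (
    '.'   => 'dot',
    '..'  => 'range',
    '...' => 'yada',
);

# Longest keys first, so "..." wins over ".." and ".".
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) } keys %token;
my $longest_key = qr/($alternation)/;

if ( 'a...b' =~ $longest_key ) {
    print "matched '$1' => $token{$1}\n";   # matched '...' => yada
}
```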


Re: RESPONSIBLE_PARTIES

2005-11-21 Thread Joshua Hoblitt
On Mon, Nov 21, 2005 at 10:53:53AM +, Michael Lacey wrote:
> "RESPONSIBLE" - is that like "BLAMEABLE"? *smile*
>   Mike

Perhaps a better name for the file would be ENTITIES_AT_FAULT. ;)

-J



Re: Multidimensional argument list binding (*@;foo)

2005-11-21 Thread Larry Wall
On Mon, Nov 21, 2005 at 07:49:16PM +0100, Ingo Blechschmidt wrote:
: Aha! FYI, I got that interpretation from r6628 of S09 [1]:
: > The following two constructs are structurally indistinguishable:
: > 
: > (0..10; 1,2,4; 3)
: > ([0..10], [1,2,3,4], [3])

Sorry, started revising that one a couple days ago and got sidetracked...

Larry


Re: Multidimensional argument list binding (*@;foo)

2005-11-21 Thread Larry Wall
On Mon, Nov 21, 2005 at 03:48:30PM +, Luke Palmer wrote:
: To illustrate:
: 
: sub foo (*@a) {
:     say +@a;
: }
: sub bar (*@;a) {
: say +@;a;
: }
: foo(1,2,3; 4,5,6);   # 6
: bar(1,2,3; 4,5,6);   # 2
: 
: That is, the regular *@a has "concat" semantics.  However, I'd like to
: argue that it should have "die" semantics, for obvious reasons.

Well, that can be argued both ways.  The Unix shells get along very well
with default concat semantics, thank you:

(echo foo; echo bar; echo baz) | grep a

And it's rather Perlish to give you a level of flattening for free when
it comes to lists.  And I'd like to be able to distinguish:

    my @foo := gather {
        for @whatever {
            take .generate();
        }
    }

from

    my @;foo := gather {
        for @whatever {
            take .generate();
        }
    }

though I think maybe I'm arguing that the ; there is just documentation
if @;foo and @foo are really the same variable, and it's the differing
usage in rvalue context that desugars @;foo to [;]foo.dims.

Larry
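
The two counts in the quoted example (6 and 2) have a rough Perl 5 analogue in the difference between passing flattened lists and passing array references. This is only an analogy: Perl 5 has no ";" dimension separator, and nothing here implies that a multidimensional list is an array of arrays internally. The sub name is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: counts whatever arrives in @_.
sub count_args { return scalar @_ }

my @first  = (1, 2, 3);
my @second = (4, 5, 6);

# Lists flatten into one argument list: "concat" semantics.
print count_args(@first, @second), "\n";     # 6

# References keep the two groups separate.
print count_args(\@first, \@second), "\n";   # 2
```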


Re: Multidimensional argument list binding (*@;foo)

2005-11-21 Thread Larry Wall
On Sun, Nov 20, 2005 at 09:11:33PM +0100, Ingo Blechschmidt wrote:
: Also, is specifying other, non-slurpy arguments prior to a slurpy
: @;multidim_arglist legal?

Yes, though we have to be careful about what happens when we bind the
entire first dimension and then get a <== boundary.  That's probably
not intended to produce an empty first dimension.  On the other hand,
maybe it just falls out of the policy that .[] is a null dimension
unless you actually put something there, and maybe that extends to
.[;stuff] and even .[stuff].

: E.g.:
: 
: sub bar ($normal_var, @;AoA) {...}
: bar 42, @array1, @array2;
: # $normal_var is 42,
: # @AoA is ([EMAIL PROTECTED], [EMAIL PROTECTED])
: # Correct?

No, these are specifically not AoA.

: The existence of a @array variable does not imply the existence of a
: @;array variable, right?

I think it probably does, or should.  @;array is sugar for something
like [;[EMAIL PROTECTED], presuming that .specs clumps its slices into
single iterators (which it doesn't), and also presuming that you could
use [;] in a declarative context (which you can't).  So it's more like
[;[EMAIL PROTECTED], which is an array of slice generators each of which 
is a sublist of iterators.  It's probably not an array of arrays
internally, but just a list of specs with some of them marked as starting
a new dimension.

We originally were modeling the multidimension stuff on AoA, but we
kept getting tangled up in intentional vs unintentional brackets.
We need to be able to support the userland flat view of an array and
still be able to get at its specs anyway, so this is basically trying
to handle multislices/multidims/multipipes with the same underlying
mechanism.

Larry


Re: test suite refactoring

2005-11-21 Thread chromatic
On Sat, 2005-11-19 at 21:05 +0100, Bernhard Schmalhofer wrote:

> Setting the Perl5 search path can be handled with FindBin. See for 
> example languages/m4/t/basic/001_comletely_empty.t:
> 
>   use FindBin;
>   use lib "$FindBin::Bin/../../lib", "$FindBin::Bin/../../../../lib";

That's fairly ugly to put in the header of every test file.  It would be
nice to avoid such repeated, typo-prone, scary black magic.

-- c



Re: statement_control() (was Re: lvalue reverse and array views)

2005-11-21 Thread Rob Kinyon
On 11/21/05, TSa <[EMAIL PROTECTED]> wrote:
> HaloO,
>
> Luke Palmer wrote:
> > On 11/21/05, Ingo Blechschmidt <[EMAIL PROTECTED]> wrote:
> >
> >>Of course, the compiler is free to optimize these things if it can prove
> >>that runtime's &statement_control: is the same as the internal
> >>optimized &statement_control:.
> >
> >
> > Which it definitely can't without some pragma.
>
> Isn't the question just 'when'? I think at the latest it could be
> optimized JIT before the first execution, or so. The relevant AST
> branch stays for later eval calls which in turn branch off the
> surrounding module's version from within the running system such
> that the scope calling the eval sees the new version. And this in
> turn might be optimized and found unchanged in its optimized form.
>
> Sort of code morphing of really first class code. Everything else
> makes closures second class ;)

This is very close to a proposal I made to the ruby-dev mailing list
(which was Warnocked). I proposed a very basic engine that would work
with the parser/lexer to determine what action to take instead of
using the huge case statements that are the heart of both P5 and Ruby.
It would look something like:

TOKEN:
while ( my $token = get_next_token() ) {
    # Try the longest prefix first, so "..." wins over ".." and ".".
    for my $length ( reverse 1 .. length($token) ) {
        if ( my $actions = find_actions( substr( $token, 0, $length ) ) ) {
            $actions->[-1]->( < params, if necessary > );
            next TOKEN;
        }
    }
    throw SyntaxError;
}

The for-loop + substr() would be to handle longest-token-first rules.
So, "..." is correctly recognized instead of handled as ".." and ".".
The key would be that the $actions arrayref would get push'ed/pop'ed
as you enter/leave a given lexical scope.

Obviously, this could be optimized to an extremely large degree, but
it -should- work.

Rob
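
A self-contained Perl 5 sketch of the longest-token-first idea follows. It is a simplification under stated assumptions, not the proposal itself: the token table, the tokenize sub, and the use of a plain string as input are all hypothetical, and the per-scope stack of actions is left out.

```perl
use strict;
use warnings;

# Hypothetical token table: token text => token name.
my %token_name = (
    '.'   => 'DOT',
    '..'  => 'RANGE',
    '...' => 'YADA',
    '+'   => 'PLUS',
);

# Longest token length, so we never try prefixes longer than any token.
my ($max_len) = sort { $b <=> $a } map { length($_) } keys %token_name;

sub tokenize {
    my ($input) = @_;
    my @tokens;
    my $pos = 0;
  TOKEN:
    while ( $pos < length $input ) {
        my $remaining = length($input) - $pos;
        my $limit     = $remaining < $max_len ? $remaining : $max_len;
        # Longest candidate first, shrinking one character at a time.
        for my $len ( reverse 1 .. $limit ) {
            my $candidate = substr $input, $pos, $len;
            if ( exists $token_name{$candidate} ) {
                push @tokens, $token_name{$candidate};
                $pos += $len;
                next TOKEN;
            }
        }
        die "Syntax error at offset $pos\n";
    }
    return @tokens;
}

print join( ' ', tokenize('...+..') ), "\n";   # YADA PLUS RANGE
```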


Re: apo5

2005-11-21 Thread Ruud H.G. van Tol
Larry Wall:
> Ruud H.G. van Tol:


> dev.perl.org one day latency but html-ified
> svn.perl.org up to the minute but only in pod

Thanks, much better. Can't say that I haven't been there before.

There is a "[[:alpha:][:digit:]" and a "[[:alpha:][:digit]]" on the
A5-page.


>> The '^' could be used for both the ultimate start- and end-of-string.
>> This frees the '$'.
>
> I think this is one of those aspects of regex culture that is too
> entrenched to remove.

Yes, I have experienced that with some of my procmail-recipes that use
'^' to match embedded newlines.
In procmail the '^^' matches begin- or end-of-string. Both a '^' and a
'$' can be used to match a real or putative newline. Some people
replaced my '^'s with '$'s.

OK, everybody can stop reading here, no serious attempts below.

"Within C++, there is a much smaller and cleaner language struggling to
get out," which "would ... have been an unimportant cult language."
(Bjarne Stroustrup, The Design and Evolution of C++).


> Besides, you have to be able to distinguish
> s/^/foo/ from s/$/foo/.

's/$/foo/' becomes 's//foo/'



>> There is still the '$$' that matches before embedded newlines, and
>> since '^^' matches after those newlines, the '^^' and '$$' can only
>> be unified to '^^' if it is one-width inside a string, so is like
>> '[$$\n^^]' (or just '\n') there.
>
> But then if you use it within a capture, you get an extra newline you
> probably don't want.

Place the ^^ outside the ().

I wasn't sure about the default for the greediness of '^^' at begin- or
end-of-string, I guess non-greediness can be arranged with a trailing
'?'.


>> At start- and end-of-string the '^^' can still be a zero-width match.
>> I am not sure about greedy (meaning to try one-width first) or
>> non-greedy.
>>
>> Example: '^[(\N*)^^]*^' to capture all lines, clean of newlines.
>> Not a lot clearer than '^[(\N*)\n*]*$', but freeing the '$' and '$$'
>> might be worth it.
>
> I don't think it's any clearer.

Pardon my Dutch, I didn't find it clearer either ("but, might be worth
it").


> In fact, I find all the ^'s there
> are a little too visually confusing and contextual.

/^            # BoS
   [          # start of non-capturing group
     (\N*)    # capture a substring of non-newlines
     ^^       # newline or EoS
   ]*         # end of non-capturing group, repeat
 ^/x          # EoS

As I just said, I am used to '^^' as start- and end-of-buffer, and '^'
as matching a real or putative newline, because of procmail.

-- 
Grtz, Ruud



Re: Multidimensional argument list binding (*@;foo)

2005-11-21 Thread Ingo Blechschmidt
Hi,

Luke Palmer wrote:
> On 11/21/05, Ingo Blechschmidt <[EMAIL PROTECTED]> wrote:
>> Hm. How is (*@;AoA) different from (Array [EMAIL PROTECTED]) then?
>> (Assuming that foo(@a; @b) desugars to foo([EMAIL PROTECTED], [EMAIL PROTECTED]).)
> 
> Well, it's not at all, under that assumption.  But that assumption is
> wrong.

Aha! FYI, I got that interpretation from r6628 of S09 [1]:
> The following two constructs are structurally indistinguishable:
> 
> (0..10; 1,2,4; 3)
> ([0..10], [1,2,3,4], [3])

> I think foo(@a; @b) doesn't have a sugar-free form (that is to
> say, it is the sugar-free form).  Among things that desugar to it:
> 
> @a ==> foo() <== @b
> foo(@a) <== @b
> @a ==> @b ==> foo()   # maybe; don't remember
> 
> To illustrate:
> 
> sub foo (*@a) {
>     say +@a;
> }
> sub bar (*@;a) {
> say +@;a;
> }
> foo(1,2,3; 4,5,6);   # 6
> bar(1,2,3; 4,5,6);   # 2
> 
> That is, the regular *@a has "concat" semantics.  However, I'd like to
> argue that it should have "die" semantics, for obvious reasons.

Just to clarify -- only ";" with "*@;a" should have "die" semantics, ","
with "*@;a" should continue to work, right? (If so, I agree.)

Could you provide some more examples with ;, please? In particular, what
are the results of the following expressions?

(42; 23)
(@a; @b)
(@a; @b)[0]
(@a; @b)[0][0]

((42;23); (17;19))
((@a;@b); (@c;@d))

*(42; 23)
*(@a; @b)

( (42; 23), 19)
(*(42; 23), 19)

[42; 23]
[EMAIL PROTECTED]; @b]


Thanks very much,

--Ingo

[1] http://svn.perl.org/perl6/doc/trunk/design/syn/S09.pod
/The semicolon operator



Re: statement_control() (was Re: lvalue reverse and array views)

2005-11-21 Thread Larry Wall
On Mon, Nov 21, 2005 at 10:45:56AM -0800, Larry Wall wrote:
: Another issue in "if" optimization is whether the blocks in fact do
: anything blockish that have to be scoped to the block.  This is a
: determination that Perl 5 makes when it's compiling blocks.  It's
: basically an attribute that migrates up the tree from the leaves, which
: are mostly "true", but anyone in the block can falsify the attribute
: for the block as a whole.

Actually, I said that backwards.  It starts out false and gets truified
if anyone says "Yes, we gotta have a block around us."

Larry
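
The corrected description (the attribute starts out false and becomes true if anything inside asks for a block) amounts to a logical OR propagated up the tree. The sketch below illustrates that idea only; the node layout and the needs_scope flag are hypothetical and are not Perl 5's actual internals.

```perl
use strict;
use warnings;

# Hypothetical AST node: { needs_scope => 0|1, kids => [ ... ] }.
# The attribute starts out false and becomes true as soon as any node
# inside the block asks for a scope.
sub block_needs_scope {
    my ($node) = @_;
    return 1 if $node->{needs_scope};
    for my $kid ( @{ $node->{kids} || [] } ) {
        return 1 if block_needs_scope($kid);
    }
    return 0;
}

my $plain_block  = { kids => [ { kids => [] }, {} ] };
my $scoped_block = { kids => [ {}, { kids => [ { needs_scope => 1 } ] } ] };

print block_needs_scope($plain_block),  "\n";   # 0
print block_needs_scope($scoped_block), "\n";   # 1
```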


Re: statement_control() (was Re: lvalue reverse and array views)

2005-11-21 Thread Larry Wall
On Mon, Nov 21, 2005 at 03:51:19PM +, Luke Palmer wrote:
: On 11/21/05, Ingo Blechschmidt <[EMAIL PROTECTED]> wrote:
: > Of course, the compiler is free to optimize these things if it can prove
: > that runtime's &statement_control: is the same as the internal
: > optimized &statement_control:.
: 
: Which it definitely can't without some pragma.

But remember that on some level or other, all declarations function as
pragmas.  So the absence of a redeclaration of "if" could be taken as
a kind of pragma, if we require control redefinition to be lexically
scoped, which we probably should.

: I wonder if they should be macros.  (Macros that would by default
: expand to things that aren't expressible in Perl 6)

Which is another way of saying that control redefinitions should be
lexically scoped, since macros are required to do lexically scoped
syntax modification unless they're Preluditudinous.

Another issue in "if" optimization is whether the blocks in fact do
anything blockish that have to be scoped to the block.  This is a
determination that Perl 5 makes when it's compiling blocks.  It's
basically an attribute that migrates up the tree from the leaves, which
are mostly "true", but anyone in the block can falsify the attribute
for the block as a whole.

Arguably, when you use ??!! and friends, it should also be doing such
analysis on the lazy bits and telling you that your "my" is badly
scoped if it's in conditional code.  That also catches

my $x = 0 if rand 2;

Larry


Re: statement_control() (was Re: lvalue reverse and array views)

2005-11-21 Thread TSa

HaloO,

Luke Palmer wrote:
> On 11/21/05, Ingo Blechschmidt <[EMAIL PROTECTED]> wrote:
>
>> Of course, the compiler is free to optimize these things if it can prove
>> that runtime's &statement_control: is the same as the internal
>> optimized &statement_control:.
>
> Which it definitely can't without some pragma.


Isn't the question just 'when'? I think at the latest it could be
optimized JIT before the first execution, or so. The relevant AST
branch stays for later eval calls which in turn branch off the
surrounding module's version from within the running system such
that the scope calling the eval sees the new version. And this in
turn might be optimized and found unchanged in its optimized form.

Sort of code morphing of really first class code. Everything else
makes closures second class ;)
--


Re: \x{123a 123b 123c}

2005-11-21 Thread Larry Wall
On Mon, Nov 21, 2005 at 09:02:57AM -0800, Larry Wall wrote:
: But I'd like to reserve < > for delimiting what is returned by $<>,
: the string officially matched:
: 
: "foo bar baz" ~~ /:w foo < \w+ > baz/
: say $/;   # foo bar baz
: say $<>;  # bar

Though it occurs to me that there's another possible interpretation,
culturally speaking.  The overloading of \b has always bothered me,
plus the fact that \b can't distinguish which kind of word boundary
without additional context.  In regex culture, we have the \<...\>
word matcher, and maybe that devolves to isolated < ... > in rules.

We could still use << ... >> to capture $<>, which I was leaning toward
anyway just for visibility reasons, since the two ends could be quite
far apart.

And file globbing could just be :glob or some such if we really need
to embed it in rules.

Larry
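
The ambiguity of \b is easy to show in Perl 5, where the two kinds of word boundary have to be spelled with lookarounds. This sketch only illustrates the distinction being discussed; it does not propose syntax, and the variable names are illustrative.

```perl
use strict;
use warnings;

my $text = "foo bar";

# \b matches at both ends of a word; lookarounds can tell them apart.
my $word_start = qr/(?<!\w)(?=\w)/;
my $word_end   = qr/(?<=\w)(?!\w)/;

(my $marked = $text) =~ s/$word_start/</g;
$marked =~ s/$word_end/>/g;

print "$marked\n";   # <foo> <bar>
```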


Re: apo5 (was: Re: \x{123a 123b 123c})

2005-11-21 Thread Larry Wall
On Mon, Nov 21, 2005 at 05:49:59PM +0100, Ruud H.G. van Tol wrote:
: Larry Wall:
: > Juerd:
: >> Ruud:
: 
: >>> Maybe
: >>> "\x{123a 123b 123c}"
: >>> is a nice alternative of
: >>> "\x{123a} \x{123b} \x{123c}".
: >>
: >> Hmm, very cute and friendly! Can we keep it, please? Please?
: 
: Thanks for the support.

Hey, this ain't exactly a popularity contest here...  :-)

: > We already have, from A5, \x[0a;0d], so you can supposedly say
: > "\x[123a;123b;123c]"
: 
: 
: Found it in the old/new table on page 7. For me the semicolon is fine.

The fact that you say "page 7" leads me to guess that you're reading
it from perl.com.  That's going to be the most out-of-date version.
Better would be

dev.perl.org    one day latency but html-ified
svn.perl.org    up to the minute but only in pod

In particular, the Apocalypses have little [Update:] sections that are
supposed to alert you to things that have changed since the Apo
was written.  (Though some of those are a little out of date right now
too--I'm just working my way through A12 again.)

: I am using character names more and more, and between those, semicolons
: are less cluttery. Character names can contain spaces, but semicolons
: too? If not then
: \c[BEL; EXTENDED ARABIC-INDIC DIGIT ZERO] would be possible, but maybe
: better not, or more like
: \c['BEL'; 'EXTENDED ARABIC-INDIC DIGIT ZERO'] or even
: \c('BEL', 'EXTENDED ARABIC-INDIC DIGIT ZERO').

None of the current names contain either semicolon or comma, so I expect
they're avoiding them by policy.

: Something else:
: The '^' could be used for both the ultimate start- and end-of-string.
: This frees the '$'.

I think this is one of those aspects of regex culture that is too
entrenched to remove.  Besides, you have to be able to distinguish
s/^/foo/ from s/$/foo/.

: There is still the '$$' that matches before embedded newlines, and since
: '^^' matches after those newlines, the '^^' and '$$' can only be unified
: to '^^' if it is one-width inside a string, so is like '[$$\n^^]' (or
: just '\n') there.

But then if you use it within a capture, you get an extra newline you
probably don't want.

: At start- and end-of-string the '^^' can still be a zero-width match.
: I am not sure about greedy (meaning to try one-width first) or
: non-greedy.
: 
: Example: '^[(\N*)^^]*^' to capture all lines, clean of newlines.
: Not a lot clearer than '^[(\N*)\n*]*$', but freeing the '$' and '$$'
: might be worth it.

I don't think it's any clearer.  In fact, I find all the ^'s there
are a little too visually confusing and contextual.

Larry


Re: test suite refactoring

2005-11-21 Thread jerry gay
On 11/20/05, jerry gay <[EMAIL PROTECTED]> wrote:
> for now, i've reorganized the pge tests, moving them into the
> t/compilers/pge/ directory and subdirs, in revision 10112. smoke tests
> and bug reports are welcome for all platforms. in testing, i've come
> across a bug that may affect msvc6 on win32 (no problem with msvc7.) i'm
> hunting this down, and once fixed, i'll move on to tge, imcc, etc.
>
test suite reorganization is ongoing, now that i've fixed the makefile
bug causing msvc6 to fail on win32. as of revision 10123, TGE's tests
have been moved to t/compilers/tge/. in addition, tge has been added
to the root makefile, causing it to be built and tested by default.

while i continue to investigate imcc's tests, i'll also be looking at
test file order in the main makefile, and standardization of test file
headers (use warnings; use Test::More; etc.)
~jerry


Re: \x{123a 123b 123c}

2005-11-21 Thread Patrick R. Michaud
On Mon, Nov 21, 2005 at 03:23:35PM +0100, TSa wrote:
> Patrick R. Michaud wrote:
> >There's also , unless someone redefines the  subrule.
> >And in the general case that's a slightly more expensive mechanism 
> >to get a space (it involves at least a subrule lookup).  Perhaps 
> >we could also create a visible meta sequence for it, in the same 
> >way that we have visible metas for \e, \f, \r, \t.  But I have 
> >no idea what letter we might use there.
> 
> How about \x and \X respectively? Note the *space* after it :)
> ...

If we're going to do that, I'd think it would be "\c " and "\C " 
instead of "\x " and "\X ".  I'm not really advocating this,
I'm just commenting that in this case \c seems more natural 
than \x.

Pm


Re: \x{123a 123b 123c}

2005-11-21 Thread Larry Wall
On Sun, Nov 20, 2005 at 10:27:17AM -0600, Patrick R. Michaud wrote:
: On Sat, Nov 19, 2005 at 06:32:17PM -0800, Larry Wall wrote:
: > On Sun, Nov 20, 2005 at 01:26:21AM +0100, Juerd wrote:
: > : Ruud H.G. van Tol skribis 2005-11-20  1:19 (+0100):
: > : > Maybe 
: > : > "\x{123a 123b 123c}" 
: > : > is a nice alternative of 
: > : > "\x{123a} \x{123b} \x{123c}". 
: > 
: > We already have, from A5, \x[0a;0d], so you can supposedly say 
: > "\x[123a;123b;123c]" 
: 
: Hmm, I hadn't caught that particular syntax in A05.  AFAIK it's not 
: in S05, so I should probably add it, or whatever syntax we end up 
: adopting.

Yes.

: (BTW, we haven't announced it on p6l yet, but there's a new version of
: S05 available.)

Indeed, there are new versions of most of the S's.  People who want the
latest should use svn.perl.org, which also makes it easy to do diff listings
with svn or svk.

: > [...]
: > But I see that the semicolon is rather cluttery, mainly because it's
: > too tall.  I'm not sure going all the way to space is good, but we
: > might have
: > "\x[123a,123b,123c]" 
: > just to get a little visual space along with the separator.  
: 
: Just to verify, with this syntax would we expect
: 
: \x[123a,123b,123c]+
: 
: to be the same as
: 
: [\x123a \x123b \x123c]+
: 
: and not "\x123a \x123b \x123c+" ?

Yes.  I think the rule interpretation of \x is that it is a sequence to
be considered a single character regardless of its context.  Certainly
the square brackets we've mandated would tend to read as grouping anyway.

Of course, the main point of the \x[a,b,c] notation is to allow
interpolation of sequences of hex characters into ordinary strings,
and those don't care about abstract character boundaries.
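
To make the two readings concrete, here is a rough Python model (not Perl 6) of the \x[a,b,c] idea. The helper name hex_seq is hypothetical, and accepting both ';' (the A5 form) and ',' (the form floated above) is an assumption made only for illustration. In string context the result is just a run of code points; in rule context the whole run is quantified as one group.

```python
import re

def hex_seq(spec: str) -> str:
    """Expand a list of hex code points, e.g. '123a,123b,123c', to a string.

    Hypothetical helper: accepts ';' or ',' as the separator.
    """
    parts = [p.strip() for p in re.split(r"[;,]", spec)]
    return "".join(chr(int(p, 16)) for p in parts)

# String context: simply three characters in a row.
s = hex_seq("123a,123b,123c")

# Rule context: the sequence is quantified as a unit, i.e. it behaves like
# [\x123a \x123b \x123c]+ and not like \x123a \x123b \x123c+.
grouped = re.compile("(?:" + re.escape(s) + ")+")

print(bool(grouped.fullmatch(s * 2)))      # whole sequence repeated: True
print(bool(grouped.fullmatch(s + s[-1])))  # only last char repeated: False
```
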

: > It occurs to me that we didn't spec whether character classes ignore
: > whitespace.  They probably should, just so you can chunk things:
: > 
: > / <[ a..z A..Z 0..9 _ ]> /
: > 
: > Then the question arises about whether <[ \ ]> is an escaped space
: > or a backslash, or illegal  
: 
: I vote that it's an escaped space.  A backslash is nearly always \\
: (or should be imho).
: 
: > But if we make it match a backslash
: > or illegal, then the minimal space matcher becomes \x20, I think,
: > unless you graduate to \s.  On the other hand, if we make it match
: > a space, people aren't going to read that way unless they're pretty
: > sophisticated...
: 
: There's also , unless someone redefines the  subrule.

But you can't use  in a character class.  Well, that is, unless
you write it:

<+[ a..z ]+>

or some such.  Maybe that's good enough.

: And in the general case that's a slightly more expensive mechanism 
: to get a space (it involves at least a subrule lookup).  Perhaps 
: we could also create a visible meta sequence for it, in the same 
: way that we have visible metas for \e, \f, \r, \t.  But I have 
: no idea what letter we might use there.

Something to be said for \_ in that regard.

: I don't think I like this, but perhaps  C<< <> >> becomes  
: and C<< < > >> becomes <' '>?  Seems like not enough visual distinction
: there...

<_> maybe.  I'm good with <> being , and <,> being element boundary
when matching lists.  But I'd like to reserve < > for delimiting what
is returned by $<>, the string officially matched:

"foo bar baz" ~~ /:w foo < \w+ > baz/
say $/; # foo bar baz
say $<>;# bar

Or possibly

"foo bar baz" ~~ /:w foo << \w+ >> baz/

but that should probably mean whatever

"foo bar baz" ~~ /:w foo « \w+ » baz/

eventually means.  Which I haven't the foggiest.  But we should probably
reserve the brackets on general principle's sake, just because brackets
are so scarce.
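
For readers who want the "< \w+ >" proposal spelled out, the following is a small Python analogy rather than Perl 6: the whole pattern has to match, but only a marked part counts as the string officially matched. The group name "official" is made up for this sketch, and \s+ stands in for what :w would do with the literal spaces.

```python
import re

# The full pattern must match, but only the named group plays the role of $<>.
m = re.search(r"foo\s+(?P<official>\w+)\s+baz", "foo bar baz")

print(m.group(0))           # foo bar baz  (analogous to $/)
print(m.group("official"))  # bar          (analogous to $<>)
```
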

I dunno.  If «...» in ordinary code does shell quoting, maybe «...» in
rules does filename globbing or some such.  I can see some issues with
anchoring semantics.  Makes more sense on a string as a whole, but maybe
can anchor on element boundaries if used on a list of filenames.
I suppose one could even go as far as

rule jpeg :i « *.jp{e,}g »

or whatever the right glob syntax is.

Larry


apo5 (was: Re: \x{123a 123b 123c})

2005-11-21 Thread Ruud H.G. van Tol
Larry Wall:
> Juerd:
>> Ruud:

>>> Maybe
>>> "\x{123a 123b 123c}"
>>> is a nice alternative of
>>> "\x{123a} \x{123b} \x{123c}".
>>
>> Hmm, very cute and friendly! Can we keep it, please? Please?

Thanks for the support.


> We already have, from A5, \x[0a;0d], so you can supposedly say
> "\x[123a;123b;123c]"


Found it in the old/new table on page 7. For me the semicolon is fine.

I am using character names more and more, and between those, semicolons
are less cluttery. Character names can contain spaces, but semicolons
too? If not then
\c[BEL; EXTENDED ARABIC-INDIC DIGIT ZERO] would be possible, but maybe
better not, or more like
\c['BEL'; 'EXTENDED ARABIC-INDIC DIGIT ZERO'] or even
\c('BEL', 'EXTENDED ARABIC-INDIC DIGIT ZERO').



Something else:
The '^' could be used for both the ultimate start- and end-of-string.
This frees the '$'.

There is still the '$$' that matches before embedded newlines, and since
'^^' matches after those newlines, the '^^' and '$$' can only be unified
to '^^' if it is one-width inside a string, so is like '[$$\n^^]' (or
just '\n') there.
At start- and end-of-string the '^^' can still be a zero-width match.
I am not sure about greedy (meaning to try one-width first) or
non-greedy.

Example: '^[(\N*)^^]*^' to capture all lines, clean of newlines.
Not a lot clearer than '^[(\N*)\n*]*$', but freeing the '$' and '$$'
might be worth it.



-- 
Affijn, Ruud

"Gewoon is een tijger."



Re: statement_control() (was Re: lvalue reverse and array views)

2005-11-21 Thread Luke Palmer
On 11/21/05, Ingo Blechschmidt <[EMAIL PROTECTED]> wrote:
> Of course, the compiler is free to optimize these things if it can prove
> that runtime's &statement_control: is the same as the internal
> optimized &statement_control:.

Which it definitely can't without some pragma.

I wonder if they should be macros.  (Macros that would by default
expand to things that aren't expressible in Perl 6)

Luke


Re: Multidimensional argument list binding (*@;foo)

2005-11-21 Thread Luke Palmer
On 11/21/05, Ingo Blechschmidt <[EMAIL PROTECTED]> wrote:
> Hm. How is (*@;AoA) different from (Array [EMAIL PROTECTED]) then? (Assuming that
> foo(@a; @b) desugars to foo([EMAIL PROTECTED], [EMAIL PROTECTED]).)

Well, it's not at all, under that assumption.  But that assumption is
wrong.  I think foo(@a; @b) doesn't have a sugar-free form (that is to
say, it is the sugar-free form).  Among things that desugar to it:

@a ==> foo() <== @b
foo(@a) <== @b
@a ==> @b ==> foo()   # maybe; don't remember

To illustrate:

sub foo ([EMAIL PROTECTED]) {
say [EMAIL PROTECTED];
}
sub bar (*@;a) {
say +@;a;
}
foo(1,2,3; 4,5,6);   # 6
bar(1,2,3; 4,5,6);   # 2

That is, the regular [EMAIL PROTECTED] has "concat" semantics.  However, I'd like to
argue that it should have "die" semantics, for obvious reasons.
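
The difference between the two bindings can be modelled in a few lines of Python (a sketch, not Perl 6). Each positional argument below stands for one semicolon-separated dimension of the call; foo imitates the "concat" behaviour of the regular slurpy, and bar imitates the (*@;a) form that keeps the dimensions apart.

```python
def foo(*dims):
    """Models the regular slurpy: all dimensions are concatenated."""
    flat = [item for dim in dims for item in dim]
    return len(flat)

def bar(*dims):
    """Models the multidimensional slurpy (*@;a): dimensions stay separate."""
    return len(dims)

print(foo([1, 2, 3], [4, 5, 6]))  # 6
print(bar([1, 2, 3], [4, 5, 6]))  # 2
```

Under the "die" semantics argued for above, the model of foo would instead raise an error whenever it receives more than one dimension.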

Luke


Re: till (the flipflop operator, formerly ..)

2005-11-21 Thread Ingo Blechschmidt
Hi,

Larry Wall wrote:
> On Sun, Nov 20, 2005 at 08:51:03PM +0100, Ingo Blechschmidt wrote:
> : according to the new S03, till is the new name for the flipflop
> : operator.
> 
> Presuming we can make it work out as an infix macro.

Ah, it's a macro. This clarifies things.

> : Do the flipflop operators of subroutines maintain own
> : per-invocation-of-the-sub states? I.e.:
> : 
> : sub foo (&x) { x() till 0 }
> : 
> : foo { 0 };  # evaluates to a false value, of course
> : 
> : foo { 1 };  # evaluates to a true value, of course
> : foo { 0 };
> : # still true?
> : #   (Argumentation: The flipflop is in the "true" state,
> : #   so the LHS is not evaluated.)
> : # Or is it false?
> : #   (Argumentation: The flipflop operator of the previous
> : #   invocation is not the flipflop operator of the current
> : #   invocation, so the return value is false.)
> 
> It's still true.  Ignoring the "E0" issue, the desugar of "A till B"
> is something like:
[...]

Thanks very much, this code is very clear. :)
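
For the archive, the "still true" answer can be reproduced with a small Python model. This is only a sketch of the behaviour discussed here, not the desugar itself: the class name FlipFlop is hypothetical, the "E0" issue and the ^till variants are ignored, and the right-hand side is checked in the same call that switches the state on (as Perl 5's two-dot form does). The point is that the state belongs to the operator's place in the source, not to one invocation of foo.

```python
class FlipFlop:
    """One instance stands for one `till` in the source text."""

    def __init__(self):
        self.on = False

    def __call__(self, lhs, rhs):
        # lhs and rhs are zero-argument callables so that they are only
        # evaluated when the flipflop actually needs them.
        if not self.on:
            if not lhs():
                return False
            self.on = True
        # Active: consult the right-hand side. The endpoint itself still
        # counts as true; only the following call starts from "off".
        if rhs():
            self.on = False
        return True

_till = FlipFlop()  # shared by every call to foo, like a state variable

def foo(x):
    # Models: sub foo (&x) { x() till 0 }
    return _till(x, lambda: False)

print(foo(lambda: False))  # False
print(foo(lambda: True))   # True
print(foo(lambda: False))  # True: already on, so the LHS is not consulted
```
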

> : Also, all operators can be called using the subroutine form (which
> : is a very good thing), e.g.:
> : 
> : say infix:<->(42, 19);  # 23
> : 
> : Is this true for till as well?
> : 
> : say infix:<till>(LHS, RHS);
> 
> Probably not.  Calling macros as functions is a bit of a problem.

Yep. (I assumed &infix: would be an ordinary subroutine.)

> : But how would &infix: maintain the state then, as no explicit
> : ID is passed to it? Does &infix: access an internal %states
> : hash, using $CALLER::POSITION as keys?
> 
> That feels like a hack to me.  I'd rather find a way of poking a real
> state variable into the caller's scope if we have to support that.

Agreed. The desugar you provided feels far more sane.

> : Perl 5's flipflop operator appends "E0" to the final sequence number
> : in a range, allowing searches for /E/. My guess is that this is
> : superseded by "$sequence_number but
> : this_is_the_endpoint_of_the_range" (you get the idea). Correct?
> 
> I was just thinking that you'd use till^ if you wanted to exclude the
> endpoint.  And ^till to exclude the beginning, and ^till^ to exclude
> both, just as with ..^, ^.., and ^..^.

Ok.

> In fact, that's really my main motivation for wanting it to be infix.
> Otherwise it might as well be an ordinary flipflop() macro, or
> fromto().

Makes sense.


--Ingo



Re: statement_control() (was Re: lvalue reverse and array views)

2005-11-21 Thread Ingo Blechschmidt
Hi,

Rob Kinyon wrote:
> On 11/20/05, Ingo Blechschmidt <[EMAIL PROTECTED]> wrote:
>> Yep. Also note that "for" is not a special magical construct in Perl
>> 6, it's a simple subroutine (&statement_control:<for>, with the
>> signature ([EMAIL PROTECTED], Code *&code)). (Of course, it'll usually be
>> optimized.)
>>
>> Example:
>>
>> {
>> my sub statement_control:<for> ([EMAIL PROTECTED], Code *&code) {
>> map &code, reverse @array;
>> }
>>
>> for <a b c> -> $item { say $item }
>> # "c\nb\na\n"
>> }
>>
>> # for restored, as the modified for went out of scope:
>> for <a b c> -> $item { say $item }
>> # "a\nb\nc\n"
> 
> Is there a list of the statement control items that are implemented as
> such vs. implemented in another way?

&statement_control:,
&statement_control:,
&statement_control:,
&statement_control:,
&statement_control:, and
&statement_control:

come to my mind.

??!! is probably defined as

sub ternary: ($cond, $then is lazy, $else is lazy) {
if $cond { $then } else { $else }
}

(Assuming that "ternary" is the correct grammatical category and "is
lazy" DWIMs.)
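
Why "is lazy" matters can be shown with a short Python sketch (not Perl 6). Python evaluates arguments eagerly, so the two branches are passed as zero-argument callables; the names ternary and branch are made up for this illustration. Only the chosen branch is ever evaluated, which is the behaviour the ??!! operator needs.

```python
def ternary(cond, then, otherwise):
    """then/otherwise are thunks; only the selected one is called."""
    return then() if cond else otherwise()

calls = []  # records which branches were actually evaluated

def branch(name):
    def thunk():
        calls.append(name)
        return name
    return thunk

result = ternary(True, branch("then"), branch("else"))
print(result, calls)  # then ['then']
```
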

Of course, the compiler is free to optimize these things if it can prove
that runtime's &statement_control: is the same as the internal
optimized &statement_control:.


--Ingo



Re: Multidimensional argument list binding (*@;foo)

2005-11-21 Thread Ingo Blechschmidt
Hi,

Luke Palmer wrote:
> On 11/20/05, Ingo Blechschmidt <[EMAIL PROTECTED]> wrote:
>> sub foo (*@;AoA) { @;AoA }
>>
>> my @array1 = <a b c>;
>> my @array2 = <d e f>;
>>
>> my @AoA = foo @array1, @array2;
>> say [EMAIL PROTECTED]; # 2?
> 
> 1
> 
>> say [EMAIL PROTECTED];  # a b c?
> 
> a b c d e f
> 
> However,
> 
> my @AoA = foo(@array1; @array2);
> # all of Ingo's predictions are now correct

Hm. How is (*@;AoA) different from (Array [EMAIL PROTECTED]) then? (Assuming that
foo(@a; @b) desugars to foo([EMAIL PROTECTED], [EMAIL PROTECTED]).)


--Ingo



Re: \x{123a 123b 123c}

2005-11-21 Thread TSa

HaloO,

Patrick R. Michaud wrote:

There's also , unless someone redefines the  subrule.
And in the general case that's a slightly more expensive mechanism 
to get a space (it involves at least a subrule lookup).  Perhaps 
we could also create a visible meta sequence for it, in the same 
way that we have visible metas for \e, \f, \r, \t.  But I have 
no idea what letter we might use there.


How about \x and \X respectively? Note the *space* after it :)
I mean that much more seriously than it might sound, err, read.
I hope the concept of unwritten things in the source being
interesting values of void/undef applies always.

OTOH, I'm usually not saying anything in the area of the grammar
subsystem, but I still try to wrap my brain around the underlying
unified conceptual level where rules and methods or subs and macros
are indistinguishable. So, please consider this as a well-meaning
question. And please forgive the syntax errors.

With something like

   # or token? perhaps even sub?
   macro   x ( HexLiteral *[$char = 32, [EMAIL PROTECTED] )
   is parsed( * )
   {...}

and \ in match strings escaping out to the macro level when
the circumfix match creator is invoked, I would expect

   m/  \x   /;  # single space is required
   m/  \x20 /;  # same
   m/ <{x}> /;  # same?
   m/  \X   /;  # any single char except space
   m/  \x\x\x   /;  # exactly three spaces
   m/  \x[20,20,20] /;  # same, as proposed by Larry
   m/  \xy  /;  # parse error 'y not a hex digit'
   m/  \x y /;  # one space then y

to insert verbatim, machine level chars into the match definition.
In particular *no* lookup is compiled in.

I would call \x the single character *exact* matcher and \X
the *excluder*. BTW, the definition of the latter could just be

   &X ::= !&x; # or automagically defined by up-casing and outer negation

if ? and ! play in the meta operator league.


I don't think I like this, but perhaps  C<< <> >> becomes  
and C<< < > >> becomes <' '>?  Seems like not enough visual distinction
there...


I strongly agree. I would ask the moot question *how* the single space
in / / is removed ---as leading, trailing or separating space---when the
parser goes over it. But I would never expect the source space to make it
into the compiled match code!
--


Re: Call frame introspection (was Re: PDD20 - Call frames as PMCs)

2005-11-21 Thread Nicholas Clark
On Tue, Nov 15, 2005 at 10:30:38AM -0800, Chip Salzenberg wrote:

> [*] an inode may have as few as zero or as many as USHORT_MAX[**] names,
> and finding them all requires scanning a disk's entire directory tree

Although one should note that you can lose valid names off the top of your
directory tree with chroot, and from the leaves by mounting another filing
system over a non-empty directory.

(But of course if you're wondering why your filing system is fuller than it
should be given the size of all the files, it's more likely to be open but
unlinked files rather than the above 2 causes).

Ooops. Digression.

Nicholas Clark


Re: RESPONSIBLE_PARTIES

2005-11-21 Thread Michael Lacey
"RESPONSIBLE" - is that like "BLAMEABLE"? *smile*
  Mike

 On 21/11/05, Leopold Toetsch <[EMAIL PROTECTED]> wrote:
>
>
> On Nov 20, 2005, at 22:09, Joshua Hoblitt wrote:
>
> > I'd like to nominate Jerry for an entry in RESPONSIBLE_PARTIES as the
> > test suite maintainer. Thanks for all your work Jerry.
> >
> > Any objections?
>
> Not at all - more the opposite ;-)
>
> > -J
>
> leo
>
>


--
Mike Lacey
Project Manager
Partner Services
07717 458 268


Re: RESPONSIBLE_PARTIES

2005-11-21 Thread Leopold Toetsch


On Nov 20, 2005, at 22:09, Joshua Hoblitt wrote:


I'd like to nominate Jerry for an entry in RESPONSIBLE_PARTIES as the
test suite maintainer.  Thanks for all your work Jerry.

Any objections?


Not at all - more the opposite ;-)


-J


leo