Re: on parrot strings

2002-01-18 Thread Piers Cawley
Hong Zhang <[EMAIL PROTECTED]> writes: >> > preprocessing. Another example, if I want to search for /resume/e, >> > (equivalent matching), the regex engine can normalize the case, fully >> > decompose input string, strip off any combining character, and do 8-bit >> >> Hmmm. The above sounds co
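The preprocessing steps quoted above (normalize case, fully decompose, strip combining characters) can be sketched with Python's standard `unicodedata` module. This is only an illustration of the idea, not the Parrot regex engine: the function names `fold_for_equivalent_match` and `equivalent_search` are hypothetical, and a plain substring test stands in for real regex matching.

```python
import unicodedata


def fold_for_equivalent_match(text: str) -> str:
    """Case-fold, fully decompose (NFD), then drop combining characters."""
    decomposed = unicodedata.normalize("NFD", text.casefold())
    # unicodedata.combining() returns 0 for non-combining characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def equivalent_search(pattern: str, text: str) -> bool:
    """Hypothetical helper: substring test after folding both sides."""
    return fold_for_equivalent_match(pattern) in fold_for_equivalent_match(text)
```

With this folding, a search for "resume" would also find "Résumé", whether the accented letters arrive precomposed or as a base letter plus a combining accent.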

Re: A question

2002-01-18 Thread Piers Cawley
[reformatting response for readability and giving Glenn a stiff talking to] Glenn Linderman <[EMAIL PROTECTED]> writes: > Piers Cawley wrote: > >> Okay boys and girls, what does this print: >> >> my @aaa = qw/1 2 3/; >> my @bbb = @aaa; >> >> try { >> print "$_\n"; >> } >> >> for @aaa; @bbb ->

Re: Does this mean we get Ruby/CLU-style iterators?

2002-01-18 Thread Piers Cawley
Larry Wall <[EMAIL PROTECTED]> writes: > Michael G Schwern writes: > : Reading this in Apoc 4 > : > : sub mywhile ($keyword, &condition, &block) { > : my $l = $keyword.label; > : while (&condition()) { > : &block(); > : CATCH { > : my $

Parrot strings

2002-01-18 Thread Melvin Smith
Anyone have any objection to adding a couple of calls to terminate and/or return null terminated strings from Parrot strings for places where an API expects a standard C string? I'm not sure of the preferred way to handle this. It would be nice to at least try to terminate the current string buff
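A rough sketch of the idea, written in Python rather than C so it can be run as-is. Everything here is hypothetical: `CountedString` merely stands in for a length-counted string buffer and is not Parrot's actual string structure, and rejecting embedded NUL bytes is one possible policy, not something the message proposes.

```python
class CountedString:
    """Hypothetical length-counted string: a buffer plus an explicit length."""

    def __init__(self, data: bytes, capacity: int = 0):
        self.length = len(data)
        # The buffer may be larger than the string it currently holds.
        self.buffer = bytearray(max(capacity, self.length))
        self.buffer[: self.length] = data

    def to_cstring(self) -> bytes:
        """Terminate the buffer in place and return a NUL-terminated copy."""
        if 0 in self.buffer[: self.length]:
            # Assumed policy: a C API would silently truncate at the first NUL.
            raise ValueError("embedded NUL cannot be represented in a C string")
        if len(self.buffer) == self.length:
            self.buffer.append(0)  # no spare room: grow by one byte
        else:
            self.buffer[self.length] = 0  # spare room: write the terminator
        return bytes(self.buffer[: self.length + 1])
```

The stored length is left unchanged, so the terminator lives just past the end of the string and costs nothing for code that only uses the counted length.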

Re: Does this mean we get Ruby/CLU-style iterators?

2002-01-18 Thread Larry Wall
Piers Cawley writes: : Hmm... making up some syntax on the fly. I sort of like the idea of : being able to do : : class File; : sub foreach ($file, &block) is Control { : # 'is Control' declares this as a control sub, which, amongst : # other things 'hides' itself from cal

Re: Does this mean we get Ruby/CLU-style iterators?

2002-01-18 Thread Larry Wall
Michael G Schwern writes: : Reading this in Apoc 4 : : sub mywhile ($keyword, &condition, &block) { : my $l = $keyword.label; : while (&condition()) { : &block(); : CATCH { : my $t = $!.tag; : when X::Control::next { die

Re: Apo4: PRE, POST

2002-01-18 Thread Glenn Linderman
Me wrote: > > [concerns over conflation of post-processing and post-assertions] > > Having read A4 thoroughly, twice, this was my only real concern > (which contrasted with an overall sense of "wow, this is so cool"). > > --me Yes, very, very cool. I especially liked how RFC 88 was "accepted wi

Re: A question

2002-01-18 Thread Glenn Linderman
That particular example is flawed, because the try expression is turned into a try statement because the } stands alone on its line. But if you eliminate a couple newlines between } and for, then your question makes sense (but the code is not well structured, but hey, maybe you take out all the n

Re: on parrot strings

2002-01-18 Thread Jarkko Hietaniemi
On Fri, Jan 18, 2002 at 11:40:17PM +, Nicholas Clark wrote: > On Fri, Jan 18, 2002 at 05:24:00PM +0200, Jarkko Hietaniemi wrote: > > > > As for character encodings, we're forcing everything to UTF-32 in > > > regular expressions. No exceptions. If you use a string in a regex, > > > it'll be

A question

2002-01-18 Thread Piers Cawley
Okay boys and girls, what does this print: my @aaa = qw/1 2 3/; my @bbb = @aaa; try { print "$_\n"; } for @aaa; @bbb -> my $a; my $b { print "$a:$b"; } I'm guessing one of: 1:1 2:2 3:3 or a syntax error, complaining about something near C<@bbb -> my $a ; my $b {> In other words, how

Re: Apo4: PRE, POST

2002-01-18 Thread Piers Cawley
"Me" <[EMAIL PROTECTED]> writes: >> [concerns over conflation of post-processing and post-assertions] > > Having read A4 thoroughly, twice, this was my only real concern > (which contrasted with an overall sense of "wow, this is so cool"). I think that people have sort of got used to the fact th

Re: on parrot strings

2002-01-18 Thread Nicholas Clark
On Fri, Jan 18, 2002 at 05:24:00PM +0200, Jarkko Hietaniemi wrote: > > As for character encodings, we're forcing everything to UTF-32 in > > regular expressions. No exceptions. If you use a string in a regex, > > it'll be transcoded. I honestly can't think of a better way to > > guarantee effi

Benchmarking regexps against perl5

2002-01-18 Thread Nicholas Clark
A thought occurred to me a few days ago: If I remember correctly, attempts to benchmark parrot's developing regular expressions against perl's regular expressions are proving "disappointing". However, perl5 has the advantage of a regular expression optimiser as I understand it, or at least code t

Re: Does this mean we get Ruby/CLU-style iterators?

2002-01-18 Thread Piers Cawley
Dan Sugalski <[EMAIL PROTECTED]> writes: > At 3:37 PM + 1/18/02, Piers Cawley wrote: >>Michael G Schwern <[EMAIL PROTECTED]> writes: >> >>Hmm... making up some syntax on the fly. I sort of like the idea of >>being able to do >> >> class File; >> sub foreach ($file, &block) is Control {

Re: [OBNOXIOUS PATCH] docs/running.pod [APPLIED]

2002-01-18 Thread Dan Sugalski
At 9:30 AM -0800 1/15/02, Steve Fink wrote: >This patch add docs/running.pod, which lists the various executables >Parrot currently includes, examples of running them, and mentions of >where they fail to work. It's more of a cry for help than a useful >reference. :-) I've been having trouble recen

Re: [PATCH] gcc -ansi -pedantic unrealistically strict [APPLIED]

2002-01-18 Thread Dan Sugalski
At 12:51 PM -0500 1/15/02, Andy Dougherty wrote: >I think the optimal fix here is simply to remove -ansi -pedantic. >-ansi may well have some uses, but even the gcc man pages say >"There is no reason to use this option [-pedantic]; it exists only >to satisfy pedants." Applied. thanks. (Though I h

Re: on parrot strings

2002-01-18 Thread Jarkko Hietaniemi
> > > We *do* want to have (with some notation) > > > [[:digit:]\p{FunkyLooking}aeiou except 7], right? > > > > Of course. But that is all resolvable in regex compile time. > > No expression tree needed. > > My point was that if inversion lists are insufficient for describing > all the characte

Re: Apo4: PRE, POST

2002-01-18 Thread Me
> [concerns over conflation of post-processing and post-assertions] Having read A4 thoroughly, twice, this was my only real concern (which contrasted with an overall sense of "wow, this is so cool"). --me

Re: on parrot strings

2002-01-18 Thread Steve Fink
On Sat, Jan 19, 2002 at 12:28:15AM +0200, Jarkko Hietaniemi wrote: > On Fri, Jan 18, 2002 at 02:22:49PM -0800, Steve Fink wrote: > > On Sat, Jan 19, 2002 at 12:11:06AM +0200, Jarkko Hietaniemi wrote: > > > Complement of an inversion list is neat: insert 0 at the beginning > > > (and append max+1),

Re: on parrot strings

2002-01-18 Thread Jarkko Hietaniemi
On Fri, Jan 18, 2002 at 02:22:49PM -0800, Steve Fink wrote: > On Sat, Jan 19, 2002 at 12:11:06AM +0200, Jarkko Hietaniemi wrote: > > Complement of an inversion list is neat: insert 0 at the beginning > > (and append max+1), unless there already is one, in which case delete > > the 0 (and shift the

Re: on parrot strings

2002-01-18 Thread Steve Fink
On Sat, Jan 19, 2002 at 12:11:06AM +0200, Jarkko Hietaniemi wrote: > Complement of an inversion list is neat: insert 0 at the beginning > (and append max+1), unless there already is one, in which case delete > the 0 (and shift the list and delete the max+1). Again, O(N). > (One could of course h
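The complement trick quoted above can be sketched as follows. The representation is an assumption made for the example: a sorted Python list of boundaries `[start0, end0, start1, end1, ...]` with half-open ranges, and `limit` playing the role of "max+1" (0x110000 for Unicode). The sketch treats the front and the back of the list independently, which is a slight generalization of the wording in the message.

```python
UNICODE_LIMIT = 0x110000  # one past the largest code point, i.e. "max+1"


def complement(inv, limit=UNICODE_LIMIT):
    """Return the complement of an inversion list over [0, limit)."""
    result = list(inv)
    # Front: a leading 0 means the set started at 0, so the complement does not.
    if result and result[0] == 0:
        result.pop(0)
    else:
        result.insert(0, 0)
    # Back: a trailing limit means the set ran to the end, so the complement does not.
    if result and result[-1] == limit:
        result.pop()
    else:
        result.append(limit)
    return result
```

Copying the list and shifting it at the front are both linear in the number of boundaries, which matches the O(N) cost mentioned in the message.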

Re: Does this mean we get Ruby/CLU-style iterators?

2002-01-18 Thread Dan Sugalski
At 3:37 PM + 1/18/02, Piers Cawley wrote: >Michael G Schwern <[EMAIL PROTECTED]> writes: > >Hmm... making up some syntax on the fly. I sort of like the idea of >being able to do > > class File; > sub foreach ($file, &block) is Control { > # 'is Control' declares this as a contr

Re: on parrot strings

2002-01-18 Thread Jarkko Hietaniemi
On Fri, Jan 18, 2002 at 01:40:26PM -0800, Steve Fink wrote: > On Fri, Jan 18, 2002 at 10:08:40PM +0200, Jarkko Hietaniemi wrote: > > ints, or 176 bytes. Searching for membership in an inversion list is > > O(N log N) (binary search). "Encoding the whole range" is a non-issue > > bordering on a jo

RE: Apo4: PRE, POST

2002-01-18 Thread Garrett Goebel
From: David Whipp [mailto:[EMAIL PROTECTED]] > > Apo4, when introducing POST, mentions that there is a > corresponding "PRE" block "for design-by-contract > programmers". > > However, I see the POST block being used as a finalize; > and thus allowing (encouraging?) it to have side effects. It m

Re: on parrot strings

2002-01-18 Thread Jarkko Hietaniemi
On Fri, Jan 18, 2002 at 01:40:26PM -0800, Steve Fink wrote: > On Fri, Jan 18, 2002 at 10:08:40PM +0200, Jarkko Hietaniemi wrote: > > ints, or 176 bytes. Searching for membership in an inversion list is > > O(N log N) (binary search). "Encoding the whole range" is a non-issue > > bordering on a jo

Re: on parrot strings

2002-01-18 Thread Steve Fink
On Fri, Jan 18, 2002 at 10:08:40PM +0200, Jarkko Hietaniemi wrote: > ints, or 176 bytes. Searching for membership in an inversion list is > O(N log N) (binary search). "Encoding the whole range" is a non-issue > bordering on a joke: two ints, or 8 bytes. [Clarification from a noncombatant] You m
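A minimal sketch of the membership test under discussion, assuming an inversion list stored as a sorted Python list of boundaries `[start0, end0, start1, end1, ...]` with half-open ranges (the function name `contains` is made up for the example). A binary search over N boundaries needs roughly log2(N) comparisons per lookup.

```python
from bisect import bisect_right


def contains(inv, code_point):
    """True if code_point falls inside one of the half-open ranges in inv."""
    # An odd number of boundaries at or below code_point means we are
    # currently "inside" a range; an even number means "outside".
    return bisect_right(inv, code_point) % 2 == 1


DIGITS_AND_LOWER = [0x30, 0x3A, 0x61, 0x7B]  # [0-9a-z] as boundaries
WHOLE_RANGE = [0, 0x110000]                  # every code point: two entries
```

The whole-range case illustrates the "two ints" remark: the list holds only the two boundaries 0 and max+1.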

Apo4: PRE, POST

2002-01-18 Thread David Whipp
Apo4, when introducing POST, mentions that there is a corresponding "PRE" block "for design-by-contract programmers". However, I see the POST block being used as a finalize; and thus allowing (encouraging?) it to have side effects. I can't help feeling that contract/assertion checking should not

Re: Ex4, Apo5, when ?

2002-01-18 Thread Dan Sugalski
At 4:17 PM -0500 1/18/02, Michael G Schwern wrote: >On Fri, Jan 18, 2002 at 03:35:59PM -0500, Dan Sugalski wrote: >> At 10:16 AM +0200 1/18/02, raptor wrote: >> >Did u passed "Bermuda Triangle" :") >> >> It may be a bit before Ex4 is done. Damian's on a cruise ship at the >> moment, so even if

Re: Ex4, Apo5, when ?

2002-01-18 Thread Michael G Schwern
On Fri, Jan 18, 2002 at 03:35:59PM -0500, Dan Sugalski wrote: > At 10:16 AM +0200 1/18/02, raptor wrote: > >Did u passed "Bermuda Triangle" :") > > It may be a bit before Ex4 is done. Damian's on a cruise ship at the > moment, so even if he's got the time (and I don't think he does) he's > like

Re: Ex4, Apo5, when ?

2002-01-18 Thread Dan Sugalski
At 10:16 AM +0200 1/18/02, raptor wrote: >Did u passed "Bermuda Triangle" :") It may be a bit before Ex4 is done. Damian's on a cruise ship at the moment, so even if he's got the time (and I don't think he does) he's likely lacking connectivity. I expect he'll give us word at some point what t

Re: on parrot strings

2002-01-18 Thread Jarkko Hietaniemi
On Fri, Jan 18, 2002 at 12:20:53PM -0800, Hong Zhang wrote: > > > My proposal is we should use mix method. The Unicode standard class, > > > such as \p{IsLu}, can be handled by a standard splitbin table. Please > > > see Java java.lang.Character or Python unicodedata_db.h. I did > > > measurement

RE: on parrot strings

2002-01-18 Thread Hong Zhang
> > My proposal is we should use mix method. The Unicode standard class, > > such as \p{IsLu}, can be handled by a standard splitbin table. Please > > see Java java.lang.Character or Python unicodedata_db.h. I did > > measurement on it, to handle all unicode category, simple casing, > > and decim

Ex4, Apo5, when ?

2002-01-18 Thread raptor
Did u passed "Bermuda Triangle" :") raptor

RE: on parrot strings

2002-01-18 Thread Hong Zhang
> > preprocessing. Another example, if I want to search for /resume/e, > > (equivalent matching), the regex engine can normalize the case, fully > > decompose input string, strip off any combining character, and do 8-bit > > Hmmm. The above sounds complicated not quite what I had in mind > for

Re: on parrot strings

2002-01-18 Thread Jarkko Hietaniemi
On Fri, Jan 18, 2002 at 11:44:00AM -0800, Hong Zhang wrote: > > (1) There are 5.125 bytes in Unicode, not four. > > (2) I think the above would suffer from the same problem as one common > > suggestion, two-level bitmaps (though I think the above would suffer > > less, being of finer granu

Re: on parrot strings

2002-01-18 Thread Jarkko Hietaniemi
> I don't think UTF-32 will save you much. The unicode case map is variable > length, combining character, canonical equivalence, and many other thing > will require variable length mapping. For example, if I only want to This is true. > parse /[0-9]+/, why you want to convert everything to UTF-

RE: on parrot strings

2002-01-18 Thread Hong Zhang
> (1) There are 5.125 bytes in Unicode, not four. > (2) I think the above would suffer from the same problem as one common > suggestion, two-level bitmaps (though I think the above would suffer > less, being of finer granularity): the problem is that a lot of > space is wasted, since t

Re: Apoc 4?

2002-01-18 Thread Dan Sugalski
>Michael G Schwern wrote: > >> Reading this in Apoc 4 ... > >I looked on http://dev.perl.org/perl6/apocalypse/: no sign of Apoc4. Where >do I find this latest installment? www.perl.com. dev.perl.org must just not have a link yet. -- Dan ---

Re: Apoc 4?

2002-01-18 Thread Will Coleda
http://www.perl.com/pub/a/2002/01/15/apo4.html David Whipp wrote: > > Michael G Schwern wrote: > > > Reading this in Apoc 4 ... > > I looked on http://dev.perl.org/perl6/apocalypse/: no sign of Apoc4. Where > do I find this latest installment? > > Dave.

Apoc 4?

2002-01-18 Thread David Whipp
Michael G Schwern wrote: > Reading this in Apoc 4 ... I looked on http://dev.perl.org/perl6/apocalypse/: no sign of Apoc4. Where do I find this latest installment? Dave.

Re: Does this mean we get Ruby/CLU-style iterators?

2002-01-18 Thread Piers Cawley
Michael G Schwern <[EMAIL PROTECTED]> writes: > Reading this in Apoc 4 > > sub mywhile ($keyword, &condition, &block) { > my $l = $keyword.label; > while (&condition()) { > &block(); > CATCH { > my $t = $!.tag; > when X::

Re: on parrot strings

2002-01-18 Thread Jarkko Hietaniemi
> Since I seem to be the main regex hacker for Parrot, I'll respond to > this as best I can. > > Currently, we are using bitmaps for character classes. Well, sort of. > A Bitmap in Parrot is defined like this: > > typedef struct bitmap_t { > char* bmp; >

Re: on parrot strings

2002-01-18 Thread Jarkko Hietaniemi
On Fri, Jan 18, 2002 at 04:51:07AM -0500, Bryan C. Warnock wrote: > Thanks, Jarkko. > > On Thursday 17 January 2002 23:21, Jarkko Hietaniemi wrote: > > The most important message is that give up on 8-bit bytes, already. > > Time to move on, chop chop. > > Do you think/feel/wish/demand that the t

Re: on parrot strings

2002-01-18 Thread Bryan C. Warnock
Thanks, Jarkko. On Thursday 17 January 2002 23:21, Jarkko Hietaniemi wrote: > The most important message is that give up on 8-bit bytes, already. > Time to move on, chop chop. Do you think/feel/wish/demand that the textual (string) APIs should differ from the binary (byte) APIs? (Both from an

Does this mean we get Ruby/CLU-style iterators?

2002-01-18 Thread Michael G Schwern
Reading this in Apoc 4 sub mywhile ($keyword, &condition, &block) { my $l = $keyword.label; while (&condition()) { &block(); CATCH { my $t = $!.tag; when X::Control::next { die if $t && $t ne $l; next } w

RE: on parrot strings

2002-01-18 Thread Brent Dax
Jarkko Hietaniemi: About the implementation of character classes: since the Unicode code point range is big, a single big bitmap won't work any more: firstly, it would be big. Secondly, for most cases, it would be wastefully sparse. A balanced binary tree of (begin, end) points of ranges is sug