Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-05 Thread Larry Wall
Um.  Maybe it was just bad writing on my part, but it does not seem
to me that what I already said about RFC 93 in A5 has sunk in at all.

Larry


Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-04 Thread Luke Palmer
  use Permutations permutations compositions;
  
  # Generate all strings of length $n
  method Rule::Group::generate(Int $n) {  # Type sprinkles :)
      compositions($n, +@.atoms) ==> map {
          my @rets = map {
              $^atom.generate($^n)
          } zip(@.atoms, $_);
          *permutations(*@rets)
      }
  }

Oops, I didn't do it right.  That last line should be:

*(map { join '', $_ } permutations(*@rets))

instead of the plain *permutations(*@rets).

At first I thought there was a problem if a subpattern returned the
null list, indicating a non-match.  But then, if permutations is
strictly mathematical, it will return the null list if any of its
arguments are the null list anyway.

Plus the joining bit, so we get strings back instead of arrays.
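
The null-list point can be checked with a small Python sketch. It assumes
that "permutations" here really behaves like a Cartesian product, so
itertools.product stands in for it; the function name and the sample lists
are made up for illustration.

```python
from itertools import product

def joined_product(*choices):
    """Cartesian product of the argument lists, each tuple joined into one string."""
    return ["".join(combo) for combo in product(*choices)]

# Every sub-pattern produced at least one string.
matches = joined_product(["A"], ["BB"], ["C", "Z"])

# One sub-pattern produced the null list, so the whole product is empty.
no_match = joined_product(["A"], [], ["C", "Z"])
```

With a strictly mathematical product, no special case is needed for a
non-matching sub-pattern: an empty factor empties the result on its own.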

 Pardon my ignorance, here:   
 
 For the example:  /(A*B*(C*|Z+))/.generate(4)
 
  compositions($n, +@.atoms) ==> map {
 
 This +@.atoms is supposed to compute the length of the @.atoms
 member-array, right? So that you can call compositions($a,$b) with
 two numbers?

That was my intent.  If compositions imposes numeric context, the +
isn't necessary, but I think it's good for clarity anyway.

 compositions(4,3) = [
   [0, 0, 4],
   [0, 1, 3],
   [0, 2, 2],
:
   [4, 0, 0]
 ]
 
 right?

Yep.
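
For readers who want to see that table produced, here is a hedged Python
sketch of a compositions function. It is not the module Luke refers to,
only one way to get all lists of a given length with a given sum, in the
order shown above.

```python
def compositions(total, parts):
    """Yield every tuple of `parts` non-negative integers that sums to `total`."""
    if parts == 0:
        # Only the empty tuple sums to zero; nothing sums to a positive total.
        if total == 0:
            yield ()
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

table = list(compositions(4, 3))
```

The table starts at (0, 0, 4) and ends at (4, 0, 0), with 15 entries in
between for the 4-into-3 case.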

 And then map passes each of the 3-tuples, above, to the block as $_.
 
  my @rets = map { 
      $^atom.generate($^n)
  } zip(@.atoms, $_);
  *permutations(*@rets)

Yes.  (I suppose I should have parameterized it with -> $composition,
or somesuch (or better yet, -> @composition)).

 And the block is going to take the @.atoms array, which has (in my
 example) three members: A*, B*, and (C*|Z+); and the 3-tuple, and pair
 each @.atom with a number from the 3-tuple in $_ via the zip function.
 
 A* = 1, B* = 2, (C*|Z+) = 1,

Yes.  (I loathe the name zip, FWIW.  dead_cat would have been more
descriptive... :)

 And map will run A*.generate(1), B*.generate(2), (C*|Z+).generate(1)
 because the ^atom and ^n are alphabetically the first and second
 arguments. (Note: I had thought that currying was limited to one-char
 names, but I can't find anything right now about it...I like multichar
 names better [as in this example], but it's going to take a lot of
 practice to learn to watch out for that '^'.)
 
 And then presumably the results would be:
 
 A* = [ 'A' ]  # Array of one, for compatibility (below)
 B* = [ 'BB' ] # ditto
 (C*|Z+) = [ 'C', 'Z' ]

[...]

 Continuing:
 
 my @rets = [ ['A'], ['BB'], ['C', 'Z'] ]
 
 When you say 
 
  *permutations(*@rets)
 
 The *@rets is supposed to flatten the array once, right? So that the
 effect was as if you had stripped off the outer [ ... ] above.
 
 And permutations presumably iterates over each possible setting for
 each independent argument.

That was my intention.  I don't think this is strictly a
permutations operation, but I couldn't think of what it was.

 sub permutations([EMAIL PROTECTED], [EMAIL PROTECTED])
 {
   @arg1 ==> map -> $arg {
     map { $arg, *$_ } permutations(@rest);
   }; # <-- Is the semicolon needed?
 }

Semicolon isn't needed.  It's a separator, not a terminator.

That is a rockin' definition of permutations. :)
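
The same recursive idea can be sketched in Python. This is an illustration
rather than a translation: the function name is hypothetical, and the
explicit base case (an empty argument list yields one empty pick) is an
addition that the quoted definition does not spell out.

```python
def choices_product(*lists):
    """Every way of picking one element from each list, in argument order."""
    if not lists:
        return [[]]  # base case: exactly one way to pick nothing
    first, *rest = lists
    # Prepend each element of the first list to every pick from the rest.
    return [[item] + tail for item in first for tail in choices_product(*rest)]

result = choices_product(["A"], ["BB"], ["C", "Z"])
```

This is the operation usually called a Cartesian product, which may be the
name Luke was looking for.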

 Which should generate
 [   ['A', 'BB', 'C', ]   ,['A', 'BB', 'Z', ]  ]
 
 and then strip off the [ , ] because of the * in *permutations,
 yielding the list.
 
 The final strippage is because the *permutations() is making up only a
 single component of the product of the map output, right? In order to
 get
 
 [ [''], ..., [ 'A', 'BB', 'C', ], ['A', 'BB', 'Z'], ... ]
 
 you flatten once to glue them together?

Umm, the new code does it better.

[...]
 
 =Austin


Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-04 Thread Edward Peschko
On Thu, Apr 03, 2003 at 07:30:10AM -0700, Luke Palmer wrote:
  just an aside, and a bit off-topic, but has anybody considered
  hijacking the regular expression engine in perl6 and turning it into
  its opposite, namely making *productions* of strings/sounds/whatever
  that could possibly match the regular expression? ie:
  
  a*
  
  producing
  
  ''
  a
  aa
  aaa
  
  
  etc.
  
  I could think of lots of uses for this:
  
  1) test data for databases and programs.
  2) input for genetic algorithms/code generators
  3) semantic rules/production of random sentences.
  
  In fact, I foresee a combo of 2,3 and some expert system somewhere
  producing the first sentient perl program. ;-)
 
 Yeah, it seems like a neat idea.  It is if you generate it
 right... but, fact is, you probably won't.  For anything that's more
 complex than your /a*/ example, it breaks down (well, mostly):
 
 /\w+: \d+/
 
 Would most likely generate:
 
 a: 0
 a: 00
 a: 000
 a: 
 
 Or:
 
 a: 0
 a: 1
 ...
 a: 9
 a: 00
 a: 01
 
 ad infinitum, never getting to even aa: .*

But that's the point - I don't want it to be just able to generate all possibilities,
I want it to be able to generate a subset of valid possibilities. And have:

a) a default heuristic for doing so, based on a regex
b) user defined heuristics for doing so

Although I disagree with you on the idea that it has no uses as is  - generating all
possible combinations. You could do:

my @list is Regex::Generator(/([1-6])([1-6^\1])([1-6^\1\2])/)

to return a list of all combinations of numbers between 1 and 6 and:

my @words = qw( word list number one );
my @words2 = qw( word list number two );

my @list is Regex::Generator(/ (@words) (@words2) /);

to generate all possible combinations of words. You could also test hard to understand
rexen by simplifying and generating all possible combinations:

my $_doublestring = q$(?:\(?[^\\\]+|\\\.)*\)$; 

becomes

my $_doublestring = q$(?:\(?[notdq]+|\\\)*\)$; 

to generate:


n
o
t
...
\

 
 But I guess then you'd see a lot more quantifiers and such.
 
 /\w+8: \d4/

or substituting \w for something more manageable like [a-f] and \d for [1-2].

 Is finite (albeit there are 63**8 * 10**4 == 2,481,557,802,675,210,000
 combinations).  References to the heat death of the universe, anyone?
 
 And then there's Unicode. %-/

 In reality, I don't think it would be that useful.  Theoretically,
 though, you *can* look inside the regex parse tree and create a
 generator out of it... so, some module, somewhere.

Of course, it would need a little elbow grease to be truly useful. The syntax for
making heuristics in generating useful productions would take some work. But I can
think of a dozen uses for it.

Ex: Right now, I'm writing a generator to generate sample programming problems for a
book I'm writing. It spits out both the problem and the code to answer the problem.
Using a production engine like the one above, this problem generator becomes 20
lines of code.
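
The two finite examples earlier in this message can be sketched in Python
without a regex engine. This assumes the back-reference pattern is meant
to pick three different digits from 1 to 6, and that the word-list pattern
pairs every word of the first list with every word of the second; the
space between the paired words is only there for readability.

```python
from itertools import permutations, product

# Three different digits drawn from 1..6, in every order.
dice = ["".join(map(str, p)) for p in permutations(range(1, 7), 3)]

words = ["word", "list", "number", "one"]
words2 = ["word", "list", "number", "two"]

# Every word of the first list paired with every word of the second.
pairs = [f"{a} {b}" for a, b in product(words, words2)]
```

That gives 6 * 5 * 4 = 120 digit strings and 4 * 4 = 16 word pairs.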

Ed




Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-04 Thread Luke Palmer
 Luke Palmer [EMAIL PROTECTED] writes:
 
  On Thu, Apr 03, 2003 at 07:29:37AM -0800, Austin Hastings wrote:
  This has been alluded to before.
  
  What would /A*B*/ produce?
  
  Because if you were just processing the rex, I think you'd have to
  finish generating all possibilities of A* before you began iterating
  over B*...
  
  The proper way would be to first produce all possibilities of length n 
  before giving any possibility of length n+1.
  
  ''
  'A'
  'B'
  'AA'
  'AB'
  'BB'
  'AAA'
  'AAB'
  ...
  
  I haven't spent a millisecond of working out whether that's feasible to 
  implement, but from a theoretical POV it seems like the solution.
 
  Well, I'm not certain there is really a proper way.  But sure, your
  way is doable.
 
  use Permutations permutations compositions;
 
  # Generate all strings of length $n
  method Rule::Group::generate(Int $n) {  # Type sprinkles :)
      compositions($n, +@.atoms) ==> map {
          my @rets = map { 
              $^atom.generate($^n)
          } zip(@.atoms, $_);
          *permutations(*@rets)
      }
  }
 
  How's that for A4 and A6 in a nutshell, implementing an A5 concept? :)
  I hope I got it right
 
 For bonus points:
 
 method Rule::Group::generate_all($self:) {
     for 1 .. Inf -> $i {
         yield $_ for $self.generate($i);
     }
 }
 
 Hmm... I wonder if there's a way of making the basic 'generate' method
 lazy too.

Well, it might be already, based on some ideas that have come up here
before (for instance, map being lazy and generating a lazy list
when it can).  If compositions is lazy, then generate would be too.

Or, I want it to work that way, at least.  :)
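
As an illustration of the lazy behaviour being hoped for, here is a Python
sketch using generators. The toy generate function is a hypothetical
stand-in (all bit strings of a given length), not anything from the rule
engine; like the quoted loop, generate_all starts at length 1.

```python
from itertools import count, islice, product

def generate_all(generate):
    """Lazily chain generate(1), generate(2), ... without building a full list."""
    for length in count(1):
        yield from generate(length)

def toy_generate(length):
    # Hypothetical per-length generator: every string of 0s and 1s of that length.
    return ("".join(bits) for bits in product("01", repeat=length))

# Only as much of the infinite sequence is computed as is actually consumed.
first_five = list(islice(generate_all(toy_generate), 5))
```

Because every stage is a generator, asking for five results never touches
lengths beyond 2.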

Luke


Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-04 Thread arcadi shehter
Yary Hluchan writes:
  a = arcadi shehter [EMAIL PROTECTED]
  a> I think this was already discussed once and then it was proposed to
  a> attach a property to characters of the string
  a>
  a> sub peek_at_sky {
  a>
  a> my Color  @numbers = peek_with_some_hardware;
  a>
  a> my $say_it =  join map { 1 but color($_) } @numbers ;
  a> return $say_it ;
  a> }
  a>
  a>
  a>   rule color { (.) { let $0 := $1.color } }
  a>
  a>   $daylight = peek_at_sky =~ /color+/; # is something in sky
  
  That works and isn't too bad, a quick fix with some interesting
  possibilities. Should be an example in the documentation. Still,
  the RFC that opened this discussion opens a different way-
  
  http://dev.perl.org/rfc/93.html
  http://www.perl.com/pub/a/2002/06/04/apo5.html?page=17#rfc%20093:%20regex:%20support%20for%20incremental%20pattern%20matching
  
  Once a user-defined sub can hand characters to rexen- it could hand
  anything over (floats, refs, etcetera).  It's an opportunity ripe
  for exploitation.
  


Sorry, it was proposed to be like this:

 sub peek_at_sky {

 my Color  @numbers = peek_with_some_hardware;
 my Str    @words = map { 1 but color($_) } @numbers;

 my $say_it is from( @words )  ;

 return $say_it ;
 }

   rule color { (.) { let $0 := $1.color } }
   $daylight = peek_at_sky =~ /color+/; # is something in sky




Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-04 Thread Joseph F. Ryan
Luke Palmer wrote:

On Thu, Apr 03, 2003 at 07:29:37AM -0800, Austin Hastings wrote:
   

This has been alluded to before.

What would /A*B*/ produce?

Because if you were just processing the rex, I think you'd have to
finish generating all possibilities of A* before you began iterating
over B*...
 

The proper way would be to first produce all possibilities of length n 
before giving any possibility of length n+1.

''
'A'
'B'
'AA'
'AB'
'BB'
'AAA'
'AAB'
...
I haven't spent a millisecond of working out whether that's feasible to 
implement, but from a theoretical POV it seems like the solution.
   

Well, I'm not certain there is really a proper way.  But sure, your
way is doable.
   use Permutations permutations compositions;

   # Generate all strings of length $n
   method Rule::Group::generate(Int $n) {  # Type sprinkles :)
       compositions($n, +@.atoms) ==> map {
           my @rets = map { 
               $^atom.generate($^n)
           } zip(@.atoms, $_);
           *permutations(*@rets)
       }
   }

How's that for A4 and A6 in a nutshell, implementing an A5 concept? :)
I hope I got it right
Provided each other kind of rx element implemented generate, that
returned all generated strings of length $n, which might be zero.
This would be trivial for most other atoms and ops (I think).
Oh, compositions($a,$b) is a function that returns all lists of length
$b whose elements sum to $a.  Yes, it exists.
I have a couple syntax questions about this if anyone knows the answers:

   $^atom.generate($^n)

I want @rets to be an array of array refs.  Do I have to explicitly
take the reference of that, or does that work by itself?
   zip(@.atoms, $_)

I want the array ref in $_ to be zipped up with @.atoms as if $_ were
a real array.  If this *is* correct, am I allowed to say:
   zip(@.atoms, @$_)

for documentation?

Also, related to the first question:

   *permutations(*@rets)

Does that interpolate the returned list from permutations right into
the map's return, a la Perl5?  Do I need the * ?
As for all of these questions, I think the answers are related.  I think
the general question is "Is implicit flattening needed for perl6
builtins?"  I think that the answer is no, or at least should be no,
because it won't be hard to get the builtins to DWIM thanks to
multimethods.
For instance, an implementation of map might be:

sub *map (code, Array @array) {
   return @array.map(code);
}
sub *map (code, [EMAIL PROTECTED]) {
   my @ret;
   for @rest {
   @ret.push( code.($_) );
   }
   return @ret;
}
So, given an Array/Array Subclass/Reference to one of the two as the
2nd argument to map, map would call the method version of map;
otherwise, the arguments after the code block are flattened and
looped over.
This behavior should be consistent across all of the perl6 builtins.

Joseph F. Ryan


Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-04 Thread Joseph F. Ryan
Joseph F. Ryan wrote:

Luke Palmer wrote:

On Thu, Apr 03, 2003 at 07:29:37AM -0800, Austin Hastings wrote:
  

This has been alluded to before.

What would /A*B*/ produce?

Because if you were just processing the rex, I think you'd have to
finish generating all possibilities of A* before you began iterating
over B*...

The proper way would be to first produce all possibilities of 
length n before giving any possibility of length n+1.

''
'A'
'B'
'AA'
'AB'
'BB'
'AAA'
'AAB'
...
I haven't spent a millisecond of working out whether that's feasible 
to implement, but from a theoretical POV it seems like the solution.
  


Well, I'm not certain there is really a proper way.  But sure, your
way is doable.
   use Permutations permutations compositions;

   # Generate all strings of length $n
   method Rule::Group::generate(Int $n) {  # Type sprinkles :)
       compositions($n, +@.atoms) ==> map {
           my @rets = map {
               $^atom.generate($^n)
           } zip(@.atoms, $_);
           *permutations(*@rets)
       }
   }
How's that for A4 and A6 in a nutshell, implementing an A5 concept? :)
I hope I got it right
Provided each other kind of rx element implemented generate, that
returned all generated strings of length $n, which might be zero.
This would be trivial for most other atoms and ops (I think).
Oh, compositions($a,$b) is a function that returns all lists of length
$b whose elements sum to $a.  Yes, it exists.
I have a couple syntax questions about this if anyone knows the answers:

   $^atom.generate($^n)

I want @rets to be an array of array refs.  Do I have to explicitly
take the reference of that, or does that work by itself?
   zip(@.atoms, $_)

I want the array ref in $_ to be zipped up with @.atoms as if $_ were
a real array.  If this *is* correct, am I allowed to say:
   zip(@.atoms, @$_)

for documentation?

Also, related to the first question:

   *permutations(*@rets)

Does that interpolate the returned list from permutations right into
the map's return, a la Perl5?  Do I need the * ?
As for all of these questions, I think the answers are related.  I think
the general question is "Is implicit flattening needed for perl6
builtins?"  I think that the answer is no, or at least should be no,
because it won't be hard to get the builtins to DWIM thanks to
multimethods.
For instance, an implementation of map might be:

sub *map (code, Array @array) {
   return @array.map(code);
}
sub *map (code, [EMAIL PROTECTED]) {
   my @ret;
   for @rest {
   @ret.push( code.($_) );
   }
   return @ret;
} 


Except that it should be:

multi *map (code, Array @array) {
    return @array.map(code);
}
multi *map (code, *@rest) {
    my @ret;
    for @rest {
        @ret.push( code.($_) );
    }
    return @ret;
}
I swear, my brain must hate me; I always overlook the most obvious 
mistakes. (-:
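
For comparison, the two-candidate idea can be approximated in Python,
which has no built-in multimethods, by dispatching by hand on the shape of
the arguments. The function name apply_map is hypothetical and this is
only a sketch of the behaviour described above, not of how Perl 6 would
resolve the multi.

```python
def apply_map(code, *rest):
    """Dispatch by hand on the shape of the arguments after the code block."""
    if len(rest) == 1 and isinstance(rest[0], (list, tuple)):
        # "Array" candidate: a single array argument is mapped over directly.
        return [code(item) for item in rest[0]]
    # Slurpy candidate: loop over the remaining arguments as a flat list.
    return [code(item) for item in rest]

from_array = apply_map(lambda x: x * 2, [1, 2, 3])
from_list = apply_map(lambda x: x * 2, 1, 2, 3)
```

Either call style gives the same result, which is the DWIM behaviour being
argued for.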

Joseph F. Ryan



Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-04 Thread Joseph F. Ryan
Yary Hluchan wrote:

making *productions* of strings/sounds/whatever that could possibly
match the regular expression?

Correct me if I am wrong, but isn't this the :any switch of apoc 5?
http://www.perl.com/pub/a/2002/06/26/synopsis5.html
Not really, unless the input string is infinite!



Well, that's just in the general-purpose case, right?  That's because
a regex like /a*/ matches:
'w'
'qsdf'
'i bet you didn't wnt this to mtch'
So, you're going to need some sort of controlled input to a regex match
with the :any for it to work right.
Here's my approach to the problem: generate a possible string that
could match every atom in the regex individually, and then generate
matches for the whole regex off of that.  I liked Luke's approach
of stapling methods onto the Rx classes, so I used an approach that
made use of that idea.  I completed each of the needed rules, since
the methods in my example are pretty simple (they probably would be
in Luke's example too, but I just wanted to be sure I wasn't missing
anything).
   use List::Permutations permutations; # Perl 5's name.

   sub generate (rx $spec, Int $limit) {
   my $string = $spec.generate_match (propagate, $limit);
   $string =~ m:any/ ($spec) { yield $1 } /;

   my sub propagate ($atom) {
   given ($atom) {
   when Perl::sv_literal {
   $string ~= $_.literal()
   }
   when Perl::Rx {
   $string ~= .generate_match (propagate, $limit)
   if .isa(generate_match)
   }
   }
   }
   }

   Perl::Rx::Atom::generate_match (p, $limit) {
   return p.($.atom)
   }
   Perl::Rx::Zerowidth::generate_match (p, $limit) {
   return p.($.atom)
   }
   Perl::Rx::Meta::generate_match (p, $limit) {
   return join '', $.possible
   }
   Perl::Rx::Oneof::generate_match (p, $limit) {
   return join '', $.possible
   }
   Perl::Rx::Charclass::generate_match (p, $limit) {
   return join '', $.possible
   }
   Perl::Rx::Sequence::generate_match (p, $limit) {
   my $string;
   $string ~= p.($_) for $.atoms;
   return $string;
   }
   Perl::Rx::Alternation::generate_match (p, $limit) {
   my $string;
   $string ~= p.($_) for $.branches;
   return $string;
   }
   Perl::Rx::Modifier::generate_match (p, $limit) {
   my $string;
   $string ~= p.($_) for $.atoms;
   # is $self ($.) still the topic here?  or is the last
   # member of $.atoms?
   return $self.mod.transform($string);
   }
   Perl::Rx::Modifier::repeat (p, $limit) {
   $string := join '', map { join '', $_ }
   permutations (split //, p.($.atom)) xx ($.max // $limit);
   return $string;
   }
So, given a call like:

   generate (/(A*B*(C*|Z+))/, 4);
  
The $string variable in the 2nd line of generate would become:

   

And the :any switch takes care of the rest. (-:

Joseph F. Ryan


Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-03 Thread Edward Peschko
 What I think you're looking for is the fact that they're not regexes any more. They
 are rexen, but in horrifying-secret-reality, what has happened is that Larry's decided
 to move Fortran out of core, and replace it with yacc.

just an aside, and a bit off-topic, but has anybody considered hijacking the regular
expression engine in perl6 and turning it into its opposite, namely making *productions*
of strings/sounds/whatever that could possibly match the regular expression? ie:

a*

producing

''
a
aa
aaa


etc.

I could think of lots of uses for this:

1) test data for databases and programs.
2) input for genetic algorithms/code generators
3) semantic rules/production of random sentences.

In fact, I foresee a combo of 2,3 and some expert system somewhere producing the first 
sentient perl program. ;-)

Ed

(
ps: As for the 'rexen' concept of matching stuff other than characters, hell, that's a
*wonderful* idea. And if you turned the regex around so that you could (in a meaningful
way) make productions from it for stuff other than characters, you could make random
http requests, music, GUI requests/interactions, database connections and so forth.

It's a code tester's dream...

Now, just got to think of the syntax for it.. how to make it usable.
)


Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-03 Thread Luke Palmer
 just an aside, and a bit off-topic, but has anybody considered
 hijacking the regular expression engine in perl6 and turning it into
 its opposite, namely making *productions* of strings/sounds/whatever
 that could possibly match the regular expression? ie:
 
 a*
 
 producing
 
 ''
 a
 aa
 aaa
 
 
 etc.
 
 I could think of lots of uses for this:
 
   1) test data for databases and programs.
   2) input for genetic algorithms/code generators
   3) semantic rules/production of random sentences.
 
 In fact, I foresee a combo of 2,3 and some expert system somewhere
 producing the first sentient perl program. ;-)

Yeah, it seems like a neat idea.  It is if you generate it
right... but, fact is, you probably won't.  For anything that's more
complex than your /a*/ example, it breaks down (well, mostly):

/\w+: \d+/

Would most likely generate:

a: 0
a: 00
a: 000
a: 

Or:

a: 0
a: 1
...
a: 9
a: 00
a: 01

ad infinitum, never getting to even aa: .*

But I guess then you'd see a lot more quantifiers and such.

/\w+8: \d4/

Is finite (albeit there are 63**8 * 10**4 == 2,481,557,802,675,210,000
combinations).  References to the heat death of the universe, anyone?

And then there's Unicode. %-/
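
The count above can be checked with a few lines of Python. This assumes
the pattern means exactly eight \w characters followed by four \d digits,
and that \w is ASCII-only (letters, digits and underscore), which is what
the 63 in the arithmetic implies.

```python
import string

# ASCII-only \w: 26 lowercase + 26 uppercase + 10 digits + underscore.
word_chars = string.ascii_letters + string.digits + "_"

# Eight independent \w positions times four independent \d positions.
total = len(word_chars) ** 8 * len(string.digits) ** 4
```

With Unicode word characters the base would be far larger than 63, so the
figure quoted above is a lower bound for a real engine.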

In reality, I don't think it would be that useful.  Theoretically,
though, you *can* look inside the regex parse tree and create a
generator out of it... so, some module, somewhere.

 Now, just got to think of the syntax for it.. how to make it usable.

That's easy:

use Regex::Generator;

my @list is Regex::Generator(/a*/);
for @list {
    dostuff($_)
}

That or an iterator.
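
To make the iterator option concrete, here is a minimal Python sketch for
the /a*/ case only. It hard-codes the pattern; a real Regex::Generator
would have to derive this from the parse tree, which this does not attempt.

```python
from itertools import count, islice

def a_star():
    """Endless iterator over the strings matched by /a*/, shortest first."""
    for n in count():
        yield "a" * n

# Take only the first few productions from the infinite stream.
first_four = list(islice(a_star(), 4))
```

The caller decides how many productions to consume, so the infinite list
is never materialised.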

Luke


Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-03 Thread Austin Hastings
--- Edward Peschko [EMAIL PROTECTED] wrote:
  What I think you're looking for is the fact that they're not
 regexes any more. They are  rexen, but in
 horrifying-secret-reality, what has happened is that Larry's decided
  to move Fortran out of core, and replace it with yacc.
 
 just an aside, and a bit off-topic, but has anybody considered
 hijacking the regular expression engine in perl6 and turning it into
 its opposite, namely making *productions* of strings/sounds/whatever
 that could possibly match the regular expression? ie:
 
 a*
 
 producing
 
 ''
 a
 aa
 aaa
 
 
 etc.
 

This has been alluded to before.

What would /A*B*/ produce?

Because if you were just processing the rex, I think you'd have to
finish generating all possibilities of A* before you began iterating
over B*...

=Austin

 I could think of lots of uses for this:
 
   1) test data for databases and programs.
   2) input for genetic algorithms/code generators
   3) semantic rules/production of random sentences.
 
 In fact, I foresee a combo of 2,3 and some expert system somewhere
 producing the first 
 sentient perl program. ;-)
 
 Ed
 
 (
 ps: As for the 'rexen' concept of matching stuff other than
 characters, hell, that's a *wonderful* idea. And if you turned the
 regex around so that you could (in a meaningful way) make productions
 from it for stuff other than characters, you could make random http
 requests, music, GUI requests/interactions, database connections
 and so forth.
 
 It's a code tester's dream...
 
 Now, just got to think of the syntax for it.. how to make it usable.
 )



Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-03 Thread Matthijs van Duin
On Thu, Apr 03, 2003 at 07:29:37AM -0800, Austin Hastings wrote:
This has been alluded to before.

What would /A*B*/ produce?

Because if you were just processing the rex, I think you'd have to
finish generating all possibilities of A* before you began iterating
over B*...
The proper way would be to first produce all possibilities of length n 
before giving any possibility of length n+1.

''
'A'
'B'
'AA'
'AB'
'BB'
'AAA'
'AAB'
...
I haven't spent a millisecond of working out whether that's feasible to 
implement, but from a theoretical POV it seems like the solution.
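
The ordering proposed above can be sketched in Python for this one
pattern. The generator is hard-coded for /A*B*/ and is only meant to show
that the length-first order is easy to produce; it says nothing about how
a general engine would do it.

```python
from itertools import count, islice

def a_star_b_star():
    """Yield matches of /A*B*/ by increasing length, all of length n before n+1."""
    for length in count():
        # For a fixed length, move characters from the A run to the B run.
        for a_count in range(length, -1, -1):
            yield "A" * a_count + "B" * (length - a_count)

first_eight = list(islice(a_star_b_star(), 8))
```

Iterating over the total length in the outer loop is what prevents the
generator from getting stuck exhausting A* before it ever reaches B*.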

--
Matthijs van Duin  --  May the Forth be with you!


Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-03 Thread Luke Palmer
 On Thu, Apr 03, 2003 at 07:29:37AM -0800, Austin Hastings wrote:
 This has been alluded to before.
 
 What would /A*B*/ produce?
 
 Because if you were just processing the rex, I think you'd have to
 finish generating all possibilities of A* before you began iterating
 over B*...
 
 The proper way would be to first produce all possibilities of length n 
 before giving any possibility of length n+1.
 
 ''
 'A'
 'B'
 'AA'
 'AB'
 'BB'
 'AAA'
 'AAB'
 ...
 
 I haven't spent a millisecond of working out whether that's feasible to 
 implement, but from a theoretical POV it seems like the solution.

Well, I'm not certain there is really a proper way.  But sure, your
way is doable.

use Permutations permutations compositions;

# Generate all strings of length $n
method Rule::Group::generate(Int $n) {  # Type sprinkles :)
    compositions($n, +@.atoms) ==> map {
        my @rets = map { 
            $^atom.generate($^n)
        } zip(@.atoms, $_);
        *permutations(*@rets)
    }
}

How's that for A4 and A6 in a nutshell, implementing an A5 concept? :)
I hope I got it right

Provided each other kind of rx element implemented generate, that
returned all generated strings of length $n, which might be zero.
This would be trivial for most other atoms and ops (I think).
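
To show how the pieces could fit together, here is a hedged Python sketch
of the whole scheme for the pattern /(A*B*(C*|Z+))/. The class names Star,
Plus, Alt and Group are hypothetical, the composition helper is written
inline, and itertools.product plays the role that "permutations" plays in
the method above.

```python
from itertools import product

def compositions(total, parts):
    """All tuples of `parts` non-negative integers summing to `total`."""
    if parts == 0:
        return [()] if total == 0 else []
    return [(first,) + rest
            for first in range(total + 1)
            for rest in compositions(total - first, parts - 1)]

class Star:
    """X*: exactly one string of each length, including the empty one."""
    def __init__(self, char):
        self.char = char
    def generate(self, n):
        return [self.char * n]

class Plus(Star):
    """X+: like X*, but the empty string does not match."""
    def generate(self, n):
        return [self.char * n] if n > 0 else []

class Alt:
    """a|b: the strings of every branch, in branch order."""
    def __init__(self, *branches):
        self.branches = branches
    def generate(self, n):
        return [s for branch in self.branches for s in branch.generate(n)]

class Group:
    """A sequence of atoms whose lengths must add up to n."""
    def __init__(self, *atoms):
        self.atoms = atoms
    def generate(self, n):
        out = []
        for split in compositions(n, len(self.atoms)):
            # One list of candidate strings per atom, for this split of n.
            rets = [atom.generate(k) for atom, k in zip(self.atoms, split)]
            out.extend("".join(combo) for combo in product(*rets))
        return out

rule = Group(Star("A"), Star("B"), Alt(Star("C"), Plus("Z")))
length_two = rule.generate(2)
```

Each atom only has to answer "which strings of exactly this length do you
match?", which is why the per-atom methods stay small.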

Oh, compositions($a,$b) is a function that returns all lists of length
$b whose elements sum to $a.  Yes, it exists.

I have a couple syntax questions about this if anyone knows the answers:

$^atom.generate($^n)

I want @rets to be an array of array refs.  Do I have to explicitly
take the reference of that, or does that work by itself?

zip(@.atoms, $_)

I want the array ref in $_ to be zipped up with @.atoms as if $_ were
a real array.  If this *is* correct, am I allowed to say:

zip(@.atoms, @$_)

for documentation?

Also, related to the first question:

*permutations(*@rets)

Does that interpolate the returned list from permutations right into
the map's return, a la Perl5?  Do I need the * ?

Wow, 5 unobfuscated friggin lines!  This language is gorgeous!

Luke


Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-03 Thread arcadi shehter
Austin Hastings writes:
  
  On the other hand, let's suppose that you've got a vast array of
  floating point data:
  
  my float @seti = {...evidence of intelligence, somewhere...};
  
  It's a fair question to ask how to retarget the rexengine to use @seti
  as the input stream. (I hereby declare that if anyone ever writes a
  grammar to do stock-picking, I thunk it first! :-)
  
  I'm guessing that the right way is to replace the low-level operators,
  but what are they?
  
rule color { (.) ( $1.isa(Colorific) ) }
$daylight = peek_at_sky =~ /color/; # is something in sky Colorific?
  
  Alternatively, there may be a lower-level stream object that could be
  replaced:
  
  grammar Rainbow
  {
let Rex::get_one := read_float_from_array;
  
# ...
  }
  

I think this was already discussed once and then it was proposed to 
attach a property to characters of the string 

 sub peek_at_sky {
 
 my Color  @numbers = peek_with_some_hardware; 

 my $say_it =  join map { 1 but color($_) } @numbers ; 
 return $say_it ; 
 }


   rule color { (.) { let $0 := $1.color } }   

   $daylight = peek_at_sky =~ /color+/; # is something in sky
 


arcadi 
   


Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-03 Thread Yary Hluchan
a = arcadi shehter [EMAIL PROTECTED]
a> I think this was already discussed once and then it was proposed to
a> attach a property to characters of the string
a>
a> sub peek_at_sky {
a>
a> my Color  @numbers = peek_with_some_hardware;
a>
a> my $say_it =  join map { 1 but color($_) } @numbers ;
a> return $say_it ;
a> }
a>
a>
a>   rule color { (.) { let $0 := $1.color } }
a>
a>   $daylight = peek_at_sky =~ /color+/; # is something in sky

That works and isn't too bad, a quick fix with some interesting
possibilities. Should be an example in the documentation. Still,
the RFC that opened this discussion opens a different way-

http://dev.perl.org/rfc/93.html
http://www.perl.com/pub/a/2002/06/04/apo5.html?page=17#rfc%20093:%20regex:%20support%20for%20incremental%20pattern%20matching

Once a user-defined sub can hand characters to rexen- it could hand
anything over (floats, refs, etcetera).  It's an opportunity ripe
for exploitation.
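
As a rough illustration of matching over something other than characters,
here is a Python sketch of a one-or-more match driven by a predicate. The
function name, the sample readings and the brightness threshold are all
made up; it shows the idea of an alphabet-blind /X+/, not the incremental
matching interface the RFC proposes.

```python
def match_one_or_more(items, predicate, start=0):
    """Length of the longest run from `start` whose items all satisfy
    `predicate` (the analogue of /X+/); 0 means no match."""
    end = start
    while end < len(items) and predicate(items[end]):
        end += 1
    return end - start

# Hypothetical sensor readings standing in for "characters".
sky = [0.91, 0.88, 0.95, 0.12, 0.90]

def is_bright(reading):
    return reading > 0.5

run = match_one_or_more(sky, is_bright)
```

Nothing in the matcher cares what the items are, only whether the
predicate accepts them, which is the sense in which the alphabet is
irrelevant.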

-y

~

The Moon is Waxing Crescent (4% of Full)


Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-03 Thread Austin Hastings
This is a big long post containing essentially me scratching my head at
Luke's code. Since Uri asked yesterday for a tutorial-type explanation
of some of the syntax, and since I wanted to scream and ask the same
thing of Luke today when I first read his 5 unobfuscated friggin
lines, I'm putting it on the list. But chances are you already know
the answers to all the questions I'm asking, and can skip this post
safely.

--- Luke Palmer [EMAIL PROTECTED] wrote:
  On Thu, Apr 03, 2003 at 07:29:37AM -0800, Austin Hastings wrote:
  This has been alluded to before.
  
  What would /A*B*/ produce?
  
  Because if you were just processing the rex, I think you'd have to
  finish generating all possibilities of A* before you began iterating
  over B*...
  
  The proper way would be to first produce all possibilities of length n
  before giving any possibility of length n+1.
  
  ''
  'A'
  'B'
  'AA'
  'AB'
  'BB'
  'AAA'
  'AAB'
  ...
  
  I haven't spent a millisecond of working out whether that's feasible to
  implement, but from a theoretical POV it seems like the solution.
 
 Well, I'm not certain there is really a proper way.  But sure, your
 way is doable.
 
 use Permutations permutations compositions;
 
 # Generate all strings of length $n
 method Rule::Group::generate(Int $n) {  # Type sprinkles :)
     compositions($n, +@.atoms) ==> map {
         my @rets = map { 
             $^atom.generate($^n)
         } zip(@.atoms, $_);
         *permutations(*@rets)
     }
 }
 
 How's that for A4 and A6 in a nutshell, implementing an A5 concept? :)
 I hope I got it right
 
 Provided each other kind of rx element implemented generate, that
 returned all generated strings of length $n, which might be zero.
 This would be trivial for most other atoms and ops (I think).
 
 Oh, compositions($a,$b) is a function that returns all lists of length
 $b whose elements sum to $a.  Yes, it exists.
 

Pardon my ignorance, here:   

For the example:  /(A*B*(C*|Z+))/.generate(4)

 compositions($n, +@.atoms) ==> map {

This +@.atoms is supposed to compute the length of the @.atoms
member-array, right? So that you can call compositions($a,$b) with
two numbers?

compositions(4,3) = [
  [0, 0, 4],
  [0, 1, 3],
  [0, 2, 2],
   :
  [4, 0, 0]
]

right?

And then map passes each of the 3-tuples, above, to the block as $_.

 my @rets = map { 
     $^atom.generate($^n)
 } zip(@.atoms, $_);
 *permutations(*@rets)

And the block is going to take the @.atoms array, which has (in my
example) three members: A*, B*, and (C*|Z+); and the 3-tuple, and pair
each @.atom with a number from the 3-tuple in $_ via the zip function.

A* = 1, B* = 2, (C*|Z+) = 1,

And map will run A*.generate(1), B*.generate(2), (C*|Z+).generate(1)
because the ^atom and ^n are alphabetically the first and second
arguments. (Note: I had thought that currying was limited to one-char
names, but I can't find anything right now about it...I like multichar
names better [as in this example], but it's going to take a lot of
practice to learn to watch out for that '^'.)

And then presumably the results would be:

A* = [ 'A' ]  # Array of one, for compatibility (below)
B* = [ 'BB' ] # ditto
(C*|Z+) = [ 'C', 'Z' ]

 I have a couple syntax questions about this if anyone knows the
 answers:
 
 $^atom.generate($^n)
 
 I want @rets to be an array of array refs.  Do I have to explicitly
 take the reference of that, or does that work by itself?

I get the impression that if you don't flatten or append the results,
you'll get the reference automatically.

Continuing:

my @rets = [ ['A'], ['BB'], ['C', 'Z'] ]

When you say 

 *permutations(*@rets)

The *@rets is supposed to flatten the array once, right? So that the
effect is as if you had stripped off the outer [ ... ] above.

And permutations presumably iterates over each possible setting for
each independent argument.

sub permutations([EMAIL PROTECTED], [EMAIL PROTECTED])
{
  @arg1 ==> map -> $arg {
 map { $arg, *$_ } permutations(@rest);
  }; # -- Is the semicolon needed?
}


Which should generate
[   ['A', 'BB', 'C', ]   ,['A', 'BB', 'Z', ]  ]

and then strip off the [ , ] because of the * in *permutations,
yielding the list.

The final strippage is because the *permutations() is making up only a
single component of the product of the map output, right? In order to
get

[ [''], ..., [ 'A', 'BB', 'C', ], ['A', 'BB', 'Z'], ... ]

you flatten once to glue them together?
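
What "permutations" is doing in this walk-through is picking one item from each argument list in every possible way, which is the Cartesian product. A small Python sketch of that reading, using the standard itertools.product; the sample lists are the ones from my example, and the join step is an assumption about what the caller wants back.

```python
from itertools import product

# One list of candidate strings per atom: A*, B*, (C*|Z+) at lengths 1, 2, 1.
rets = [["A"], ["BB"], ["C", "Z"]]

# product(*rets) takes one element from each list, in every combination.
tuples = list(product(*rets))

# Joining each tuple gives strings instead of lists of pieces.
strings = ["".join(parts) for parts in tuples]

# If any atom has no string of the requested length, the product is empty.
no_match = list(product(["A"], [], ["C", "Z"]))

print(tuples)
print(strings)
print(no_match)
```

So an atom that cannot produce anything at a given length silently removes that whole composition from the result.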

 zip(@.atoms, $_)
 
 I want the array ref in $_ to be zipped up with @.atoms as if $_ were
 a real array.  If this *is* correct, am I allowed to say:
 
 zip(@.atoms, @$_)
 
 for documentation?

That way lies madness. 

 
 Also, related to the first question:
 
 *permutations(*@rets)
 
 Does that interpolate the returned 

Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-03 Thread Joseph F. Ryan
Edward Peschko wrote:

What I think you're looking for is the fact that they're not regexes any more. They are  rexen, but in horrifying-secret-reality, what has happened is that Larry's decided
to move Fortran out of core, and replace it with yacc.
   

just an aside, and a bit off-topic, but has anybody considered hijacking the regular 
expression engine in perl6 and turning it into its opposite, namely making *productions*
of strings/sounds/whatever that could possibly match the regular expression? ie:

a*

producing

''
a
aa
aaa

etc.

Correct me if I am wrong, but isn't this the :any switch of apoc 5?
http://www.perl.com/pub/a/2002/06/26/synopsis5.html
Joseph F. Ryan
[EMAIL PROTECTED]


Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-03 Thread Yary Hluchan
making *productions* of strings/sounds/whatever that could possibly
match the regular expression?

Correct me if I am wrong, but isn't this the :any switch of apoc 5?
http://www.perl.com/pub/a/2002/06/26/synopsis5.html

Not really, unless the input string is infinite!  :any returns all
substrings of a given string that matches, the productions are
all strings that a regexp can match.

-y

~

The Moon is Waxing Crescent (5% of Full)


Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-03 Thread Piers Cawley
Luke Palmer [EMAIL PROTECTED] writes:

 On Thu, Apr 03, 2003 at 07:29:37AM -0800, Austin Hastings wrote:
 This has been alluded to before.
 
 What would /A*B*/ produce?
 
 Because if you were just processing the rex, I think you'd have to
 finish generating all possibilities of A* before you began iterating
 over B*...
 
 The proper way would be to first produce all possibilities of length n 
 before giving any possibility of length n+1.
 
 ''
 'A'
 'B'
 'AA'
 'AB'
 'BB'
 'AAA'
 'AAB'
 ...
 
 I haven't spent a millisecond of working out whether that's feasible to 
 implement, but from a theoretical POV it seems like the solution.

 Well, I'm not certain there is really a proper way.  But sure, your
 way is doable.

 use Permutations permutations compositions;

 # Generate all strings of length $n
 method Rule::Group::generate(Int $n) {  # Type sprinkles :)
     compositions($n, +@.atoms) ==> map {
         my @rets = map { 
             $^atom.generate($^n)
         } zip(@.atoms, $_);
         *permutations(*@rets)
 }
 }

 How's that for A4 and A6 in a nutshell, implementing an A5 concept? :)
 I hope I got it right

For bonus points:

method Rule::Group::generate_all($self:) {
    for 1 .. Inf -> $i {
yield $_ for $self.generate($i);
}
}

Hmm... I wonder if there's a way of making the basic 'generate' method
lazy too.
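
To make the length-ordered idea concrete, here is a rough Python sketch of the whole pipeline for /A*B*/: split the length with compositions, generate each atom at its share, combine, and walk the lengths lazily. Every name here (star, group, generate_all) is made up for illustration, a "rule" is modelled as a plain function from a length to a list of strings, and the sketch starts at length 0 so the empty string shows up.

```python
from itertools import count, islice, product
from typing import Callable, Iterator, List

# Assumption: a rule is a function mapping a length to all strings of that length.
Rule = Callable[[int], List[str]]


def compositions(total: int, parts: int) -> Iterator[List[int]]:
    """Yield every list of `parts` non-negative integers summing to `total`."""
    if parts == 0:
        if total == 0:
            yield []
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield [first] + rest


def star(ch: str) -> Rule:
    """ch* for a single character: exactly one string of each length."""
    return lambda n: [ch * n]


def group(*atoms: Rule) -> Rule:
    """Concatenation of atoms; generate(n) returns the strings of length n."""
    def generate(n: int) -> List[str]:
        found = set()
        for lengths in compositions(n, len(atoms)):
            choices = [atom(k) for atom, k in zip(atoms, lengths)]
            found.update("".join(parts) for parts in product(*choices))
        return sorted(found)  # sorted only to give a stable order
    return generate


def generate_all(rule: Rule) -> Iterator[str]:
    """Lazily yield every string, shortest first."""
    for n in count(0):
        yield from rule(n)


a_star_b_star = group(star("A"), star("B"))
first_eight = list(islice(generate_all(a_star_b_star), 8))
print(first_eight)
```

Only generate_all is lazy here; each per-length list is still built eagerly, so this does not answer the question of a lazy generate.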

-- 
Piers


Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-02 Thread Austin Hastings

--- Yary Hluchan [EMAIL PROTECTED] wrote:
 A couple nights ago I read RFC93 as discussed in Apoc. 5 and got
 fired up- it reminded me of some ideas from when I was hacking
 Henry Spencer's regexp package. How to further generalize regular
 expression input.  It's a bit orthogonal- a properly implemented
 RFC93 makes some difficult things easier- whether it's done as
 binding to a sub, or as overloading =~, or whatever.
 
 A very general description of a regular expression, is a program
 that seeks a match within a string of letters.  In perl4 the string
 of letters was a string of bytes, and in perl6 it's a string of
 Unicode (most of the time).
 
 It might as well be a string of *anythings*.  Binding a match against
 a sub is a natural way to get the anythings you want to match.  Now,
 I'm a newbie to perl6, so be patient with my hacked-up examples below.
 They won't work in any language. And, for the first I tweaked RFC93:
 
   When the match is finished, the subroutine would be called one final
   time, and passed 2 arguments: a flag set to 1, and a list containing
   the unused elements
 
 which I admit is a poor interface- but it lets me write:
 
   # Looking for luck- find a run of 3 numbers divisible by 7 or 13
   # sub numerology is simply an interface to an array of integers
   sub numerology { $#_ ? shift,unshift @::nums,@_ : splice @::nums,0,@_ }
   numerology =~ / ( !($_[0] % 7 and $_[0] % 13) )3 /;
 
grammar Numerology;

rule number { \b \d+ \b }
rule lucky  { number
  { fail unless ($1 % 7 == 0)  ($1 % 13 == 0); }
}

rule lucky_strike  { :3x lucky } ## Is this right?


 True, it's easy to join integers with spaces and write an equivalent
 regexp on the result- but why stringify when you don't have to?
 
 I'm running into trouble here- using ( code ) to match against a
 single atom (a number), it should be more character classy.  
 Assertions are flexible enough to match all sorts of non-letter 
 atoms, can write a grammar to make it more readable- maybe something
 like
   numerology =~ /  divisible(7)divisible(13) 3 /;

Actually, this is a good argument for nested rules, and thereby for
nested subs:

my numerology = rx/
   rule number { \b \d+ \b }
   rule divisible($by) { (number) :: { fail if ($1 % $by); }}
   
   :x3 all(divisible(7), divisible(13))
/;


 Another example.  Let's say there's a class that deals with colors.
 It has an operator that returns true if two colors look about the 
 same. Given a list of color objects, is there a regexp to find a
 rainbow? Even if the color class doesn't support stringification? 

Yes.

grammar Rainbow;

rule color {...};  # this one's on you.

rule same_color($color is Colorific)
{
  color ::: { fail unless $1.looks_like($color); }
}

rule band($color is Colorific)
{
  same_color($color)+
}

rule Rainbow
{
  band(new Color(red))
  band(new Color(orange))
  band(new Color(yellow))
  band(new Color(green))
  band(new Color(blue))
  band(new Color(indigo))
  band(new Color(violet))
  pot_o_gold?
}

 A less fanciful example- scan a sound. A very crude beat-finding
 regexp- 
  fetch_sound_frames =~
   / (   # store soundclip (array of frames) in $1
      (volume(-40db)50,1500) # quietish section, 50-1500 frames
      (volume(-15db)+) # Followed by some loud frame(s)
     )   # End capture of the first beat
 
     before # Make sure the loud/quiet pattern repeats,
      [  # but don't require the exact same frames
       volume(-40db)$2.length*.95,$2.length*1.05 
       volume(-15db)$3.length*.95,$3.length*1.05
      ]{3}
 
   /
 

You're just about there. Only the syntax needs work.

http://dev.perl.org/perl6/exegesis/5

 The point I'm trying to make:
 A regexp is already able to consume different kinds of characters from a
 string- :u0, :u1, :u2, :u3- and with RFC93 it can be fed anything a sub
 can return.  Those things can be characters- or strings- or stringified if
 the regexp requires- but if the regexp doesn't have any strings to match
 against, don't bother. Let the assertions get the atoms raw.
 
 Plenty of brilliance on this list, I know I'm not brilliant, especially
 when drowsy... did some research before posting but if this has been
 covered already (or is completely daft) please face me in the right
 direction and shoo me along gently.

What I think you're looking for is the fact that they're not regexes
any more. They are rexen, but in horrifying-secret-reality, what has
happened is that Larry's decided to move Fortran out of core, and
replace it with yacc. 

It's funny, but I try to describe this to people (gently) and they
immediately fall into three classes:

People who never got it, regex-wise, just kind of screw up their faces
and say Huh?

People (a very small number) who go Oh. Cool! and their eyes light
up.

And finally the majority of coders, who look as though they opened a
door expecting to 

Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-02 Thread Uri Guttman
 AH == Austin Hastings [EMAIL PROTECTED] writes:

  AH grammar Rainbow;

  AH rule color {...};  # this one's on you.

  AH rule same_color($color is Colorific)
  AH {
  AH   color ::: { fail unless $1.looks_like($color); }
  AH }

  AH rule band($color is Colorific)
  AH {
  AH   same_color($color)+
  AH }

  AH rule Rainbow
  AH {
  AH   band(new Color(red))
  AH   band(new Color(orange))
  AH   band(new Color(yellow))
  AH   band(new Color(green))
  AH   band(new Color(blue))
  AH   band(new Color(indigo))
  AH   band(new Color(violet))
  AH   pot_o_gold?
  AH }

for the p6 regex impaired among us, please explain that. it might make a
nice tute for the docs. i get the general picture but i don't follow how
it works regarding the color checking.

uri

-- 
Uri Guttman  --  [EMAIL PROTECTED]   http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs    http://jobs.perl.org


Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-02 Thread Austin Hastings

I'm reordering this post rather than retype stuff. Forgive me.

--- Uri Guttman [EMAIL PROTECTED] wrote:

 for the p6 regex impaired among us, please explain that. it might
 make a nice tute for the docs. i get the general picture but i don't
 follow how it works regarding the color checking.

Of course, the color checking is the part that I messed up. See below.

  AH == Austin Hastings [EMAIL PROTECTED] writes:
 

Disclaimer: I'm *SO* clueless about this stuff...

   AH grammar Rainbow;
 
   AH rule color {...};  # this one's on you.

Yary posited a color class, so I accept that he can recognize them. I
called it Colorific, just because. So my first mistake was probably
failure-to-declare:

grammar Rainbow;
use Colorific;  # Import 'rule color;' and 'new', among others.

What I don't know is how to recognize a color, which is to say I don't
know how to write the color rule -- because I don't know what this is
being applied to. Is this reading pixels, interpreting the results of
radio telescopy, or consuming Lucky Charms breakfast cereal bits? I
don't know, so I'm just going to assume that Yary can write that for me
-- it's his class, after all.

And the Colorific class supposedly has a way to determine if two colors
look about like each other. Again, I don't know how that works, but I
don't need to.

   AH rule same_color($color is Colorific)
   AH {
   AH   color ::: { fail unless $1.looks_like($color); }
   AH }

This is really probably bad code. Maybe a better rule would be:

rule same_color($color is Colorific)
{
  color ::: { fail unless $color.looks_like($1); }
}

I KNOW that $color is an object-of-type-Colorific, while I'm not sure,
frankly, what color is returning. Let Colorific handle that.

Also, please note that the only reason that same_color is a separate
rule is because I haven't learned, yet, how to do that in one amazing
line of P6 code. I suspect that it could have been written all inside
the band rule, if I were smarter.

   AH rule band($color is Colorific)
   AH {
   AH   same_color($color)+
   AH }

Here, I'm just saying that a 'band' of the rainbow is made up of
at-least-one-maybe-more bit of a color that looks like $color. 

So, for a color band of, say, red, I'll take red and require that
there be a bunch of stuff that looks like red, using the same_color
rule (that, in turn, uses the Colorific::looks_like function, which
someone else wrote).

But that's the key: once I know how to recognize a band of color, I can
look up my old ROY G. BIV mnemonic from Astronomy (or was that
electronics? Too many dead brain cells:- I apologize if my rainbow is
actually a RadioShack-bow.)

So I need to declare what a Rainbow looks like. I'll use my band
shortcut to specify the seven different colors.

Note that rexen / rules are declarative, not imperative. The lines are
each pattern-invocations, and there aren't any semicolons. If I want
procedural, I need to open a sub-block to drop into perl command
mode, like I did in the same_color rule, above.

Since there's no alternation(|) or grouping or anything here, these
declarations are straight-line: they all apply, one after the other.

So a Rainbow is recognized when there's a
band-of-red followed immediately by a 
band-of-orange followed immediately by a 
...
band-of-violet. 
Then, if you're lucky, there might be a pot-o-gold at the end. :-)
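
The straight-line reading above can be sketched outside the rule engine. This is a hedged Python illustration, not a rendering of how P6 rules execute: the Colorific class, its hue field, and the tolerance are all invented here, and the matcher is greedy with no backtracking, which a real engine would supply.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Colorific:
    """Hypothetical color object; it has no string form, only a hue in degrees."""
    hue: float

    def looks_like(self, other: "Colorific", tolerance: float = 15.0) -> bool:
        diff = abs(self.hue - other.hue) % 360
        return min(diff, 360 - diff) <= tolerance


def match_band(sky: Sequence[Colorific], pos: int, color: Colorific) -> Optional[int]:
    """same_color($color)+ : consume one or more matching atoms, or fail."""
    end = pos
    while end < len(sky) and color.looks_like(sky[end]):
        end += 1
    return end if end > pos else None


def match_rainbow(sky: Sequence[Colorific], bands: List[Colorific]) -> Optional[int]:
    """Each band must follow the previous one immediately; returns the end position."""
    pos: Optional[int] = 0
    for color in bands:
        pos = match_band(sky, pos, color)
        if pos is None:
            return None
    return pos


# Assumed hues for red, orange, yellow, green, blue, indigo, violet.
bands = [Colorific(h) for h in (0, 30, 60, 120, 240, 275, 300)]
sky = [Colorific(h) for h in (2, 358, 28, 61, 118, 122, 241, 276, 301)]
no_orange = [Colorific(h) for h in (2, 61, 118, 241, 276, 301)]

print(match_rainbow(sky, bands))
print(match_rainbow(no_orange, bands))
```

The atoms never get stringified; the only thing the matcher asks of them is looks_like.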

rule pot_o_gold {
  $lucky := leprechaun   # You *DO* know how to catch a Leprechaun,
   # don't you?
  { 
    if (trick($lucky)) {
       print "Begorrah! I'm rich!";
    } else {
       print "Always after me Lucky Charms ...";
    }
  }
}

}
   AH rule Rainbow
   AH {
   AH   band(new Color(red))
   AH   band(new Color(orange))
   AH   band(new Color(yellow))
   AH   band(new Color(green))
   AH   band(new Color(blue))
   AH   band(new Color(indigo))
   AH   band(new Color(violet))
   AH   pot_o_gold?
   AH }

This really is bad code on my part. s/Color/Colorific/, please. 

rule Rainbow
{
   band(new Colorific(red))
   band(new Colorific(orange))
   band(new Colorific(yellow))
   band(new Colorific(green))
   band(new Colorific(blue))
   band(new Colorific(indigo))
   band(new Colorific(violet))
   pot_o_gold?
}

 




Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-02 Thread Yary Hluchan
Thanks for the thoughtful consideration.  Austin's given some high-
level examples of the kind I was hoping for,

AH = Austin Hastings

AH grammar Rainbow;
AH use Colorific;  # Import 'rule color;' and 'new', among others.
AH 
AH What I don't know is how to recognize a color, which is to say I don't
AH know how to write the color rule -- because I don't know what this is
AH being applied to. Is this reading pixels, interpreting the results of
AH radio telescopy, or consuming Lucky Charms breakfast cereal bits? I
AH don't know, so I'm just going to assume that Yary can write that for me
AH -- it's his class, after all.

Right, encapsulation & public interface are the keys- rexen don't need
to know what makes Colorific. (And yes, I am a he, unlike Yari from Tron.)
I am curious about

AH rule color {...};  # this one's on you.

if Colorific doesn't have stringification- that's the crux: passing
non-letter atoms to the regex engine.  The way it's presented and
used, it's a rule that matches a color object, and seeing it in the
same_color rule is terrific- but (via RFC93?) I want to write it thus:

 rule color { (.) ( $1.isa(Colorific) ) }
 $daylight = peek_at_sky =~ /color/; # is something in sky Colorific?

This example could be written with grep- but then, T(always)MTOWTDI.

Bonus points for the implementation of grammar Rainbow, very cute!
Lucky strike is also clearly written, though, I was hoping to do away
with any mention of \d.  I want to grab numbers as atoms and never
enter the character realm.


AH What I think you're looking for is the fact that they're not regexes
AH any more. They are rexen, but in horrifying-secret-reality, what has
AH happened is that Larry's decided to move Fortran out of core, and
AH replace it with yacc.

Cool, I did quite like yacc when I needed it- and it does look like we
have that expressive power now!  Never used Fortran but I did spend a
couple summers in RPG-2, good riddance to big iron...

-y

~

The Moon is Waxing Crescent (1% of Full)


Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-02 Thread Andrew Wilson
On Wed, Apr 02, 2003 at 10:16:37AM -0800, Austin Hastings wrote:
 And the Colorific class supposedly has a way to determine if two colors
 look about like each other. Again, I don't know how that works, but I
 don't need to.
 
   AH rule same_color($color is Colorific)
   AH {
   AH   color ::: { fail unless $1.looks_like($color); }
   AH }
 
 This is really probably bad code. Maybe a better rule would be:
 
 rule same_color($color is Colorific)
 {
   color ::: { fail unless $color.looks_like($1); }
 }
 
 I KNOW that $color is an object-of-type-Colorific, while I'm not sure,
 frankly, what color is returning. Let Colorific handle that.

It's my understanding (such as it is) of regexen that subrules called
via rule capture their result in hypothetical variables.  In
same_color, by the time you get into the code after the :::, $color
contains what was matched by color. So, if color matched at all, I
don't think you can call looks_like on $color because it's the
hypothetical result of color not a Colorific.  Either that or it fails
because you said it was a Colorific and it's not.  Or you tried to
assign to it but you can't because it's not "is rw".

I think we need a P6 regexen engine to play with to get used to all this
new stuff properly :-)  Oh, and I really, really don't like all this
extraneous type information that everybody seems to be sprinkling around
their Perl6 code.

andrew
-- 
Virgo: (Aug. 23 - Sept. 22)
It seems the danger is over for now, but something tells you that you
haven't seen the last of that dastardly villain.


Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-02 Thread Yary Hluchan
W= Andrew Wilson, AH=Austin Hastings

AH This is really probably bad code. Maybe a better rule would be:
AH
AH rule same_color($color is Colorific)
AH {
AH   color ::: { fail unless $color.looks_like($1); }
AH }
AH
AH I KNOW that $color is an object-of-type-Colorific, while I'm not sure,
AH frankly, what color is returning. Let Colorific handle that.

WIt's my understanding (such as it is) of regexen that subrules called
Wvia rule capture their result in hypothetical variables.  In
Wsame_color, by the time you get into the code after the :::, $color
Wcontains what was matched by color. So, if color matched at all, I
Wdon't think you can call looks_like on $color because it's the
Whypothetical result of color not a Colorific.  Either that or it fails
Wbecause you said it was a Colorific and it's not.  Or you tried to
Wassign to it but you can't because it's not "is rw".

So long as the regexp is grabbing unicode. I posit a modifier:

 rule color { :ref (.) ( $1.isa(Colorific) ) }
 $daylight = peek_at_sky =~ /color/;

where :ref tells the engine that each atom is a reference, not
unicode.  Then what matches is still Colorific.

On-the-side syntax question- what happens to modifiers that take arguments
when they're inside the rule? like from A5 s:myoption($x) /foo/bar/,
can that be written s/:myoption($x) foo/bar/ ?  Wondering what happens
if the ref modifier can take an argument, saying what it's a ref of...

-y

~

The Moon is Waxing Crescent (1% of Full)


Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-02 Thread Austin Hastings

--- Andrew Wilson [EMAIL PROTECTED] wrote:
 On Wed, Apr 02, 2003 at 10:16:37AM -0800, Austin Hastings wrote:
  And the Colorific class supposedly has a way to determine if two
 colors
  look about like each other. Again, I don't know how that works, but
 I
  don't need to.
  
AH rule same_color($color is Colorific)
AH {
AH   color ::: { fail unless $1.looks_like($color); }
AH }
  
  This is really probably bad code. Maybe a better rule would be:
  
  rule same_color($color is Colorific)
  {
color ::: { fail unless $color.looks_like($1); }
  }
  
  I KNOW that $color is an object-of-type-Colorific, while I'm not
 sure,
  frankly, what color is returning. Let Colorific handle that.
 
 It's my understanding (such as it is) of regexen that subrules called
 via rule capture their result in hypothetical variables.  In
 same_color, by the time you get into the code after the :::, $color
 contains what was matched by color. So, if color matched at all,
 I don't think you can call looks_like on $color because it's the
 hypothetical result of color not a Colorific.  Either that or it
 fails because you said it was a Colorific and it's not.  Or you tried
 to assign to it but you can't because it's not "is rw".

More bad code on my part.

I had not intended that there be any correlation between the color
rule result and the $color named argument -- that's just
coincidental evidence of my poor design skills.

If you'll replace $color with $c, and try again?

rule same_color($c is Colorific)
{
  color ::: { fail unless $c.looks_like($1); }
}

The intention here is that the result of color is actually bound
to $1, per E5. So the idea is that Colorific $c is asked to run its
.looks_like method on the color that was just recognized.


 I think we need a P6 regexen engine to play with to get used to all
 this new stuff properly :-)  Oh, and I really, really don't like all 
 this extraneous type information that everybody seems to be 
 sprinkling around their Perl6 code.

I know someone's working on P6::Rexen or some such (NOT regex, per LW).


As for the type info, I think that the compiler won't be able to do too
much with the low level of information that we're specifying here, so
it's mostly codereader hints. But Damian has suggested that using types
at all opens the yawning void of contamination up the call tree, so
your implicit preference for no-type-info may wind up being the norm.

=Austin




Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-02 Thread Joseph F. Ryan
Austin Hastings wrote:

Another example.  Let's say there's a class that deals with colors.
It has an operator that returns true if two colors look about the 
same. Given a list of color objects, is there a regexp to find a
rainbow? Even if the color class doesn't support stringification? 
   

Yes.

grammar Rainbow;

rule color {...};  # this one's on you.

rule same_color($color is Colorific)
{
 color ::: { fail unless $1.looks_like($color); }
}
rule band($color is Colorific)
{
 same_color($color)+
}
rule Rainbow
{
 band(new Color(red))
 band(new Color(orange))
 band(new Color(yellow))
 band(new Color(green))
 band(new Color(blue))
 band(new Color(indigo))
 band(new Color(violet))
 pot_o_gold?
}
 



I'm a bit confused by the same_color rule; specifically, this line:

   $1.looks_like($color)

Shouldn't this be: $color.looks_like($1) ?  Otherwise, it
suggests that you're redefining the match object class, which
probably isn't a good idea.
Joseph F. Ryan
[EMAIL PROTECTED]


Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-02 Thread Austin Hastings

--- Yary Hluchan [EMAIL PROTECTED] wrote:
 Thanks for the thoughtful consideration.  Austin's given some high-
 level examples of the kind I was hoping for,
 
 AH = Austin Hastings
 
 AH grammar Rainbow;
 AH use Colorific;  # Import 'rule color;' and 'new', among others.
 AH 
 AH What I don't know is how to recognize a color, which is to say I don't
 AH know how to write the color rule -- because I don't know what this is
 AH being applied to. Is this reading pixels, interpreting the results of
 AH radio telescopy, or consuming Lucky Charms breakfast cereal bits? I
 AH don't know, so I'm just going to assume that Yary can write that for me
 AH -- it's his class, after all.
 
 Right, encapsulation & public interface are the keys- rexen don't need
 to know what makes Colorific. (And yes, I am a he, unlike Yari from Tron.)
 I am curious about
 
 AH rule color {...};  # this one's on you.
 
 if Colorific doesn't have stringification- that's the crux: passing
 non-letter atoms to the regex engine.  The way it's presented and
 used, it's a rule that matches a color object, and seeing it in the
 same_color rule is terrific- but (via RFC93?) I want to write it thus:
 

This isn't quite meaningful. What does a non-letter atom mean?

If you're processing a file or a string, that's the basic P6 model.

But consider \u for unicode -- that's a multi-byte object in the
stream.  So for streams of bytes, the right way is just to code rule
color such that it recognizes them in whatever form -- stringified or
not.

On the other hand, let's suppose that you've got a vast array of
floating point data:

my float @seti = {...evidence of intelligence, somewhere...};

It's a fair question to ask how to retarget the rexengine to use @seti
as the input stream. (I hereby declare that if anyone ever writes a
grammar to do stock-picking, I thunk it first! :-)

I'm guessing that the right way is to replace the low-level operators,
but what are they?

  rule color { (.) ( $1.isa(Colorific) ) }
  $daylight = peek_at_sky =~ /color/; # is something in sky Colorific?

Alternatively, there may be a lower-level stream object that could be
replaced:

grammar Rainbow
{
  let Rex::get_one := read_float_from_array;

  # ...
}

 This example could be written with grep- but then, T(always)MTOWTDI.

Interesting, and possibly true. If you replaced all the interface-level
rules with code that interacted with some other data structure, it
might work:

rule color {
  fail unless some_condition($_);
  $0 = $_;
}

and then:

grep Rainbow::Rainbow, @sky_data;


But that's just wrong. I'll wager there's a trick we don't know yet
that will allow for processing arbitrary streams of data, no matter
what the source. The only question is whether we have to override
something on the grammar, override something on the regex, or implement
a fake IOstream class to feed to the rexengine.

 
 Bonus points for the implementation of grammar Rainbow, very cute!
 Lucky strike is also clearly written, though, I was hoping to do away
 with any mention of \d.  I want to grab numbers as atoms and never
 enter the character realm.

But what does that mean? Do you want standard patterns so that you
can talk about int patterns and just have them work, or do you want
to change your source from a character stream/string to something
else?

 
 AH What I think you're looking for is the fact that they're not regexes
 AH any more. They are rexen, but in horrifying-secret-reality, what has
 AH happened is that Larry's decided to move Fortran out of core, and
 AH replace it with yacc.
 
 Cool, I did quite like yacc when I needed it- and it does look like we
 have that expressive power now!  Never used Fortran but I did spend a
 couple summers in RPG-2, good riddance to big iron...

Well, formats have gone into a module, and the complex number stuff has
been relegated to the care of a more formal class structure. I was
pulling for COMMON data declarations, but that went to state. 

But at least the Parrot interpreter can use computed gotos... :-)

=Austin



Re: Ruminating RFC 93- alphabet-blind pattern matching

2003-04-02 Thread Yary Hluchan
This isn't quite meaningful. What does a non-letter atom mean?

If you're processing a file or a string, that's the basic P6 model.

But consider \u for unicode -- that's a multi-byte object in the
stream.  So for streams of bytes, the right way is just to code rule
color such that it recognizes them in whatever form -- stringified or
not.

On the other hand, let's suppose that you've got a vast array of
floating point data:

my float @seti = {...evidence of intelligence, somewhere...};

It's a fair question to ask how to retarget the rexengine to use @seti
as the input stream.

I'm asking!  Array of float, int, Colorific, sound_frame, mixed bag
of objects- those are examples of non-letter atoms (what's it mean?
Exactly what I mean to.  Lewis Carroll makes it so easy...)

Let's go back to A5. It defines modifiers for different levels of 
unicode- :u0, :u1, :u2, :u3 - which change what are considered the
atoms to grab and match.  Hence my suggestion for a :ref modifier
to tell the engine to grab references.  :scalar could be used to
grab floats in the seti example.

But what does that mean? Do you want standard patterns so that you
can talk about int patterns and just have them work, or do you want
to change your source from a character stream/string to something
else?

Something else.  In all my examples, I've been binding to a sub, so
that the input stream can be something other than character stream/
strings.
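
To pin down what I mean by binding to a sub, here is a minimal Python sketch, with all names invented for illustration. The "engine" never sees a string: it calls a zero-argument function for the next atom (None means the stream is done) and tests each atom with a predicate. It only handles a fixed-length sequence of assertions, which is far less than a real regex engine, but it shows the data flow.

```python
from collections import deque
from typing import Any, Callable, List, Optional, Sequence

Fetch = Callable[[], Optional[Any]]   # assumed interface: next atom, or None at end
Assertion = Callable[[Any], bool]     # one assertion per atom position


def match_from_sub(fetch: Fetch, pattern: Sequence[Assertion]) -> Optional[List[Any]]:
    """Return the first run of atoms satisfying `pattern` in order, else None."""
    window: deque = deque(maxlen=len(pattern))  # sliding window over the stream
    while True:
        atom = fetch()
        if atom is None:
            return None
        window.append(atom)
        if len(window) == len(pattern) and all(
            check(a) for check, a in zip(pattern, window)
        ):
            return list(window)


def quiet(x: float) -> bool:
    return x < 1.0


def loud(x: float) -> bool:
    return x > 4.0


# Stand-in for the @seti array of floats; the values are made up.
seti = iter([0.1, 0.2, 5.0, 0.3, 6.1, 7.2, 0.1])
found = match_from_sub(lambda: next(seti, None), [quiet, loud, loud])
print(found)
```

The atoms stay floats the whole way through; swapping in Colorific objects or sound frames only changes the assertions.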

-y

~

The Moon is Waxing Crescent (2% of Full)


Ruminating RFC 93- alphabet-blind pattern matching

2003-04-01 Thread Yary Hluchan
A couple nights ago I read RFC93 as discussed in Apoc. 5 and got
fired up- it reminded me of some ideas from when I was hacking
Henry Spencer's regexp package. How to further generalize regular
expression input.  It's a bit orthogonal- a properly implemented
RFC93 makes some difficult things easier- whether it's done as
binding to a sub, or as overloading =~, or whatever.

A very general description of a regular expression, is a program
that seeks a match within a string of letters.  In perl4 the string
of letters was a string of bytes, and in perl6 it's a string of
Unicode (most of the time).

It might as well be a string of *anythings*.  Binding a match against
a sub is a natural way to get the anythings you want to match.  Now,
I'm a newbie to perl6, so be patient with my hacked-up examples below.
They won't work in any language. And, for the first I tweaked RFC93:

  When the match is finished, the subroutine would be called one final
  time, and passed 2 arguments: a flag set to 1, and a list containing
  the unused elements

which I admit is a poor interface- but it lets me write:

  # Looking for luck- find a run of 3 numbers divisible by 7 or 13
  # sub numerology is simply an interface to an array of integers
  sub numerology { $#_ ? shift,unshift @::nums,@_ : splice @::nums,0,@_ }
  numerology =~ / ( !($_[0] % 7 and $_[0] % 13) )3 /;

True, it's easy to join integers with spaces and write an equivalent regexp
on the result- but why stringify when you don't have to?
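
For comparison, the same search done directly on the integers, as a short Python sketch. The helper name find_run and the sample numbers are made up; the predicate is the one from the comment above (divisible by 7 or 13).

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def find_run(atoms: Sequence[T], pred: Callable[[T], bool], length: int) -> Optional[int]:
    """Return the start index of the first run of `length` atoms passing `pred`."""
    run = 0
    for i, atom in enumerate(atoms):
        run = run + 1 if pred(atom) else 0
        if run == length:
            return i - length + 1
    return None


def lucky(n: int) -> bool:
    # Divisible by 7 or by 13; no stringification involved.
    return n % 7 == 0 or n % 13 == 0


nums = [3, 14, 5, 26, 49, 91, 8]
start = find_run(nums, lucky, 3)
print(start, nums[start:start + 3])
```

Here the run is 26, 49, 91, starting at index 3.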

I'm running into trouble here- using ( code ) to match against a single
atom (a number), it should be more character classy.  Assertions are
flexible enough to match all sorts of non-letter atoms, can write a grammar
to make it more readable- maybe something like
  numerology =~ /  divisible(7)divisible(13) 3 /;

Another example.  Let's say there's a class that deals with colors. It has
an operator that returns true if two colors look about the same. Given
a list of color objects, is there a regexp to find a rainbow? Even if the
color class doesn't support stringification? 

A less fanciful example- scan a sound. A very crude beat-finding regexp- 
 fetch_sound_frames =~
  / (   # store soundclip (array of frames) in $1
 (volume(-40db)50,1500) # quietish section, 50-1500 frames
 (volume(-15db)+) # Followed by some loud frame(s)
)   # End capture of the first beat

before # Make sure the loud/quiet pattern repeats,
 [  # but don't require the exact same frames
  volume(-40db)$2.length*.95,$2.length*1.05 
  volume(-15db)$3.length*.95,$3.length*1.05
 ]{3}

  /

The point I'm trying to make:
A regexp is already able to consume different kinds of characters from a
string- :u0, :u1, :u2, :u3- and with RFC93 it can be fed anything a sub
can return.  Those things can be characters- or strings- or stringified if
the regexp requires- but if the regexp doesn't have any strings to match
against, don't bother. Let the assertions get the atoms raw.

Plenty of brilliance on this list, I know I'm not brilliant, especially
when drowsy... did some research before posting but if this has been
covered already (or is completely daft) please face me in the right
direction and shoo me along gently.

-y

~

The Moon is New