A couple nights ago I read RFC93 as discussed in Apoc. 5 and got
fired up- it reminded me of some ideas from when I was hacking
Henry Spencer's regexp package: how to further generalize regular
expression input.  It's a bit orthogonal- a properly implemented
RFC93 makes some difficult things easier- whether it's done as
binding to a sub, or as overloading =~, or whatever.

A very general description of a regular expression is a program
that seeks a match within a string of letters.  In perl4 the string
of letters was a string of bytes, and in perl6 it's a string of
Unicode (most of the time).

It might as well be a string of *anythings*.  Binding a match against
a sub is a natural way to get the anythings you want to match.  Now,
I'm a newbie to perl6, so be patient with my hacked-up examples below.
They won't work in any language. And for the first one I tweaked RFC93:

  When the match is finished, the subroutine would be called one final
  time, and passed >1 arguments: a flag set to 1, and a list containing
  the "unused" elements

which I admit is a poor interface- but it lets me write:

  # Looking for luck- find a run of 3 numbers divisible by 7 or 13
  # "sub numerology" is simply an interface to an array of integers
  sub numerology {
    $#_ ? (shift, unshift @::nums, @_)  # final call: drop flag, put back unused
        : splice @::nums, 0, $_[0]      # normal call: hand over $_[0] numbers
  }
  &numerology =~ / <( !($_[0] % 7 and $_[0] % 13) )><3> /;

True, it's easy to join integers with spaces and write an equivalent regexp
on the result- but why stringify when you don't have to?
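
For anyone who wants to poke at the idea in something that runs today,
here is a minimal Python sketch of the fetch/push-back protocol described
above. It is not RFC93 and not Perl 6: make_source, find_run, lucky and
the chunk size are all made-up names for illustration, and the "final
call" is modelled as an optional unused= argument instead of a flag.

```python
from typing import Callable, List, Optional


def make_source(items: List[int]) -> Callable[..., List[int]]:
    """Hypothetical stand-in for `sub numerology`: hands out atoms on request."""
    pool = list(items)

    def source(n: int = 1, unused: Optional[List[int]] = None) -> List[int]:
        if unused is not None:      # final call: put unread atoms back in front
            pool[:0] = unused
            return []
        taken = pool[:n]            # normal call: give up to n atoms
        del pool[:n]
        return taken

    source.pool = pool              # exposed only so the example can inspect it
    return source


def lucky(n: int) -> bool:
    """The assertion from the example: divisible by 7 or by 13."""
    return n % 7 == 0 or n % 13 == 0


def find_run(source, pred, length: int, chunk: int = 4):
    """Return the first run of `length` atoms satisfying pred, or None."""
    buf: List[int] = []             # atoms fetched so far
    run = 0                         # length of the matching run ending at buf[i-1]
    i = 0
    while True:
        if i == len(buf):           # need more input: ask the source
            more = source(chunk)
            if not more:
                return None         # source exhausted, no match
            buf.extend(more)
        run = run + 1 if pred(buf[i]) else 0
        i += 1
        if run == length:
            source(0, unused=buf[i:])   # hand back what was fetched but not used
            return buf[i - length:i]
```

Called on a source built from [3, 14, 26, 5, 7, 13, 91, 8, 9], find_run
with lucky and a length of 3 never turns a number into text; it only asks
each atom whether it satisfies the assertion.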

I'm running into trouble here- using <( code )> to match against a single
"atom" (a number) works, but it should be more "character classy".
Assertions are flexible enough to match all sorts of non-letter atoms, and
one could write a grammar to make it more readable- maybe something like
  &numerology =~ / < <divisible(7)><divisible(13)> ><3> /;
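
One way to picture the "character classy" version, again as a hedged
Python sketch and not as proposed syntax: treat each named assertion as
a predicate on one atom, and a class as the "or" of several predicates.
The names divisible, any_of and search_run are invented here.

```python
def divisible(k):
    """Named assertion: true for atoms that are multiples of k."""
    return lambda n: n % k == 0


def any_of(*preds):
    """A 'character class' over arbitrary atoms: matches if any member does."""
    return lambda atom: any(p(atom) for p in preds)


def search_run(atoms, cls, count):
    """Index where the first run of `count` atoms in class `cls` starts, or None."""
    run = 0
    for i, atom in enumerate(atoms):
        run = run + 1 if cls(atom) else 0
        if run == count:
            return i - count + 1
    return None


# Rough analogue of < <divisible(7)><divisible(13)> ><3>
lucky = any_of(divisible(7), divisible(13))
```

The class is built once and reused, the same way a bracketed character
class would be, but its members are tests on numbers instead of letters.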

Another example.  Let's say there's a class that deals with colors. It has
an operator that returns true if two colors look about the same. Given
a list of color objects, is there a regexp to find a rainbow? Even if the
color class doesn't support stringification? 
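
To make the rainbow question concrete, here is a small Python sketch
under loud assumptions: the Color class, its looks_like test, the
tolerance of 60 and the six reference bands are all invented for this
example. The pattern is roughly "red+ orange+ yellow+ green+ blue+
violet+", matched greedily with no backtracking, which is only safe
because no color here looks like two neighbouring bands at once.

```python
class Color:
    """Hypothetical color object; nothing below ever turns it into a string."""

    def __init__(self, r, g, b):
        self.r, self.g, self.b = r, g, b

    def looks_like(self, other, tol=60):
        """The 'about the same' operator, here a crude sum of channel distances."""
        return (abs(self.r - other.r) + abs(self.g - other.g)
                + abs(self.b - other.b)) <= tol


# Assumed reference bands: red, orange, yellow, green, blue, violet
RAINBOW = [Color(255, 0, 0), Color(255, 165, 0), Color(255, 255, 0),
           Color(0, 128, 0), Color(0, 0, 255), Color(148, 0, 211)]


def find_rainbow(colors, bands=RAINBOW):
    """Return (start, end) of the first slice matching band1+ band2+ ..., or None."""
    for start in range(len(colors)):
        pos = start
        for band in bands:
            begin = pos
            while pos < len(colors) and colors[pos].looks_like(band):
                pos += 1            # greedy: swallow every color in this band
            if pos == begin:        # band missing, so no rainbow starts here
                break
        else:
            return (start, pos)
    return None
```

The only thing the matcher asks of an atom is the comparison operator
the class already provides.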

A less fanciful example- scan a sound. A very crude beat-finding regexp- 
 &fetch_sound_frames =~
  / (                           # store soundclip (array of frames) in $1
     (<volume(-40db)><50,1500>) # quietish section, 50-1500 frames
     (<volume(-15db)>+)         # Followed by some loud frame(s)
    )                           # End capture of the first beat

    <before                     # Make sure the loud/quiet pattern repeats,
     [                          # but don't require the exact same frames
      <volume(-40db)><$2.length*.95,$2.length*1.05> 
      <volume(-15db)><$3.length*.95,$3.length*1.05>
     ]<3>
    >
  /
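
A runnable approximation of the beat finder, in Python, for comparison.
It is a sketch with several assumptions that the pattern above does not
settle: frames are plain dB numbers, volume(-40db) is read as "at or
below -40 dB", volume(-15db) as "at or above -15 dB", and matching works
on maximal runs of quiet/loud frames instead of backtracking frame by
frame. All function names are invented.

```python
from itertools import groupby

QUIET_DB = -40.0    # assumption: "quietish" means at or below this level
LOUD_DB = -15.0     # assumption: "loud" means at or above this level


def classify(level):
    if level <= QUIET_DB:
        return "quiet"
    if level >= LOUD_DB:
        return "loud"
    return "other"


def runs(levels):
    """Collapse frames into (kind, start, length) for each maximal run."""
    out, pos = [], 0
    for kind, group in groupby(levels, key=classify):
        n = sum(1 for _ in group)
        out.append((kind, pos, n))
        pos += n
    return out


def close(n, ref, tol=0.05):
    """True if n is within +/- 5% of ref (the $2.length*.95 .. *1.05 test)."""
    return ref * (1 - tol) <= n <= ref * (1 + tol)


def find_beat(levels, min_quiet=50, max_quiet=1500, repeats=3):
    """Return the (start, end) frame span of the first beat, or None."""
    r = runs(levels)
    need = 2 * (repeats + 1)        # the beat itself plus the repeats after it
    for i in range(len(r) - need + 1):
        window = r[i:i + need]
        if [k for k, _, _ in window] != ["quiet", "loud"] * (repeats + 1):
            continue
        start, quiet_len = window[0][1], window[0][2]
        loud_len = window[1][2]
        if not min_quiet <= quiet_len <= max_quiet:
            continue
        # Lookahead: each later quiet/loud pair must have similar lengths
        if all(close(window[j][2], quiet_len) and close(window[j + 1][2], loud_len)
               for j in range(2, need, 2)):
            return (start, start + quiet_len + loud_len)
    return None
```

Only the first quiet/loud pair is reported, like $1 in the pattern; the
three following pairs are checked but not consumed.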

The point I'm trying to make:
A regexp is already able to consume different kinds of characters from a
string- :u0, :u1, :u2, :u3- and with RFC93 it can be fed anything a sub
can return.  Those things can be characters- or strings- or stringified if
the regexp requires- but if the regexp doesn't have any strings to match
against, don't bother. Let the assertions get the atoms raw.

Plenty of brilliance on this list, I know I'm not brilliant, especially
when drowsy... did some research before posting but if this has been
covered already (or is completely daft) please face me in the right
direction and shoo me along gently.

-y

~~~~~

The Moon is New
