On Jun 27, Steve Fink said:

>On Jun-26, Jeff 'japhy' Pinyan wrote:
>> I am currently completing work on an extensible regex-specific parsing
>> module, Regexp::Parser.  It should appear on CPAN by early July (hopefully
>> under my *new* CPAN ID "JAPHY").
>>
>> Once it is completed, I will be starting work on writing a subclass that
>> matches Perl 6 regexes, Regexp::Perl6 (or Perl6::Regexp, or
>> Perl6::Regexp::Parser).  I think this might be of some use to the Perl 6
>> dev crew, but I'm not sure how.
>
>Sounds interesting, but I'm a bit confused about what it is. Clearly,
>it parses regexes, but what is the output? A parse tree? Tables and/or
>code that implement a matching engine for that regex? PIR? A training
>regimen that can be used to condition a monkey to push a "yes" or "no"
>button whenever you give it a banana with an input string inscribed on
>it?

It creates a tree structure, not identical but similar to the array of
nodes Perl uses internally.

  /a+?(bc|d|e)+/

is represented as

  [
    MINMOD(
      PLUS(
        EXACT('a')
      )
    ),
    PLUS(
      ALT(
        EXACT('bc'),
        ALT(
          EXACT('d'),
          EXACT('e'),
        )
      )
    )
  ]
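
To make the shape of that structure concrete, here is a rough sketch in Python. It is not Regexp::Parser's actual API: the Node class, its field names, and the show() helper are all hypothetical, and the tree is built by hand rather than parsed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node; not the module's real class."""
    kind: str                    # e.g. "EXACT", "PLUS", "ALT", "MINMOD"
    text: Optional[str] = None   # literal text, used only by EXACT-style nodes
    kids: List["Node"] = field(default_factory=list)


def show(node: Node) -> str:
    """Render a node in the nested KIND(...) notation used above."""
    if node.text is not None:
        return f"{node.kind}({node.text!r})"
    return f"{node.kind}({', '.join(show(k) for k in node.kids)})"


# Hand-built tree for /a+?(bc|d|e)+/, mirroring the listing above.
tree = [
    Node("MINMOD", kids=[Node("PLUS", kids=[Node("EXACT", "a")])]),
    Node("PLUS", kids=[
        Node("ALT", kids=[
            Node("EXACT", "bc"),
            Node("ALT", kids=[Node("EXACT", "d"), Node("EXACT", "e")]),
        ]),
    ]),
]

rendered = [show(n) for n in tree]
for line in rendered:
    print(line)
```

The top level is a plain list of sibling nodes, and quantifiers wrap the thing they apply to, so walking the tree is an ordinary recursive descent.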

Two significant differences are that (?ismx-ismx) and (?ismx-ismx:...)
assertions are found explicitly in the parse tree.  To Perl, in-place flag
changing doesn't need to stick around, because it has already had its
effect.  Same deal with (?:...), really; it only serves to separate part
of the regex for a specific operation, and doesn't have a regex opcode.
For my module, though, it's helpful to keep these things around.

That might change before I release the module, though.  I'm not sure yet.
For something like /a(?:b|c)d/, the tree can be

  [
    EXACT('a'),
    ALT(EXACT('b'), EXACT('c')),
    EXACT('d'),
  ]

See, no need for an explicit NOCAPTURE node or something.  And if it were
/a(?i:b|c)d/, then I would do

  [
    EXACT('a'),
    ALT(EXACTF('b'), EXACTF('c')),
    EXACT('d'),
  ]

The trick is, when it comes time to create a physical regex from the tree,
I'd need to create (?flag) things on the fly.  Which probably isn't too
hard.
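
One way the "create (?flag) things on the fly" step might look, sketched in Python rather than Perl. Everything here is an assumption for illustration: nodes are plain tuples, only EXACT, EXACTF, and ALT are handled, and the folds() and emit() helpers are hypothetical names, not part of the module. The idea is to wrap an alternation in (?i:...) when every literal under it is case-insensitive, and otherwise fall back to a plain (?:...) group.

```python
import re


def folds(node):
    """True if every literal under this node is case-insensitive (EXACTF)."""
    kind, *rest = node
    if kind == "EXACT":
        return False
    if kind == "EXACTF":
        return True
    return all(folds(kid) for kid in rest)


def emit(node, folding=False):
    """Turn one tuple node back into regex text, adding flag groups as needed."""
    kind, *rest = node
    if kind in ("EXACT", "EXACTF"):
        text = re.escape(rest[0])
        if kind == "EXACTF" and not folding:
            return f"(?i:{text})"    # lone case-insensitive literal
        if kind == "EXACT" and folding:
            return f"(?-i:{text})"   # case-sensitive literal inside (?i:...)
        return text
    if kind == "ALT":
        # Hoist the flag onto the group when all branches fold case.
        hoist = not folding and folds(node)
        body = "|".join(emit(kid, folding or hoist) for kid in rest)
        return f"(?i:{body})" if hoist else f"(?:{body})"
    raise ValueError(f"unhandled node type: {kind}")


# Tree for /a(?i:b|c)d/ as shown above, with no explicit flag node.
tree = [
    ("EXACT", "a"),
    ("ALT", ("EXACTF", "b"), ("EXACTF", "c")),
    ("EXACT", "d"),
]

pattern = "".join(emit(node) for node in tree)
print(pattern)
```

The same emit() turns the flag-free tree for /a(?:b|c)d/ back into a(?:b|c)d, since an ALT with no case-insensitive branches just gets a non-capturing group.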

Either way, there *will* be (?flag) and (?flag:...) objects created so
that the user knows of their existence during node-by-node parsing.  They
just might not get included in the tree.

-- 
Jeff "japhy" Pinyan      [EMAIL PROTECTED]      http://www.pobox.com/~japhy/
RPI Acacia brother #734   http://www.perlmonks.org/   http://www.cpan.org/
CPAN ID: PINYAN    [Need a programmer?  If you like my work, let me know.]
<stu> what does y/// stand for?  <tenderpuss> why, yansliterate of course.

