On Jun 27, Steve Fink said:

>On Jun-26, Jeff 'japhy' Pinyan wrote:
>> I am currently completing work on an extensible regex-specific parsing
>> module, Regexp::Parser.  It should appear on CPAN by early July (hopefully
>> under my *new* CPAN ID "JAPHY").
>>
>> Once it is completed, I will be starting work on writing a subclass that
>> matches Perl 6 regexes, Regexp::Perl6 (or Perl6::Regexp, or
>> Perl6::Regexp::Parser).  I think this might be of some use to the Perl 6
>> dev crew, but I'm not sure how.
>
>Sounds interesting, but I'm a bit confused about what it is. Clearly,
>it parses regexes, but what is the output? A parse tree? Tables and/or
>code that implement a matching engine for that regex? PIR? A training
>regimen that can be used to condition a monkey to push a "yes" or "no"
>button whenever you give it a banana with an input string inscribed on
>it?

It creates a tree structure, similar (but not identical) to the array of
nodes Perl uses internally.

  /a+?(bc|d|e)+/

is represented as

  [
    MINMOD(
      PLUS(
        EXACT('a')
      )
    ),
    PLUS(
      ALT(
        EXACT('bc'),
        ALT(
          EXACT('d'),
          EXACT('e'),
        )
      )
    )
  ]

Two significant differences are that (?ismx-ismx) and (?ismx-ismx:...)
assertions are found explicitly in the parse tree.  To Perl, an in-place
flag change doesn't need to stick around, because it has already had its
effect by the time the regex is compiled.  Same deal with (?:...), really;
it only serves to separate part of the regex for a specific operation, and
doesn't have a regex opcode of its own.

For my module, though, it's helpful to keep these things around.  That
might change before I release the module, though.  I'm not sure yet.  For
something like /a(?:b|c)d/, the tree can be

  [
    EXACT('a'),
    ALT(EXACT('b'), EXACT('c')),
    EXACT('d'),
  ]

See, no need for an explicit NOCAPTURE node or something.  And if it were
/a(?i:b|c)d/, then I would do

  [
    EXACT('a'),
    ALT(EXACTF('b'), EXACTF('c')),
    EXACT('d'),
  ]

The trick is, when it comes time to create a physical regex from the tree,
I'd need to create the (?flag) things on the fly.  Which probably isn't too
hard.

Either way, there *will* be (?flag) and (?flag:...) objects created so that
the user knows of their existence during node-by-node parsing.  They just
might not get included in the tree.

-- 
Jeff "japhy" Pinyan      [EMAIL PROTECTED]      http://www.pobox.com/~japhy/
RPI Acacia brother #734   http://www.perlmonks.org/   http://www.cpan.org/
CPAN ID: PINYAN    [Need a programmer?  If you like my work, let me know.]
<stu> what does y/// stand for?  <tenderpuss> why, yansliterate of course.
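
To make the "create (?flag) things on the fly" step a bit more concrete,
here is a rough sketch of walking such a tree and turning it back into a
regex string.  It is written in Python purely for illustration and is not
Regexp::Parser's API: the tuple node shapes, the names render(),
quantifiable() and render_tree(), and the choice to emit every ALT as a
non-capturing group are all assumptions made for this example.

```python
import re

# Hypothetical node shapes, mirroring the notation used in the mail:
#   ("EXACT", "a")      literal, case-sensitive
#   ("EXACTF", "b")     literal, case-insensitive (folded)
#   ("ALT", x, y, ...)  alternation of branches
#   ("PLUS", x)         one or more of x
#   ("MINMOD", x)       make the quantifier x non-greedy


def render(node):
    """Turn one node back into regex source text."""
    kind = node[0]
    if kind == "EXACT":
        return re.escape(node[1])
    if kind == "EXACTF":
        # No flag node exists in the tree, so the (?i:...) wrapper is
        # created here, at output time.
        return "(?i:" + re.escape(node[1]) + ")"
    if kind == "ALT":
        # Assumption: alternations are emitted as non-capturing groups.
        return "(?:" + "|".join(render(branch) for branch in node[1:]) + ")"
    if kind == "PLUS":
        return quantifiable(node[1]) + "+"
    if kind == "MINMOD":
        return render(node[1]) + "?"
    raise ValueError("unknown node type: %r" % (kind,))


def quantifiable(node):
    """Render a node so that a following quantifier applies to all of it."""
    text = render(node)
    already_grouped = node[0] in ("ALT", "EXACTF")
    single_char = node[0] == "EXACT" and len(node[1]) == 1
    if already_grouped or single_char:
        return text
    return "(?:" + text + ")"


def render_tree(tree):
    """Concatenate the rendered top-level nodes of a tree."""
    return "".join(render(node) for node in tree)


# The tree given above for /a(?i:b|c)d/, with the flag folded into EXACTF.
folded = [
    ("EXACT", "a"),
    ("ALT", ("EXACTF", "b"), ("EXACTF", "c")),
    ("EXACT", "d"),
]
print(render_tree(folded))
```

The output is not character-for-character the original regex, since the
flag group gets pushed down onto each folded literal, but it should match
the same strings.  That is the trade-off of dropping the (?flag:...) node
from the tree: the information survives, the original spelling does not.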