----- Original Message -----
From: "Jonathan Scott Duff" <[EMAIL PROTECTED]>
Subject: Re: XML/HTML-specific ?< and ?> operators? (was Re: RFC 145
(alternate approach))


> How about qy() for Quote Yacc  :-)  This stuff is starting to look
> more and more like we're trying to fold lex and yacc into perl.  We
> already have lex through (?{code}) in REs, but we have to hand-write
> our own yacc-a-likes.

Though you can do cool stuff in (?{code}), I wouldn't quite call it lex.
First off, we're dealing with an NFA instead of a DFA, and at the very least
that gives you backtracking.  True, local lets you preserve state to some
degree, since localized changes are undone when the regex backtracks.  But
the following is as close as I can come to treating (?{code}) as a lexer:

sub lex_init {
    my $str = shift;
    our @tokens;
    $str =~ / \G (?{ local @tokens; })
        (?: TokenDelim(\d+) (?{ push @tokens, [ 'digit', $1 ] })
          | TokenDelim(\w+) (?{ push @tokens, [ 'word', $1 ] })
        )
    /gx;
}

sub getNextToken { shift @tokens; }

I'm not even suggesting this is a good design.  Just showing how awkward it
is.
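
For comparison, here is a rough sketch of the more conventional idiom for
the same two token types: anchor each attempt with \G and use /gc so a
failed alternative leaves pos() where it was.  The tokenize() name and the
whitespace-skipping rule are my own assumptions for illustration, not
anything proposed in the RFC.

```perl
use strict;
use warnings;

# Hypothetical helper: split a string into [type, text] tokens.
sub tokenize {
    my ($str) = @_;
    my @tokens;
    pos($str) = 0;
    while (pos($str) < length $str) {
        # /c keeps pos() unchanged when an attempt fails, so the
        # next alternative starts from the same offset.
        if    ($str =~ /\G\s+/gc)   { next }                          # skip whitespace
        elsif ($str =~ /\G(\d+)/gc) { push @tokens, ['digit', $1] }
        elsif ($str =~ /\G(\w+)/gc) { push @tokens, ['word',  $1] }
        else  { die "unexpected character at offset " . pos($str) . "\n" }
    }
    return @tokens;
}

my @tokens = tokenize("foo 42 bar7");
print join(" ", map { "$_->[0]:$_->[1]" } @tokens), "\n";
# prints "word:foo digit:42 word:bar7"
```

Note that it still wants the whole string in memory up front.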

Another problem with lexing in Perl is that you pretty much need the
entire string before you begin processing, while a good lexer only needs the
next character.  Ideally, the input is a character stream.  Already we're
talking about a lot of alteration and work here.  Not something I'd be crazy
about putting into the core.
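
To make the "only needs the next character" point concrete, here is a
minimal sketch of a hand-rolled lexer that pulls one character at a time
from a filehandle with getc and keeps a single character of lookahead.
The make_lexer() name and the token classes are assumptions for
illustration, and the in-memory filehandle in the demo assumes Perl 5.8
or later; any real stream would work the same way.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that yields the next
# [type, text] token from $fh, or undef at end of input.
sub make_lexer {
    my ($fh) = @_;
    my $peek = getc($fh);    # one character of lookahead
    return sub {
        # Skip whitespace between tokens.
        $peek = getc($fh) while defined $peek && $peek =~ /\s/;
        return unless defined $peek;

        # The first character decides the token class.
        my ($type, $class) = $peek =~ /\d/ ? ('digit', qr/\d/)
                           : $peek =~ /\w/ ? ('word',  qr/\w/)
                           :                 ('other', undef);
        my $text = $peek;
        $peek = getc($fh);

        # Extend the token while the lookahead stays in the same class.
        if (defined $class) {
            while (defined $peek && $peek =~ $class) {
                $text .= $peek;
                $peek = getc($fh);
            }
        }
        return [$type, $text];
    };
}

my $input = "foo 42 bar7";
open my $fh, '<', \$input or die "cannot open in-memory handle: $!";
my $next = make_lexer($fh);
while (my $tok = $next->()) {
    print "$tok->[0]:$tok->[1]\n";
}
```

It never looks further ahead than one character, which is the property the
regex engine can't give you without buffering the input first.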

-Michael


