Re: P6C: Parser Weirdness

Larry Wall Thu, 13 May 2004 12:22:24 -0700

On Thu, May 13, 2004 at 01:01:12PM -0500, Abhijit A. Mahabal wrote:
: I have been thinking the following about what Larry said earlier. Is this
: what you meant, Larry?
: 
: $grammar = q{
:       class_defn: "class" block .. etc (normal top-down stuff)
:       ...
:       term: { call Parse::Yapp or something }
: }


Er, I've never used Parse::Yapp, so I couldn't say.  I'd tend to roll
my own operator precedence parser anyway.

: What I mean by that weird pseudocode is that everything happens in a
: Parse::RecDescent parser, except that the production of some non-terminals
: calls into a bottom-up parser. That part seems trivially doable. What I
: need to figure out is how the bottom-up parser can relinquish control to
: the top-down parser if it hits a block or such.

The bottom-up parser doesn't know that it is relinquishing control.  It
has to have some kind of tokener/lexer, and that lexer recognizes that it
has a complex term that needs a subparser, calls the parser, and then
returns the subparse as a single token to the bottom-up engine.
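
A minimal sketch of that arrangement, in Python rather than Perl and with
every name hypothetical (PREC, parse_block, tokens, parse_expr).  The
parse_block function is only a stand-in for a real top-down parser, and
the precedence table is invented for the example.  The point is that the
shift/reduce loop never learns a subparse happened:

```python
import re

# Hypothetical precedence table for this sketch; higher binds tighter.
PREC = {"+": 1, "-": 1, "*": 2, "/": 2}
WORD = re.compile(r"\w+")

def parse_block(src, pos):
    """Stand-in for the top-down parser: consume one balanced { ... } block."""
    depth, start = 0, pos
    while pos < len(src):
        if src[pos] == "{":
            depth += 1
        elif src[pos] == "}":
            depth -= 1
            if depth == 0:
                return ("block", src[start + 1:pos].strip()), pos + 1
        pos += 1
    raise SyntaxError("unterminated block")

def tokens(src):
    """Lexer: a whole block is subparsed here and handed back as one term."""
    pos = 0
    while pos < len(src):
        ch = src[pos]
        if ch.isspace():
            pos += 1
        elif ch == "{":
            tree, pos = parse_block(src, pos)
            yield ("term", tree)
        elif ch in PREC:
            yield ("op", ch)
            pos += 1
        else:
            m = WORD.match(src, pos)
            if m is None:
                raise SyntaxError("unexpected character %r" % ch)
            yield ("term", m.group())
            pos = m.end()

def parse_expr(src):
    """Operator precedence engine: it only ever sees terms and operators."""
    terms, ops = [], []

    def reduce():
        op, right, left = ops.pop(), terms.pop(), terms.pop()
        terms.append((op, left, right))

    for kind, value in tokens(src):
        if kind == "term":
            terms.append(value)                      # shift a term
        else:
            while ops and PREC[ops[-1]] >= PREC[value]:
                reduce()                             # reduce tighter operators first
            ops.append(value)                        # shift the operator
    while ops:
        reduce()
    return terms[0]
```

Here parse_expr("a + { x; y } * b") treats the block as an ordinary
operand of "*", exactly as it would treat a plain identifier.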

: A preliminary thought: A traditional bottom-up parser can take the four
: actions shift, reduce, accept, and error. We will need a fifth: call the
: top down parser. I have not thought through how that happens or if it can
: be done at all. It is the only way that I currently see.

No, you still have the four basic actions.  Subparsing is all hidden in
the lexer.  As far as the parser is concerned, it's processing a single
token that happens to be something complicated.  The only complicated
thing is making sure that all the constructs sub-parsed by the lexer
know when they should terminate.  That's trivially easy for bracketed
constructs like blocks.  Things get a little trickier for things like
unparenthesized list operators that have to be treated like a left
parenthesis that has no corresponding right parenthesis.  Things like
semicolon will be slightly context sensitive: outside brackets, they
terminate the current statement, while inside brackets, they separate
slice terms.

And, of course, the lexer has to be sensitive to whether we're currently
expecting a term or an operator...
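
As a rough illustration of that term/operator sensitivity, here is a
small Python sketch (the lex function and its token names are
hypothetical, not anything from the Perl 6 parser).  The same character
is classified differently depending on a single expect_term flag: "/"
opens a pattern where a term is expected but means division where an
operator is expected, and "-" is a prefix operator in term position but
an infix one otherwise.

```python
def lex(src):
    """Tokenize src, tracking whether a term or an operator is expected next."""
    expect_term = True
    out, i = [], 0
    while i < len(src):
        ch = src[i]
        if ch.isspace():
            i += 1
        elif expect_term:
            if ch == "/":
                # In term position, "/" starts a pattern running to the next "/".
                end = src.index("/", i + 1)
                out.append(("regex", src[i + 1:end]))
                i = end + 1
                expect_term = False
            elif ch == "-":
                # In term position, "-" is a prefix operator; a term still follows.
                out.append(("prefix", ch))
                i += 1
            elif ch.isalnum() or ch == "_":
                j = i
                while j < len(src) and (src[j].isalnum() or src[j] == "_"):
                    j += 1
                out.append(("term", src[i:j]))
                i = j
                expect_term = False
            else:
                raise SyntaxError("expected a term at %d" % i)
        elif ch in "+-*/":
            # In operator position, the same characters are infix operators.
            out.append(("infix", ch))
            i += 1
            expect_term = True
        else:
            raise SyntaxError("expected an operator at %d" % i)
    return out
```

With this, lex("a / b") yields an infix "/", while lex("a - /b/") yields
a regex token for the same character.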

: I think I'll try to coax Parse::RecDescent and Parse::Yapp to work
: together. I have never used the latter, but the documentation seems to
: suggest that it can do what Bison can. Or maybe I should write two small
: parsers by hand (one recdescent and the other an operator precedence
: shift/reduce) and make them be nice to each other.

I'd take the latter approach myself, since in any event it will
probably need tweaks that are foreign to whatever tool you choose.
In particular, the fact that Perl 6 uses string comparison rather than
numeric comparison to do precedence levels is going to give almost
any standard tool a hissy fit.  (We do that so that the user never has
to specify a precedence level--all levels are specified relative to
an existing operator's precedence level.  And with strings, we can
fit as many new precedence levels into the interstices as the user
likes without ever running out of numeric precision.)
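
To make the string trick concrete, here is a sketch in Python.  The
between function, the lowercase alphabet, and the starting levels "f"
and "m" are all assumptions made up for the example, not the actual
Perl 6 encoding.  It only shows that a new level can always be squeezed
between two existing ones under plain string comparison:

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def between(lo, hi):
    """Return a string s with lo < s < hi under ordinary string comparison.

    Assumes lowercase a-z levels that never end in 'a' (results never do).
    """
    assert lo < hi
    top = len(ALPHABET)
    prefix, i = "", 0
    while True:
        a = ALPHABET.index(lo[i]) if i < len(lo) else 0
        b = ALPHABET.index(hi[i]) if i < len(hi) else top
        if a == b:                       # still inside the common prefix
            prefix += ALPHABET[a]
            i += 1
        elif b - a > 1:                  # room for a character strictly between
            return prefix + ALPHABET[(a + b) // 2]
        else:                            # adjacent characters: keep lo's, go longer
            prefix += ALPHABET[a]
            i += 1
            break
    while True:                          # now only "greater than the rest of lo" matters
        a = ALPHABET.index(lo[i]) if i < len(lo) else 0
        if a < top - 1:
            return prefix + ALPHABET[(a + top) // 2]
        prefix += ALPHABET[a]            # lo has 'z' here, so copy it and look further
        i += 1

# Hypothetical starting levels; a larger string means tighter binding.
levels = {"+": "f", "*": "m"}
# A user-defined operator that is tighter than "+" but looser than "*".
levels["%%"] = between(levels["+"], levels["*"])
```

Comparing levels is then just levels["+"] < levels["%%"] < levels["*"],
and repeated insertions only make the strings longer instead of
exhausting the gap between two numbers.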

Larry
