Hi Folks,
I was interested in generating parsers in Scheme, so I had a look at (system
base lalr). I was not crazy about the syntax: it's not really Scheme-like and
does not support in-rule actions (e.g., "foo { $$ = $3 + $2; } bar"). In
addition, the code is, according to its author, a direct C-to-Scheme
translation from the Bison source, and makes heavy use of defmacro.
So, just for fun, I have started coding my own LALR parser generator. (Many
years ago I read the Dragon Book cover-to-cover and worked on my own yacc in
C.) Here is an example illustrating the basic syntax I have designed for a
parser specification:
  (lalr1-spec
   (token integer float)
   (start expr)
   (grammar
    (expr
     (expr #\+ term (+ $1 $3))
     (expr #\- term (- $1 $3)))
    (term
     (term #\* factor (* $1 $3))
     (term #\/ factor (/ $1 $3)))
    (factor (integer) (float))))
Note that right-hand-side elements enclosed in parentheses are interpreted as
actions. I am not sure this will stay.
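To make the convention concrete, here is a minimal sketch of how one
right-hand side could be split into its symbols and its actions. The name
split-rhs is hypothetical and used only for illustration; it is not part of
(system base lalr) or of the generator described here, and it ignores where
in the rule an action appeared.

```scheme
;; Sketch only: split one right-hand side into symbols and actions.
;; Any element that is a non-empty list is treated as an action.
(define (split-rhs rhs)
  (let loop ((elts rhs) (syms '()) (acts '()))
    (cond
     ((null? elts)
      (values (reverse syms) (reverse acts)))
     ((pair? (car elts))
      (loop (cdr elts) syms (cons (car elts) acts)))
     (else
      (loop (cdr elts) (cons (car elts) syms) acts)))))

;; Returns two values: (expr #\+ term) and ((+ $1 $3))
(split-rhs '(expr #\+ term (+ $1 $3)))
```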
In the code I will need to walk through the productions, checking whether
each symbol is a terminal (i.e., declared with "token"). This could end up
being inefficient. To speed this up I am considering using keywords (e.g.,
testing for a terminal with "keyword?"). However, I wonder if using keywords
instead of "token" declarations would be considered "bad form." For example,
in the above, replace
  (token integer float)
  ...
  (factor (integer) (float))
with
  (factor (#:integer) (#:float))
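For comparison, the two ways of testing for a terminal might look roughly
like the sketch below. It assumes Guile; make-terminal-test, terminal? and
terminal-kw? are hypothetical names chosen for illustration, and treating
character literals as terminals is an assumption drawn from the example
grammar. Variant (a) keeps the "token" declaration but loads the names into
a hash table once, so each later test is a single lookup rather than a list
search.

```scheme
;; Sketch only: two ways to test whether a grammar symbol is a terminal.

;; (a) Keep the "token" declaration; build a lookup table once.
(define (make-terminal-test tokens)
  (let ((table (make-hash-table)))
    (for-each (lambda (tok) (hashq-set! table tok #t)) tokens)
    (lambda (sym)
      ;; Assumption: character literals such as #\+ count as terminals.
      (or (char? sym) (hashq-ref table sym #f)))))

(define terminal? (make-terminal-test '(integer float)))

;; (b) Mark terminals as keywords; no table is needed.
(define (terminal-kw? sym)
  (or (char? sym) (keyword? sym)))
```

With (a), (terminal? 'integer) is true and (terminal? 'expr) is false; with
(b), (terminal-kw? #:integer) is true and (terminal-kw? 'expr) is false.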
Comments?
Matt