If I want to parse a language that is sensitive to whitespace
indentation (e.g. Python, Haskell), how do I do it using P6 rules/grammars?
The way I'd usually handle it is to have a lexer that examines leading
whitespace and converts it into "indent" and "unindent" tokens. The
grammar can then use these tokens in the same way that it would any
other block-delimiter.
This requires a stateful lexer, because to work out the number of
"unindent" tokens on a line, it needs to know what the indentation
positions are. How would I write a P6 rule that defines <indent> and
<unindent> tokens? Alternatively (if a different approach is needed) how
would I use P6 to parse such a language?
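
To make the stateful lexer described above concrete, here is a minimal sketch in Python (not P6, since the P6 form is exactly what is being asked about). The function name `indent_tokens` and the token kinds "indent", "unindent" and "line" are hypothetical names chosen for illustration. It assumes indentation is written with spaces only and that blank lines carry no indentation meaning.

```python
def indent_tokens(source):
    """Yield (kind, text) pairs, where kind is "indent", "unindent" or "line".

    Assumption: indentation uses spaces only; tabs are not treated specially.
    """
    stack = [0]  # the lexer's state: currently open indentation positions
    for line in source.splitlines():
        stripped = line.lstrip(" ")
        if not stripped:
            continue  # blank lines do not open or close blocks
        width = len(line) - len(stripped)
        if width > stack[-1]:
            # Deeper than the innermost open block: open one new block.
            stack.append(width)
            yield ("indent", "")
        else:
            # Shallower: close one block per indentation position we leave.
            while width < stack[-1]:
                stack.pop()
                yield ("unindent", "")
            if width != stack[-1]:
                raise ValueError("inconsistent indentation: %r" % line)
        yield ("line", stripped)
    # Close any blocks still open at end of input.
    while len(stack) > 1:
        stack.pop()
        yield ("unindent", "")
```

The stack is the state in question: a single line can produce several "unindent" tokens, and the count depends on how many remembered positions are deeper than the new line's indentation. A grammar consuming this stream could then treat "indent" and "unindent" like any other pair of block delimiters.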
- Parsing indent-sensitive languages Dave Whipp