The ANTLR v3 book ("The Definitive ANTLR Reference") specifically discusses how to process indentation, which I thought was a good thing.
But now that I'm looking at it more carefully (page 95), I'm realizing that the book is just wrong. The book proposes triggering indent processing when there are 1-or-more indent characters on a line. That cannot possibly work: DEDENTs would never be generated on blank lines or on lines that do not begin with an indent character.

Instead, in the ANTLR implementation, we should trigger indent processing as part of EOL processing. After processing an EOL, read in any following indentation, and emit INDENT/DEDENT from that. That fails to deal with indents at the beginning of a file, but we can detect and process that specially. It's basically what we do now. It requires that ANTLR be configured to allow multiple emits per lexical token, but that only required a few lines of code from its FAQ (which I've already added).

That algorithm also fails to deal with EOF without a preceding EOL. We could deal with that as a special case too, though I'm inclined to just forbid it in the spec. Handling EOF without a preceding EOL is ugly in general, and it's not something you see in practice in source or structured data.

--- David A. Wheeler

_______________________________________________
Readable-discuss mailing list
Readable-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/readable-discuss
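
For illustration only, here is a rough sketch of the EOL-triggered approach described in this message. It is plain Python, not ANTLR grammar code and not the project's actual implementation. The names (tokenize, read_indent, the TEXT token kind) are made up for the example, and it assumes each space or tab counts as one column and that input without a final EOL is simply rejected.

```python
def tokenize(text):
    """Return (kind, value) tokens; INDENT/DEDENT are decided right after each EOL."""
    if text and not text.endswith("\n"):
        # Assumption: EOF without a preceding EOL is forbidden rather than special-cased.
        raise ValueError("input must end with an EOL")

    tokens = []
    stack = [0]  # indentation widths currently open, outermost first

    def read_indent(pos):
        """Consume indent characters at pos, emit INDENT/DEDENT, return the new position."""
        start = pos
        while pos < len(text) and text[pos] in " \t":
            pos += 1
        width = pos - start  # assumption: each space or tab counts as one column
        if width > stack[-1]:
            stack.append(width)
            tokens.append(("INDENT", width))
        else:
            # Runs even when width == 0, so blank and unindented lines produce DEDENTs.
            while width < stack[-1]:
                stack.pop()
                tokens.append(("DEDENT", stack[-1]))
            if width != stack[-1]:
                raise ValueError("dedent does not match any open indentation level")
        return pos

    pos = read_indent(0)  # special case: indentation at the very beginning of the file
    while pos < len(text):
        if text[pos] == "\n":
            tokens.append(("EOL", "\n"))
            pos = read_indent(pos + 1)  # a single EOL may lead to several emitted tokens
        else:
            end = text.index("\n", pos)  # always found, since text ends with an EOL
            tokens.append(("TEXT", text[pos:end]))
            pos = end
    return tokens
```

Because the indentation check runs after every EOL, including the last one, any open levels are closed at the end of input without extra EOF handling. That is part of why forbidding EOF without a preceding EOL keeps things simple.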