Alan Manuel Gloria:
> Actually, I think a tokenizer process will allow us to use a simple
> parser combinator library, which means we don't need to worry about
> calling protocols of productions on Scheme:

As an approach to clarifying the *specification*, particularly for indentation and skipping ;-only lines, that might be sensible. But I also want to make it obvious that an *implementation* does not have to have a separate tokenizer process. The "read" procedure is very basic to any Lisp. Implementors may avoid a notation that seems to *depend* on having a separate tokenizer for its implementation. Thus, I think we need an implementation that does *not* need one.

> everything uses the same
> parser calling protocol, which of course can be based on Monads (^^);

Hmm, I'm concerned that discussing Monads will send 1/4 of our audience running to the hills :-).

> So using a separate tokenizer is clearer IMO.
>
> The only drawback is that we now need to use SAME.

I don't think that's true. The ANTLR implementation simply consumes same-indents and doesn't generate any tokens for them. As long as a tokenizer can just consume a character sequence, generate nothing, and then consume more characters to finally *get* a token, it seems to me it should be fine. Of course, I've been wrong before; if I *am* wrong, I'm curious as to why.

> A precis: the tokenizer is not a separate pass, but rather implemented
> as a stateful procedure that will consume exactly one token on the
> input stream. This allows laziness, which allows us to leave as many
> characters as possible on the port at any one time. The ANTLR
> architecture of having the tokenizer call a stateful parser procedure
> is also possible, but I think it's easier to have the (more complex)
> parser call the tokenizer than the reverse.

(Nitpick: ANTLR calls a stateful lexer, not a parser.) Maybe. Hard to know without comparing.

> And I think we should also formalize the tokenizer, since behavior
> like "comment-only lines are skipped" is NOT explicitly shown in the BNF.

Okay!! This is certainly sensible. What worries me is that this is the kind of thing that is easy to fully describe in English, yet can be tricky to formalize correctly.
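To make the point about SAME concrete, here is a minimal sketch of a pull-style tokenizer that silently consumes a same-indent (and any ;-only line) and only then produces a token. It is written in Python purely for illustration; it is neither the ANTLR grammar nor anything from the draft spec. The class name `Tokenizer`, the method `next_token`, and the token kinds `ATOM`, `EOL`, `INDENT`, `DEDENT` and `EOF` are all hypothetical, atoms are just runs of non-space characters, and blank lines are skipped, which is a simplification rather than the behavior the spec calls for.

```python
import io


class Tokenizer:
    """Hypothetical lazy tokenizer: one call consumes exactly one token."""

    def __init__(self, port):
        self.port = port            # anything with read(1), e.g. io.StringIO
        self.indents = [0]          # stack of open indentation widths
        self.pending = []           # tokens queued by a multi-level dedent
        self.at_line_start = True
        self.peeked = None          # at most one character of lookahead

    def _peek(self):
        if self.peeked is None:
            self.peeked = self.port.read(1)   # "" at end of input
        return self.peeked

    def _advance(self):
        ch = self._peek()
        self.peeked = None
        return ch

    def _skip_to_eol(self):
        while self._peek() not in ("\n", ""):
            self._advance()

    def _finish(self):
        # Close every open indentation level, then report end of input.
        while len(self.indents) > 1:
            self.indents.pop()
            self.pending.append(("DEDENT", None))
        self.pending.append(("EOF", None))
        return self.pending.pop(0)

    def next_token(self):
        if self.pending:
            return self.pending.pop(0)
        while True:
            if self.at_line_start:
                width = 0
                while self._peek() == " ":
                    self._advance()
                    width += 1
                if self._peek() == ";":
                    self._skip_to_eol()        # ;-only line: no token at all
                if self._peek() == "\n":
                    self._advance()            # comment-only or blank line
                    continue
                if self._peek() == "":
                    return self._finish()
                self.at_line_start = False
                if width > self.indents[-1]:
                    self.indents.append(width)
                    return ("INDENT", None)
                if width < self.indents[-1]:
                    # Sketch only: a width that matches no open level
                    # is not reported as an error here.
                    while width < self.indents[-1]:
                        self.indents.pop()
                        self.pending.append(("DEDENT", None))
                    return self.pending.pop(0)
                # Same indentation: fall through without emitting anything.
            while self._peek() == " ":
                self._advance()
            if self._peek() == ";":
                self._skip_to_eol()            # trailing comment
            if self._peek() == "":
                return self._finish()
            if self._peek() == "\n":
                self._advance()
                self.at_line_start = True
                return ("EOL", None)
            atom = []
            while self._peek() not in (" ", "\n", ";", ""):
                atom.append(self._advance())
            return ("ATOM", "".join(atom))


def all_tokens(text):
    """Drain a Tokenizer over text; returns the tokens up to and including EOF."""
    tok = Tokenizer(io.StringIO(text))
    out = []
    while not out or out[-1][0] != "EOF":
        out.append(tok.next_token())
    return out
```

In this sketch the parser would call `next_token()` whenever it needs a token, so nothing beyond one character of lookahead is taken from the port. A line at the same indentation as the previous one shows up to the parser simply as the next `ATOM` after an `EOL`.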
I wasn't trying to be snarky with my comment "look at the ANTLR process"; it turned out to take several tries before I got a clean and at-least-appears-to-be-correct implementation. Granted, the inability of the parser to influence the lexer in ANTLR made it a little more work; a different approach avoids that issue completely. E.g., a traditional recursive descent parser doesn't have that limitation at all.

> Formalizing the tokenizer also allows us to strip away the hspace's in
> the t-expression parsing spec.

I'm very leery of removing the hspace's from the parsing spec. SRFI-49 did that; the resulting BNF was certainly simpler, but it made it *much* more difficult to *correctly* implement the spec. If that all moves into the tokenizer, I'm concerned that it may not be obvious where it happens, especially for people implementing it using traditional recursive descent parsing approaches. I want people to be able to implement code that is "obviously correct"; if the spec is rigged so that implementation is mostly 1-to-1, it'll be easier to accept.

> The overall sweet-reader specifications is split into
> three components:...

Gotta run, family commitments, I'll take a real look later. In the end, though, I suspect having several implementation trials is a good thing. If nothing else, it'll prove that the specification is easy enough to implement several ways.

--- David A. Wheeler