On Sun, Sep 7, 2014 at 12:12 PM, David Jeske <[email protected]> wrote:

> If the objection is to a particular imperfection in a context-free
> parser/lexer connection, there are plenty of other designs with different
> imperfections to choose from (see list below).
>
There wasn't a coherent objection. Here are the concerns/issues:

1. I want to stay within a grammar formalism that can test for (and object
to) ambiguous grammars. That pretty much restricts us to LR(k), LL(k), and
LALR(k). While I'm a great fan of Terence Parr's work with ANTLR, it has
become too sophisticated for our purposes.

2. I'd like to stay within the LL(k) family, because *all* production front
ends eventually move to hand-written parsers for the sake of better error
diagnostics, and LL(k) is where you want to be coming from when you do that.

3. I had a vague hope that if the scanner got merged into the parser I
could drop the work on RE processing and simply use an LL(1) parser
generator (which is essentially done) until something pushed us to LL(k).
That was the main motivation behind asking whether the phase separation was
useful.
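
As a rough illustration of the check mentioned in item 1, here is a minimal
Python sketch of LL(1) conflict detection. It is not the parser generator
discussed in this thread; the grammar encoding (a dict mapping each
nonterminal to a list of tuples of symbols) and the names `ll1_conflicts`
and `first_of_seq` are assumptions made for the example. Note that the test
is conservative: a grammar with no conflicts is unambiguous, but a grammar
with conflicts is merely rejected, and may or may not be truly ambiguous.

```python
EPS = ""   # marker for "derives the empty string" inside FIRST sets
END = "$"  # end-of-input marker placed in FOLLOW(start)


def first_of_seq(seq, first, nonterminals):
    """FIRST set of a symbol sequence; includes EPS if all of it can be empty."""
    result = set()
    for sym in seq:
        sym_first = first[sym] if sym in nonterminals else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def ll1_conflicts(grammar, start):
    """Return (nonterminal, token, alt_a, alt_b) for every LL(1) conflict."""
    nts = set(grammar)
    first = {nt: set() for nt in nts}
    follow = {nt: set() for nt in nts}
    follow[start].add(END)

    # Iterate FIRST and FOLLOW together until neither changes.
    changed = True
    while changed:
        changed = False
        for nt, alts in grammar.items():
            for alt in alts:
                f = first_of_seq(alt, first, nts)
                if not f <= first[nt]:
                    first[nt] |= f
                    changed = True
                for i, sym in enumerate(alt):
                    if sym not in nts:
                        continue
                    rest = first_of_seq(alt[i + 1:], first, nts)
                    add = rest - {EPS}
                    if EPS in rest:
                        add |= follow[nt]
                    if not add <= follow[sym]:
                        follow[sym] |= add
                        changed = True

    # Two alternatives of one nonterminal conflict if their predict sets overlap.
    conflicts = []
    for nt, alts in grammar.items():
        seen = {}
        for idx, alt in enumerate(alts):
            f = first_of_seq(alt, first, nts)
            predict = f - {EPS}
            if EPS in f:
                predict |= follow[nt]
            for tok in predict:
                if tok in seen:
                    conflicts.append((nt, tok, seen[tok], idx))
                else:
                    seen[tok] = idx
    return conflicts


# A small expression grammar that is LL(1); () is the empty alternative.
EXPR = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T": [("id",)],
}

# The classic dangling-else shape, which an LL(1) check rejects.
DANGLING = {
    "S": [("if", "S"), ("if", "S", "else", "S"), ("x",)],
}
```

Calling `ll1_conflicts(EXPR, "E")` reports no conflicts, while
`ll1_conflicts(DANGLING, "S")` reports that alternatives 0 and 1 of `S` both
start with `if`.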

In the end, I think the phase separation is useful for a pragmatic reason:
it isolates tokenization from unintended parser-driven context dependency.
That's good because any such dependency in the tokenizer ties us to parse
algorithms that can supply the necessary context information. For the
moment I'd prefer to stay agnostic on that.


One thing that I *could* do at this point is to defer automatic tokenizer
synthesis and just produce token numbers. We could do a hand-written
tokenizer for now, and then someone other than me could back-fill the
tokenizer synthesis part. It turns out that tokenization is a little
different from vanilla RE processing.
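
The message does not say exactly how tokenization differs from vanilla RE
processing. One commonly cited difference is that a tokenizer runs many
patterns at once, takes the longest match, breaks ties by rule priority, and
reports *which* rule matched as a token number. The Python sketch below
shows a hand-written tokenizer of that kind. The token names, their numbers,
and the `tokenize` function are hypothetical and are not the API referred to
in this thread.

```python
import re

# Hypothetical token numbers; names and values are illustrative only.
TOK_EOF, TOK_IF, TOK_IDENT, TOK_INT, TOK_EQ, TOK_EQEQ = range(6)

# Rules in priority order: when two rules match the same length,
# the one listed first wins (so the keyword "if" beats the identifier rule).
RULES = [
    (TOK_IF, re.compile(r"if")),
    (TOK_IDENT, re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    (TOK_INT, re.compile(r"[0-9]+")),
    (TOK_EQEQ, re.compile(r"==")),
    (TOK_EQ, re.compile(r"=")),
]
WHITESPACE = re.compile(r"[ \t\r\n]+")


def tokenize(text):
    """Return a list of (token_number, lexeme) pairs, ending with TOK_EOF."""
    pos = 0
    out = []
    while pos < len(text):
        ws = WHITESPACE.match(text, pos)
        if ws:
            pos = ws.end()
            continue
        # Try every rule at this position and keep the longest match.
        # The strict ">" keeps the earlier rule when lengths are equal.
        best_tok, best_end = None, pos
        for tok, pattern in RULES:
            m = pattern.match(text, pos)
            if m and m.end() > best_end:
                best_tok, best_end = tok, m.end()
        if best_tok is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        out.append((best_tok, text[pos:best_end]))
        pos = best_end
    out.append((TOK_EOF, ""))
    return out
```

Note that the tokenizer takes only the input text: nothing about the parser's
state influences which token is produced, which is the phase separation
argued for above.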


If we happen to have someone who wants to step up and take over the
tokenizer problem, and is willing to donate their code back, I think I'm
now at the point where I can say what API I think I want (subject to
improvement on review, of course). That would free me to work on the parser
generator. I should be able to build a front end to exercise the API very
quickly, which would give us a test path.

Anybody feel like collaborating on that? A bunch of the work is done, but
the part that remains is actually kind of interesting...


shap
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
