Re: [bitc-dev] Why is shap farting around with parser generators?

Jonathan S. Shapiro Sat, 06 Sep 2014 11:22:58 -0700

On Fri, Sep 5, 2014 at 6:31 PM, Ben Kloosterman <[email protected]> wrote:


> IMHO a better test would be write it ASAP get the language out , then see
> how hard it is to re-factor when you self host. I bet that will change the
> whole way its structured and force language changes to make it easier.
>

How would you go about that? Remember that we've already done a C++-hosted
version of this compiler using YACC.


> "*ad hoc* parse rules and/or parse ambiguities." are common in nearly all
> languages - with good reason they have taken short cuts to get something
> out the door.
>

For many languages that is so. *None* of those are languages about which
anyone has successfully done regular formal reasoning.


> Javascript and Linux are  a great examples -our world is filled with
> second rate products because the better ones never get finished.
>

Agreed. And if their approach had led to sufficiently survivable systems,
we wouldn't be attempting to do BitC at all. But it didn't, did it?


> The whole purpose of modern languages / techniques is to be able to
> re-factor and improve it...
>

That's just not true. But it *is* true that incremental re-factoring and
improvement in place of disciplined design has been the fad in computing
for the last ten years. That has happened for reasons of economic and
competitive pressure rather than technical merit. It has left us with
systems that are not just insecure, but *not securable in principle*. BitC
is a step on the path to changing that. Its goals can't be met by making
it up as we go along.


> ( which is the main reason the parser and tokeniser are split - while its
> worse / less efficient than a combined custom implementation it does allow
> better / simpler code ).
>

Can you cite a source for that (fairly implausible) assertion?

Historically, both parts were done by hand, and they *were* fused. Parser
generators came about because maintaining parsers by hand is too hard to
allow reasonable maintenance or refactoring. Lexer generators came along
with the same idea, but didn't prove to add enough value in practice to
justify themselves.

The usual evolution of these things is that languages use a parser
generator until they petrify, and then shift to a recursive descent
hand-written parser for the sake of better error handling. This is actually
why I'm doing an LL grammar rather than an LR grammar; the LL grammar can
emit a recursive descent parser directly.
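
To make the LL-to-recursive-descent point concrete, here is a minimal
sketch in Python. It is not BitC code and not the output of any real
generator. The toy grammar and the names (tokenize, Parser, parse) are
assumptions made for illustration. Each nonterminal becomes one method,
and one token of lookahead picks the production:

    expr   ::= term (('+' | '-') term)*
    term   ::= factor ('*' factor)*
    factor ::= NUMBER | '(' expr ')'

```python
import re

# Hypothetical toy lexer: integers and single-character operators only.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|(.))")


def tokenize(text):
    tokens = []
    for number, op in TOKEN_RE.findall(text):
        if number:
            tokens.append(("NUM", int(number)))
        elif op.strip():
            tokens.append((op, op))
    tokens.append(("EOF", None))
    return tokens


class Parser:
    """One method per nonterminal; one token of lookahead picks the rule."""

    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos][0]

    def advance(self):
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def expect(self, kind):
        if self.peek() != kind:
            raise SyntaxError(
                f"expected {kind!r}, found {self.peek()!r} at token {self.pos}"
            )
        return self.advance()

    def expr(self):
        # expr ::= term (('+' | '-') term)*
        value = self.term()
        while self.peek() in ("+", "-"):
            op, _ = self.advance()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # term ::= factor ('*' factor)*
        value = self.factor()
        while self.peek() == "*":
            self.advance()
            value = value * self.factor()
        return value

    def factor(self):
        # factor ::= NUMBER | '(' expr ')'
        if self.peek() == "NUM":
            return self.advance()[1]
        if self.peek() == "(":
            self.advance()
            value = self.expr()
            self.expect(")")
            return value
        raise SyntaxError(
            f"expected a number or '(', found {self.peek()!r} at token {self.pos}"
        )


def parse(text):
    parser = Parser(text)
    value = parser.expr()
    parser.expect("EOF")
    return value
```

Because every error is raised from a method that knows which construct it
was parsing, the messages can name what was expected at that point, which
is the error-handling advantage mentioned above.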


One of the features that I initially resisted in BitC was layout. It turns
out that layout is useful, but its implementation in most languages is a
*mess*. Most of the popular layout schemes violate the parser/tokenizer
phase boundary. The most notorious of these is the Haskell reliance on
parse-error(t) in the L function (L, incidentally, may be the most opaque
specification I've ever seen).
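
For contrast, here is a minimal sketch of a layout scheme that stays on
the tokenizer side of the phase boundary. It follows the Python-style
indentation-stack approach, not Haskell's L function and not whatever
BitC ends up doing. The function name layout_tokens and the token kinds
are assumptions made for illustration, and only spaces are counted as
indentation:

```python
def layout_tokens(source):
    """Return a list of (kind, text) pairs with layout made explicit.

    INDENT and DEDENT are computed from leading spaces alone, so the
    parser never has to feed parse errors back into the tokenizer.
    """
    indents = [0]  # stack of currently open indentation widths
    out = []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines carry no layout information
        width = len(line) - len(line.lstrip(" "))
        if width > indents[-1]:
            indents.append(width)
            out.append(("INDENT", ""))
        while width < indents[-1]:
            indents.pop()
            out.append(("DEDENT", ""))
        if width != indents[-1]:
            raise SyntaxError(f"inconsistent dedent to column {width}")
        out.extend(("WORD", word) for word in stripped.split())
        out.append(("NEWLINE", ""))
    # Close any blocks still open at end of input.
    while len(indents) > 1:
        indents.pop()
        out.append(("DEDENT", ""))
    return out
```

The resulting token stream can be described by an ordinary context-free
grammar that treats INDENT and DEDENT like braces. A rule of the
parse-error(t) kind cannot be handled this way, because whether a block
closes depends on whether the parser would otherwise fail.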


Michael Adams actually makes the point very well with regard to parsing of
layout:

    The lack of a standard formalism for expressing these layout rules and of
    parser generators for such a formalism increases the complexity of writing
    parsers for these languages. Often, practical parsers for these languages
    have significant structural differences from the language specification..
    [T]he structural differences between the implementation and the
    specification make it difficult to determine if one accurately reflects the
    other.


That statement could equally well be made of most modern language grammar
specifications. C++ grammars, for example, are plagued with ambiguities and
phase boundary violations that *commonly* lead to subtle disagreements
between implementations, and it is not really possible to know from the
standard which implementation is correct.


Perhaps you have the impression that the parser generator is what has been
slowing me down. Actually, that's not true. There are a lot of other things
going on here, and I'm not getting to spend a whole lot of time on BitC. I
make progress as I can.


Jonathan

_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
