On Tue, 28 Feb 2012 08:59:21 +0100, Andrei Alexandrescu
<seewebsiteforem...@erdani.org> wrote:
I'm starting a new thread on this because I think the matter is of
strategic importance.
We all felt for a long time that there's a lot of potential in CTFE, and
potential applications have been discussed more than a few times,
ranging from parsing format strings to DSLs and parser generators.
Such feats are now approaching fruition because a number of factors
converge:
* Dmitry Olshansky's regex library (now in Phobos) generates efficient D
code straight from regexen.
* The scope and quality of CTFE has improved enormously, making more
advanced uses possible and even relatively easy (thanks Don!)
* Hisayuki Mima implemented a parser generator in only 3000 lines of
code (sadly, no comments or documentation yet :o))
* On the occasion of that announcement we also found out that Philippe
Sigaud already has a competing design and implementation of a parser
generator.
This is the kind of stuff I've had an eye on for the longest time. I'm
saying it's of strategic importance because CTFE technology, though not
new and already available with some languages, has unique powers when
combined with other features of D. With CTFE we get to do things that
are quite literally impossible to do in other languages.
We need to have an easy-to-use, complete, seamless, and efficient
lexer-parser generator combo in Phobos, pronto. The lexer itself could
use a character-level PEG or a classic automaton, and emit tokens for
consumption by a parser generator. The two should work in perfect tandem
(no need for glue code). At the end of the day, defining a complete
lexer+parser combo for a language should be just a few lines longer than
the textual representation of the grammar itself.
What do you all think? Let's get this project off the ground!
Thanks,
Andrei
I wrote a generic lexer generator some time ago.
It already led to some optimizations of O(N^2) behavior in the compiler,
because the token declarations sneak into the mangling :(.
I also finally added a workaround for a remaining CTFE bug (#6815).
https://gist.github.com/1255439 - lexer generator
https://gist.github.com/1262321 - complete and fast D lexer
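
To illustrate the general idea (this is not the code from the gists above),
here is a minimal sketch of a lexer whose token table is a list of template
arguments. The names Lexer, match and OpLexer are hypothetical. Because the
token strings are template arguments, they are compile-time constants, the
matcher can be run under CTFE, and the strings become part of the
instantiated symbol's mangled name, which is how a large token table can
end up inflating the mangling.

```d
import std.stdio;

// Hypothetical lexer: the token texts are template arguments, so they are
// compile-time constants and are encoded into the instantiated symbol names.
struct Lexer(tokens...)
{
    // Returns the index of the longest token that is a prefix of `input`,
    // or -1 if no token matches.
    static int match(string input)
    {
        int best = -1;
        size_t bestLen = 0;
        foreach (i, tok; tokens) // unrolled at compile time, once per token
        {
            if (tok.length > bestLen && input.length >= tok.length
                && input[0 .. tok.length] == tok)
            {
                best = cast(int) i;
                bestLen = tok.length;
            }
        }
        return best;
    }
}

alias OpLexer = Lexer!("+", "++", "+=", "-");

// CTFE: the compiler evaluates the same function during compilation.
static assert(OpLexer.match("++x") == 1);
static assert(OpLexer.match("+= 1") == 2);
static assert(OpLexer.match("x") == -1);

void main()
{
    writeln(OpLexer.match("-y")); // prints 3
}
```

A real generator would presumably build a decision tree or automaton from
the table rather than testing every token in turn; the linear scan is only
there to keep the sketch short.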
I've ditched an attempt to write a parser combinator. It was based on
expression templates and ended up in Spirit-style craziness.
<PERSONAL OPINION
The hassle of providing good error messages and synthesizing parse results
in a generic parser outweighs the benefit of a declarative grammar.
/PERSONAL OPINION>
A lot becomes feasible from the CTFE perspective;
apart from some bug fixes, I currently only miss exp and log.
I do not agree that it's the right moment to write a parser, though.
It hits the first of Phobos' two biggest shortcomings: the lack of a good
I/O system and the missing Allocators.
Any parser written now will either risk not playing nicely with ranges
or have to come up with its own buffering again.
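
As a rough sketch of what "playing nicely with ranges" could look like,
here is a tiny tokenizer written purely against the input range interface.
Numbers and numbers are hypothetical names, not Phobos APIs. It gets away
without a private buffer only because an integer token can be accumulated
on the fly; a token that has to keep its text (an identifier, say) would
need storage from somewhere, which is exactly where the buffering and
allocator questions come in.

```d
import std.algorithm.comparison : equal;
import std.ascii : isDigit;
import std.range.primitives;

// Hypothetical adapter: lazily yields the integers found in any input range
// of characters, consuming one element at a time with no private buffer.
struct Numbers(R) if (isInputRange!R)
{
    private R src;
    private uint current;
    private bool done;

    this(R r) { src = r; popFront(); } // prime the first token

    @property bool empty() const { return done; }
    @property uint front() const { return current; }

    void popFront()
    {
        // Skip everything that is not a digit.
        while (!src.empty && !isDigit(src.front)) src.popFront();
        if (src.empty) { done = true; return; }

        // Accumulate one run of digits into a value (no overflow check).
        current = 0;
        while (!src.empty && isDigit(src.front))
        {
            current = current * 10 + (src.front - '0');
            src.popFront();
        }
    }
}

// Convenience function so the element type of the source is inferred.
auto numbers(R)(R r) { return Numbers!R(r); }

void main()
{
    assert(numbers("12 + 7*340").equal([12u, 7u, 340u]));
}
```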