> Chances are that I will be able to get the original GLL parser generator
> from one of the algorithm's authors (Adrian Johnstone). He's really helpful
> here. From that point, I will only have to add a frontend for generating a
> concrete parser, starting with Python - it already has a fully working
> grammar. Hopefully, I will also be able to find a suitable grammar for D;
> it is always a pleasure to play with the language.
>
> Otherwise, I will continue my current effort - implementing the GLL parser
> generator in D.
I did that this summer, almost to the point where it was self-sustaining (that is, the GLL parser generator was used to generate a parser for grammar files). I chose a range interface: the input is a string, and the parse forest is output as a range of parse trees. I loved watching it generate the different possible parse trees for ambiguous grammars, and accept both left- and right-recursive grammars! GLL is a very nice algorithm.

Alas, even with some shortcuts on Kleene stars it was quite slow. I tried using threads (spawning worker threads on alternatives), but that did not change the timings much.

I had it generate a parser for JSON and compared it to the jsonx module that Sönke presented here. Bah, it was 1000 times slower (which is all relative: it still takes only a few ms to parse a big JSON file, but that is still far from the microseconds it should need). Pegged was faster than this GLL generator, but of course still far slower than jsonx. [Of course, Pegged can cheat: it can parse your file at compile time, resulting in 0-µs parsing time at runtime :-)] Also, the way I coded it, I hit CTFE limits and could not have it parse at compile time. A shame, really.

> This is one of the reasons I prefer LL/RD parsers :-) They are easy to
> follow.

I would like to try a LALR compile-time generator and compile-time parser. I'm pretty sure LALR tables could be expanded directly as D code instead of being read during parsing.

Philippe
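
P.S. To illustrate the "tables as code" idea above (a hypothetical sketch, not code from any actual generator - the grammar, states, and names are made up): instead of storing the LALR action table as data and indexing it while parsing, the generator could emit the table as an ordinary D switch over states, which the compiler can then optimize like any other code.

```d
import std.stdio;

enum Action { shift, reduce, accept, error }

// Code-expanded action function: a generator would emit one `case`
// per LALR state, rather than reading action[state][terminal] from
// a data table at parse time. Only a toy state 0 is shown here.
Action action(int state, dchar terminal)
{
    switch (state)
    {
    case 0:
        switch (terminal)
        {
        case 'a': return Action.shift;   // hypothetical entry
        case '$': return Action.accept;  // hypothetical entry
        default:  return Action.error;
        }
    // ... one case per generated LALR state ...
    default:
        return Action.error;
    }
}

void main()
{
    writeln(action(0, 'a'));
}
```

Such a function could even be produced at compile time via a string mixin, which is what would make a fully compile-time LALR parser plausible in D.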