>> I did that during this summer, almost to the point it was
>> self-sustained (that is, the GLL parser generator was used to
>> generate a parser for grammar files). I chose a range interface:
>> input is a string, the parse forest is output as a range of parse
>> trees.
>
> Nice! Is it public? Github?
No github repo. I could put it alongside Pegged, I suppose. I used git
internally, though. I was a bit disheartened by the speed I got, so I
did not publish it nor announce it here.

Note also that I saw it as an alternative engine for my own Pegged
project, so I used the same way of defining grammars (some prefixes and
suffixes for dropping nodes in the parse tree, and so on).

I can send you the code (well, the entire repo) if you wish. I haven't
touched it for the past 3 months, so I already don't remember what
state it was in :-(. Looking at the code now, it seems I'm still using
Pegged to parse the initial grammar. Bootstrapping did not go well.

Send me an email at firstname . lastname @gmail.com (philippe sigaud)

>> Alas, even with some shortcuts on Kleene stars it was quite slow. I
>> tried to use threads (spawning worker threads on alternatives), but
>> that did not change the timings much.
>
> AFAIK, multithreading is a bad idea in parsing.

I think, as with many things in CS, people developed parsing
algorithms before multicores existed. Spawning threads or fibers for
some alternatives (A | B) seems nice to me. I got interesting results
with a purely multithreaded parser that spawned children for each
choice.

> Actually, in the gll-combinators Scala project they have similar
> speed problems. I don't know if it's a fundamental algorithm problem
> or an implementation detail that leads to slowdowns.

Well, when all resulting implementations have speed problems... I
found an article by the GLL authors explaining how they coded their
algorithm for more speed, but I couldn't wrap my head around it.

Philippe
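P.S. For readers curious what "spawning workers on alternatives" looks
like in practice, here is a minimal, hypothetical sketch in Python (not
the actual Pegged/GLL code; all names are made up for illustration): a
choice combinator runs each branch of (A | B) in its own thread and
keeps every successful parse.

```python
# Hypothetical sketch: try each alternative of a choice in parallel.
# A "parser" here is a function taking the input string and returning a
# list of (tree, remaining_input) pairs -- empty list on failure.
from concurrent.futures import ThreadPoolExecutor

def alt(*parsers):
    """A choice combinator (A | B | ...) that runs every alternative
    in its own worker thread and concatenates their results."""
    def parse(text):
        with ThreadPoolExecutor(max_workers=len(parsers)) as pool:
            futures = [pool.submit(p, text) for p in parsers]
            results = []
            for f in futures:
                results.extend(f.result())  # collect all successes
            return results
    return parse

def literal(s):
    """Toy leaf parser: match the exact string s at the front."""
    def parse(text):
        if text.startswith(s):
            return [(s, text[len(s):])]
        return []
    return parse

# Two overlapping alternatives, both of which can match "abc":
ab = alt(literal("a"), literal("ab"))
print(ab("abc"))  # [('a', 'bc'), ('ab', 'c')]
```

Note that in CPython the GIL prevents these threads from running the
parsing work truly in parallel, which is consistent with threads "not
changing the timings much"; in a compiled language like D the same
structure could at least in principle use multiple cores, at the cost
of per-alternative spawning overhead.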