On Saturday, 7 July 2012 at 15:42:30 UTC, Chad J wrote:
On 07/05/2012 08:28 AM, Roman D. Boiko wrote:
Pegged may eventually become standard, if it will be performanceoptimized and a bit more customizable

Interesting, I thought you were hand-writing this stuff.

I'm a fan of pegged and made some pull requests, one of which was aimed at making it more customizable by allowing the user to define what type gets used as a parse tree node, thus allowing one to potentially use their parse tree as an AST (and maybe do a couple other things too). It's a WIP, but the proof of concept is done: DMD can handle the extra templating at compile time, so it works.

What kind of things did you want in terms of customizability?

There are many possible customization point which would make it a better fit for my project while also being useful in general.

The most critical was the one you implemented: ability to define a custom parse tree. I also needed the ability to use immutable structures (almost) everywhere, while currently they must be mutable. Also it would be great to avoid UTF conversions (instead use functions and types templated by the string type). I also wanted to add ability to reuse parts of previously generated AST which correspond to source code that has not been changed, or to identical source code parsed previously. (This would allow incremental parsing, e.g., while user is typing.) The latter would require to track source code positions separately, or even generating them on demand (this is the feature actively criticized by deadalnix in some previous discussions but central to DCT architecture). AST would only hold information about widths of its nodes.

I've also written some notes (10-15 suggestions) while studying Pegged code, which will be shared later.

However, as I found a few hours ago, Packrat parsing (typically used to handle PEG) has serious disadvantages: it complicates debugging because of frequent backtracking, it has problems with error recovery, and typically disallows to add actions with side effects (because of possibility of backtracking). These are important enough to reconsider my plans of using Pegged. I will try to analyze whether the issues are so fundamental that I (or somebody else) will have to create an ANTLR-like parser instead, or whether it is possible to introduce changes into Pegged that would fix these problems.

Reply via email to