On Saturday, 7 July 2012 at 16:27:00 UTC, Philippe Sigaud wrote:
I added dstrings because

1- at the time (a few months ago), the lists here were awash in UTF-32
discussions and I thought that'd be the way to go anyway
2- other D parsing libraries also seemed to go with UTF-32 (CTPG)
3- I wanted to be able to parse mathematical notation like nabla,
derivatives, etc., which all have UTF-32 symbols.

I propose switching the code to use a template parameter S constrained by if (isSomeString!S) everywhere. Client code would first determine the source encoding and then instantiate the parsers with the matching string type. This is not a trivial change, but I'm willing to help implement it.
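A minimal sketch of that idea, assuming a hypothetical Literal parser (the name and its parse method are made up for illustration and are not Pegged's actual API). Only isSomeString from std.traits is a real library symbol here:

```d
import std.traits : isSomeString;

// Hypothetical literal-matching parser, generic over string, wstring and dstring.
struct Literal(S) if (isSomeString!S)
{
    S expected; // the text this parser must find at the start of the input

    // Returns true and advances `input` past the match if `input` starts
    // with `expected`; otherwise leaves `input` untouched and returns false.
    bool parse(ref S input) const
    {
        if (input.length >= expected.length
            && input[0 .. expected.length] == expected)
        {
            input = input[expected.length .. $];
            return true;
        }
        return false;
    }
}
```

Client code would then pick the instantiation that matches the source encoding, for example Literal!string("∇") for UTF-8 input or Literal!dstring("∇"d) for UTF-32 input. The comparison works on code units, so the same body serves all three string types.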

Note that PEG does not require packrat parsing, even though it was developed with it in mind. I think it's a historical 'accident' that put the two together: Bryan Ford's thesis used the two together.

Note that many PEG parsers do not rely on packrat (Pegged does not).
There are a bunch of articles on Bryan Ford's website by a guy
writing a PEG parser for Java, who found that storing only the most recent rule results was enough to get a slight speed improvement, but that storing any more was detrimental to the parser's overall efficiency.
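To make the "store only the last results" idea concrete, here is a rough sketch, assuming one remembered result per rule. All names (LastResultCache, callCached, digits) are hypothetical, and this is neither the scheme from those articles nor anything taken from Pegged:

```d
// Hypothetical cache that remembers only the most recent call of one rule.
struct LastResultCache
{
    size_t pos = size_t.max; // input position of the remembered call
    bool ok;                 // whether the rule matched at that position
    size_t end;              // position reported by the rule for that call
    size_t hits;             // number of calls answered from the cache
}

alias RuleFn = bool function(string input, size_t pos, out size_t end);

// Runs `rule` at `pos`, unless the previous call was at the same position,
// in which case the remembered result is returned without re-parsing.
bool callCached(RuleFn rule, ref LastResultCache c,
                string input, size_t pos, out size_t end)
{
    if (c.pos == pos)
    {
        end = c.end;
        ++c.hits;
        return c.ok;
    }
    c.ok = rule(input, pos, end);
    c.pos = pos;
    c.end = end;
    return c.ok;
}

// Example rule: one or more ASCII digits.
bool digits(string input, size_t pos, out size_t end)
{
    end = pos;
    while (end < input.length && input[end] >= '0' && input[end] <= '9')
        ++end;
    return end > pos;
}
```

A full packrat parser would instead keep a table indexed by (rule, position) for the whole input, which is where the extra memory and bookkeeping cost comes from.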

That's great! Anyway, I want to understand the advantages and limitations of both Pegged and ANTLR, and probably study some more techniques. Such research consumes a lot of time, but it can be done incrementally alongside development.
