Hello all, I'd like to announce the release of 'pe', a new PEG parser for Python. While Python has no shortage of PEG parsers already, I think pe has enough interesting features that it's worth mentioning here.
https://github.com/goodmami/pe It's "new" in that I've never properly announced it anywhere, but I've been developing and using it for several years. Here's a summary: * Low learning curve for users: the API is closely modeled after Python's re module and the metasyntax is a superset of Ford's original with only a few extensions * Two options for parser engines: (1) a pure-Python memoizing recursive-descent parser and (2) a compiled Cython (currently non-memoizing) state machine parser inspired by Medeiros and Ierusalimschy, 2008 and LPeg * It's fast (by Python standards): aggressive grammar optimizations and low overhead show it significantly faster than LALR (SLY, Lark) and PEG (Parsimonious, PyParsing) parsers in several tests (todo: more tests) I achieved the speed in the compiled Cython parser rather conventionally: static typing, reducing function calls, etc., along with some basic grammar optimizations and some state machine optimizations from Medeiros and Ierusalimschy. In the pure-Python parser, I sped things up primarily by replacing large portions of the grammar with equivalent regular expressions, for which Python has a backend written in C. However, constantly switching between Python and C code adds overhead and can actually make things *slower*, so I put as much into a single regex as I can by first inlining everything in the grammar that isn't recursive and doesn't have a semantic effect on the expression (captures, etc.). In addition, to replicate PEG behavior in a regular expression, I avoided backtracking with a combination of lookaheads and backreferences (I think there's an academic citation for this technique, but I'm sorry I don't have it now). The resulting regexes (and optimized grammar) are utterly illegible, but they perform well. Finally, I'm not concerned with drastically changing the AST because I don't produce ASTs: any grammar transformation that accepts the same strings and has the same semantic effect is fair game. Thanks for reading. I welcome any comments or questions. -- -Michael Wayne Goodman
_______________________________________________ PEG mailing list PEG@lists.csail.mit.edu https://lists.csail.mit.edu/mailman/listinfo/peg