Hello all,

I'd like to announce the release of 'pe', a new PEG parser for Python.
While Python has no shortage of PEG parsers already, I think pe has enough
interesting features that it's worth mentioning here.

https://github.com/goodmami/pe

It's "new" in that I've never properly announced it anywhere, but I've been
developing and using it for several years. Here's a summary:

* Low learning curve for users: the API is closely modeled after Python's
re module and the metasyntax is a superset of Ford's original with only a
few extensions

* Two options for parser engines: (1) a pure-Python memoizing
recursive-descent parser and (2) a compiled Cython (currently
non-memoizing) state machine parser inspired by Medeiros and Ierusalimschy,
2008 and LPeg

* It's fast (by Python standards): aggressive grammar optimizations and low
overhead show it significantly faster than LALR (SLY, Lark) and PEG
(Parsimonious, PyParsing) parsers in several tests (todo: more tests)

I achieved the speed in the compiled Cython parser rather conventionally:
static typing, reducing function calls, etc., along with some basic grammar
optimizations and some state machine optimizations from Medeiros and
Ierusalimschy.

In the pure-Python parser, I sped things up primarily by replacing large
portions of the grammar with equivalent regular expressions, for which
Python has a backend written in C. However, constantly switching between
Python and C code adds overhead and can actually make things *slower*, so I
put as much into a single regex as I can by first inlining everything in
the grammar that isn't recursive and doesn't have a semantic effect on the
expression (captures, etc.). In addition, to replicate PEG behavior in a
regular expression, I avoided backtracking with a combination of lookaheads
and backreferences (I think there's an academic citation for this
technique, but I'm sorry I don't have it now). The resulting regexes (and
optimized grammar) are utterly illegible, but they perform well. Finally,
I'm not concerned with drastically changing the AST because I don't produce
ASTs: any grammar transformation that accepts the same strings and has the
same semantic effect is fair game.

Thanks for reading. I welcome any comments or questions.

-- 
-Michael Wayne Goodman
_______________________________________________
PEG mailing list
PEG@lists.csail.mit.edu
https://lists.csail.mit.edu/mailman/listinfo/peg

Reply via email to