Pegged, a Parsing Expression Grammar (PEG) generator in D

Philippe Sigaud Sat, 10 Mar 2012 15:30:06 -0800

Hello,

I created a new GitHub project, Pegged, a Parsing Expression Grammar (PEG) generator in D.


https://github.com/PhilippeSigaud/Pegged

docs: https://github.com/PhilippeSigaud/Pegged/wiki

PEG: http://en.wikipedia.org/wiki/Parsing_expression_grammar

The idea is to give the generator a PEG with the standard syntax. From this grammar definition, a set of related parsers will be created, to be used at runtime or compile time.


Usage
-----

To use Pegged, just call the `grammar` function with a PEG and mix it in. For example:



import pegged.grammar;

mixin(grammar("
    Expr     <- Factor AddExpr*
    AddExpr  <- ('+'/'-') Factor
    Factor   <- Primary MulExpr*
    MulExpr  <- ('*'/'/') Primary
    Primary  <- Parens / Number / Variable / '-' Primary

    Parens   <- '(' Expr ')'
    Number   <~ [0-9]+
    Variable <- Identifier
"));

This creates the `Expr`, `AddExpr`, `Factor` (and so on) parsers for basic arithmetic expressions with operator precedence ('*' and '/' bind stronger than '+' or '-'). `Identifier` is a pre-defined parser recognizing your basic C-style identifier. Recursive or mutually recursive rules are OK (no left recursion for now).

To use a parser, call its `.parse` method. It returns a parse tree containing the calls to the different rules:


import std.stdio; // needed for writeln below

// Parsing at compile-time:
enum parseTree1 = Expr.parse("1 + 2 - (3*x-5)*6");

pragma(msg, parseTree1.capture);
writeln(parseTree1);

// And at runtime too:
auto parseTree2 = Expr.parse(" 0 + 123 - 456 ");
assert(parseTree2.capture == ["0", "+", "123", "-", "456"]);
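
To make the "same parser at compile time and at runtime" idea concrete without relying on Pegged itself, here is a minimal hand-written sketch in plain D. It is not Pegged's implementation: `Calc` and `evaluate` are hypothetical names introduced for illustration, the `Variable` alternative is left out, and the code computes a value instead of building a parse tree. Each method mirrors one rule of the grammar above, which is also why '*' and '/' end up binding stronger than '+' and '-'.

```d
import std.ascii : isDigit;

// Hypothetical helper, not part of Pegged: a recursive-descent evaluator
// whose methods follow the rule structure of the arithmetic grammar.
struct Calc
{
    string input;
    size_t pos;

    void skipSpaces()
    {
        while (pos < input.length && input[pos] == ' ') ++pos;
    }

    // Expr <- Factor AddExpr*
    long expr()
    {
        long value = factor();
        for (;;)
        {
            skipSpaces();
            if (pos >= input.length) break;
            immutable op = input[pos];
            if (op != '+' && op != '-') break;
            ++pos;
            immutable rhs = factor();
            value = (op == '+') ? value + rhs : value - rhs;
        }
        return value;
    }

    // Factor <- Primary MulExpr*
    long factor()
    {
        long value = primary();
        for (;;)
        {
            skipSpaces();
            if (pos >= input.length) break;
            immutable op = input[pos];
            if (op != '*' && op != '/') break;
            ++pos;
            immutable rhs = primary();
            value = (op == '*') ? value * rhs : value / rhs;
        }
        return value;
    }

    // Primary <- Parens / Number / '-' Primary   (Variable omitted in this sketch)
    long primary()
    {
        skipSpaces();
        assert(pos < input.length, "unexpected end of input");
        if (input[pos] == '(')
        {
            ++pos;
            immutable value = expr();
            skipSpaces();
            assert(pos < input.length && input[pos] == ')', "expected ')'");
            ++pos;
            return value;
        }
        if (input[pos] == '-')
        {
            ++pos;
            return -primary();
        }
        assert(isDigit(input[pos]), "expected a number");
        long value = 0;
        while (pos < input.length && isDigit(input[pos]))
        {
            value = value * 10 + (input[pos] - '0');
            ++pos;
        }
        return value;
    }
}

long evaluate(string s)
{
    auto c = Calc(s, 0);
    return c.expr();
}

// Initializing an enum forces evaluation at compile time (CTFE).
enum compileTimeValue = evaluate("1 + 2 - (3*4-5)*6");
static assert(compileTimeValue == -39);

void main()
{
    // The very same function also runs at runtime.
    assert(evaluate(" 0 + 123 - 456 ") == -333);
}
```

The point of the sketch is only that ordinary D functions over strings can be run by the compiler as well as at runtime; a generator like Pegged writes this kind of code for you from the grammar text.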



Features
--------

* The complete set of PEG operators is implemented.

* Pegged can parse its input at compile time and generate a complete parse tree at compile time. In a word: compile-time string (read: D code) transformation and generation.

* You can parse at runtime also, you lucky you.

* Use a standard and readable PEG syntax as a DSL, not a bunch of templates that hide the parser in noise.

* But you can use expression templates if you want, as parsers are all available as such. Pegged is implemented as an expression template, and what's good for the library writer is sure OK for the user too.

* Some useful additional operators are there too: a way to discard matches (thus dumping them from the parse tree), to push captures on a stack, to accept matches that are equal to another match.

* Adding new parsers is easy.

* Grammars are composable: you can put different `mixin(grammar(rules));` in a module and then grammars and rules can refer to one another. That way, you can have utility grammars providing their functionalities to other grammars.

* That's why Pegged comes with some pre-defined grammars (JSON, etc).

* Grammars can be dumped in a file to create a D module.

More advanced features, outside the standard PEG perimeter, are there to bring more power to the mix:

* Parametrized rules: `List(E, Sep) <- E (Sep E)*` is possible. The previous rule defines a parametrized parser taking two other parsers (namely, `E` and `Sep`) to match a `Sep`-separated list of `E`'s.

* Named captures: any parser can be named with the `=` operator. The parse tree generated by the parser (so, also its matches) is delivered to the user in the output. Other parsers in the grammar see the named captures too.

* Semantic actions can be added to any rule in a grammar. Once a rule has matched, its associated action is called on the rule output and passed as final result to other parsers further up the grammar. Do what you want to the parse tree. If the passed actions are delegates, they can access external variables.
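
To give a feel for what a parametrized rule such as `List(E, Sep) <- E (Sep E)*` means when parsers are expressed as templates, here is a small self-contained sketch in plain D. None of these names come from Pegged: `Result`, `Lit`, `Digits`, `Seq`, `ZeroOrMore` and the `List` alias are hypothetical stand-ins written for this illustration, and they only report success and the remaining input instead of building a parse tree.

```d
// Outcome of a match attempt: success flag plus the input left to parse.
struct Result
{
    bool ok;
    string rest;
}

// Matches the literal string `s` at the start of the input.
struct Lit(string s)
{
    static Result parse(string input)
    {
        if (input.length >= s.length && input[0 .. s.length] == s)
            return Result(true, input[s.length .. $]);
        return Result(false, input);
    }
}

// One or more ASCII digits, like `[0-9]+`.
struct Digits
{
    static Result parse(string input)
    {
        size_t i = 0;
        while (i < input.length && input[i] >= '0' && input[i] <= '9') ++i;
        return Result(i > 0, input[i .. $]);
    }
}

// Sequence `A B`: if either part fails, nothing is consumed.
struct Seq(A, B)
{
    static Result parse(string input)
    {
        auto a = A.parse(input);
        if (!a.ok) return Result(false, input);
        auto b = B.parse(a.rest);
        if (!b.ok) return Result(false, input);
        return b;
    }
}

// `A*`: always succeeds; stops at the first failure or empty match.
struct ZeroOrMore(A)
{
    static Result parse(string input)
    {
        auto rest = input;
        for (;;)
        {
            auto r = A.parse(rest);
            if (!r.ok || r.rest.length == rest.length) break;
            rest = r.rest;
        }
        return Result(true, rest);
    }
}

// List(E, Sep) <- E (Sep E)*
// A parametrized rule is simply a template taking other parsers as arguments.
alias List(E, Sep) = Seq!(E, ZeroOrMore!(Seq!(Sep, E)));

// Instantiate it: a comma-separated list of numbers.
alias NumberList = List!(Digits, Lit!(","));

// The parsers are plain static functions, so they also run at compile time.
static assert(NumberList.parse("1,22,333").ok);
static assert(NumberList.parse("1,22,333").rest == "");
static assert(NumberList.parse("1,2,;x").rest == ",;x");
static assert(!NumberList.parse(",1").ok);

void main()
{
    // And at runtime.
    auto r = NumberList.parse("10,20 trailing");
    assert(r.ok && r.rest == " trailing");
}
```

Writing `List!(Digits, Lit!(","))` by hand is the "bunch of templates" style; the grammar DSL lets you state the same thing as one readable rule.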



Philippe
