On 11-03-2012 00:28, Philippe Sigaud wrote:
Hello,

I created a new Github project, Pegged, a Parsing Expression Grammar
(PEG) generator in D.

https://github.com/PhilippeSigaud/Pegged

docs: https://github.com/PhilippeSigaud/Pegged/wiki

PEG: http://en.wikipedia.org/wiki/Parsing_expression_grammar

The idea is to give the generator a PEG with the standard syntax. From
this grammar definition, a set of related parsers will be created, to be
used at runtime or compile time.

Usage
-----

To use Pegged, just call the `grammar` function with a PEG and mix it
in. For example:


import pegged.grammar;

mixin(grammar("
Expr <- Factor AddExpr*
AddExpr <- ('+'/'-') Factor
Factor <- Primary MulExpr*
MulExpr <- ('*'/'/') Primary
Primary <- Parens / Number / Variable / '-' Primary

Parens <- '(' Expr ')'
Number <~ [0-9]+
Variable <- Identifier
"));



This creates the `Expr`, `AddExpr`, `Factor` (and so on) parsers for
basic arithmetic expressions with operator precedence ('*' and '/' bind
stronger than '+' or '-'). `Identifier` is a pre-defined parser
recognizing your basic C-style identifier. Recursive or mutually
recursive rules are OK (no left recursion for now).

To use a parser, use the `.parse` method. It will return a parse tree
containing the calls to the different rules:

// Parsing at compile-time:
enum parseTree1 = Expr.parse("1 + 2 - (3*x-5)*6");

pragma(msg, parseTree1.capture);
writeln(parseTree1);

// And at runtime too:
auto parseTree2 = Expr.parse(" 0 + 123 - 456 ");
assert(parseTree2.capture == ["0", "+", "123", "-", "456"]);



Features
--------

* The complete set of PEG operators are implemented
* Pegged can parse its input at compile time and generate a complete
parse tree at compile time. In a word: compile-time string (read: D
code) transformation and generation.
* You can parse at runtime also, you lucky you.
* Use a standard and readable PEG syntax as a DSL, not a bunch of
templates that hide the parser in noise.
* But you can use expression templates if you want, as parsers are all
available as such. Pegged is implemented as an expression template, and
what's good for the library writer is sure OK for the user too.
* Some useful additional operators are there too: a way to discard
matches (thus dumping them from the parse tree), to push captures on a
stack, to accept matches that are equal to another match
* Adding new parsers is easy.
* Grammars are composable: you can put different
`mixin(grammar(rules));` in a module and then grammars and rules can
refer to one another. That way, you can have utility grammars providing
their functionalities to other grammars.
* That's why Pegged comes with some pre-defined grammars (JSON, etc).
* Grammars can be dumped in a file to create a D module.

More advanced features, outside the standard PEG perimeter are there to
bring more power in the mix:

* Parametrized rules: `List(E, Sep) <- E (Sep E)*` is possible. The
previous rule defines a parametrized parser taking two other parsers
(namely, `E` and `Sep`) to match a `Sep`-separated list of `E`'s.
* Named captures: any parser can be named with the `=` operator. The
parse tree generated by the parser (so, also its matches) is delivered
to the user in the output. Other parsers in the grammar see the named
captures too.
* Semantic actions can be added to any rule in a grammar. Once a rule
has matched, its associated action is called on the rule output and
passed as final result to other parsers further up the grammar. Do what
you want to the parse tree. If the passed actions are delegates, they
can access external variables.


Philippe


Hm, since ' is used in the grammar of Pegged, how do I express it in my grammar spec? Is there a predefined rule for it, or is \' supposed to work?

--
- Alex

Reply via email to