On Sunday, 2 August 2015 at 14:50:35 UTC, Jacob Carlborg wrote:
I'm trying to read the D grammar [1] to enhance the D TextMate bundle. If we take the add expression as an example. It's defined like this in the grammar:

AddExpression:
    MulExpression
    AddExpression + MulExpression
    AddExpression - MulExpression
    CatExpression

And like this in the grammar made by Brian [2]:

addExpression:
      mulExpression
    | addExpression ('+' | '-' | '~') mulExpression
    ;

I'm not so familiar with grammars but this looks like it's recursive. Is it possible to translate this piece of grammar to a regular expression? TextMate uses regular expressions and a couple of enhancements/extensions to define a grammar for a language.

[1] http://dlang.org/grammar.html
[2] https://rawgit.com/Hackerpilot/DGrammar/master/grammar.html

I guess you're not familiar with the theoretical aspect of "formal languages". The D grammar is a context-free grammar which cannot be reduced to a regular expression. As cym13 stated, there are some simple context-free grammars which can be rewritten as regular expressions, but the D grammar cannot be. Take a look at the Chomsky Hierarchy [1] for a better understanding.

The classic example of a context-free language is the set of balanced parenthesis, i.e. (()) is balanced and ())))) is not. This language is not regular meaning you cannot write a regular expression for it, but you can write a context-free grammar for it.

[1] https://en.wikipedia.org/wiki/Chomsky_hierarchy#The_hierarchy

Reply via email to