On Sunday, 2 August 2015 at 14:50:35 UTC, Jacob Carlborg wrote:
I'm trying to read the D grammar [1] to enhance the D TextMate
bundle. If we take the add expression as an example. It's
defined like this in the grammar:
AddExpression:
MulExpression
AddExpression + MulExpression
AddExpression - MulExpression
CatExpression
And like this in the grammar made by Brian [2]:
addExpression:
mulExpression
| addExpression ('+' | '-' | '~') mulExpression
;
I'm not so familiar with grammars but this looks like it's
recursive. Is it possible to translate this piece of grammar to
a regular expression? TextMate uses regular expressions and a
couple of enhancements/extensions to define a grammar for a
language.
[1] http://dlang.org/grammar.html
[2] https://rawgit.com/Hackerpilot/DGrammar/master/grammar.html
I guess you're not familiar with the theoretical aspect of
"formal languages". The D grammar is a context-free grammar which
cannot be reduced to a regular expression. As cym13 stated, there
are some simple context-free grammars which can be rewritten as
regular expressions, but the D grammar cannot be. Take a look at
the Chomsky Hierarchy [1] for a better understanding.
The classic example of a context-free language is the set of
balanced parenthesis, i.e. (()) is balanced and ())))) is not.
This language is not regular meaning you cannot write a regular
expression for it, but you can write a context-free grammar for
it.
[1] https://en.wikipedia.org/wiki/Chomsky_hierarchy#The_hierarchy