On 13.03.2012 23:40, Aaron Meurer wrote:
>> I see that http://en.wikipedia.org/wiki/Earley_parser lists four different
>> Python implementations, one of them just 150 lines.
>
> Just about all of them are relatively short.

Ah, good to know, I checked only the last one because it was announced as short in WP.

> I suppose it wouldn't be
> hard, then, to just implement this from scratch.

Agreed.

> Or preparse and replace ∫ with "integrate" and so on.

Not necessary. Just define ∫ as an alias of "integrate" at the semantic level. The advantage is that you avoid any questions of how this affects lexing and parsing, and it's really easy to implement.
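To illustrate what I mean by an alias at the semantic level, here is a minimal sketch. The names FUNCTIONS, apply_function and the stand-in integrate are made up for this example and are not existing SymPy API; the point is only that the parser hands over the name it saw, unchanged, and the lookup table decides what it means.

```python
def integrate(expr, var):
    """Stand-in for the real integration routine (hypothetical)."""
    return ("Integral", expr, var)


# Hypothetical semantic table: function name as written -> callable.
FUNCTIONS = {"integrate": integrate}

# The alias is one extra table entry; lexer and grammar stay untouched.
FUNCTIONS["∫"] = FUNCTIONS["integrate"]


def apply_function(name, *args):
    """Resolve a parsed function name and call it."""
    return FUNCTIONS[name](*args)


print(apply_function("∫", "x**2", "x"))
print(apply_function("integrate", "x**2", "x"))
```

Both calls go through the same callable, so nothing about tokenizing or parsing has to know that the two spellings are related.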

> Subscripts have no syntactical meaning, so those should actually just
> be considered part of the Symbol name

Agreed.

> (maybe translated from "₃" to "_3").

Just leave the symbol as it is.
Somebody might have both x₃ and x_3 in the same formula.

The general rule with syntax is: don't mess with the text unless you really need to. Rewriting it causes all kinds of problems, from subtle ambiguities that weren't in the original input to error messages that are hard to phrase in terms of what the user actually typed.
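A small sketch of the collision I have in mind. The helper translate_subscripts is hypothetical and only models the "₃" to "_3" rewrite suggested above; it is not something that exists in SymPy.

```python
import re

# Unicode subscript digits U+2080..U+2089 mapped to ASCII digits.
SUBSCRIPTS = str.maketrans("₀₁₂₃₄₅₆₇₈₉", "0123456789")


def translate_subscripts(name):
    """Hypothetical rewrite of a symbol name: 'x₃' becomes 'x_3'."""
    return re.sub(
        "[\u2080-\u2089]+",
        lambda m: "_" + m.group().translate(SUBSCRIPTS),
        name,
    )


names = ["x₃", "x_3"]  # two different symbols in the user's formula

rewritten = {translate_subscripts(n) for n in names}
verbatim = set(names)

print(len(rewritten))  # the two symbols have collapsed into one name
print(len(verbatim))   # left as typed, they stay distinct
```

Keeping the names verbatim also means any error message can quote exactly what the user wrote.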

> OK, I started https://github.com/sympy/sympy/wiki/parsing.  I included
> what I said above, and also your bit about the different parsers.
> Feel free to edit that page however you want.

Okay, done.
The relevant part is that if we want to handle ambiguous grammars, we need Earley. The only serious problem I see is that whatever people enter might turn out to be ambiguous. I guess standard expressions like a+b will work fine, but [a,b] might be interpreted as a list, a parameter list, an interval, and whatnot, so the system would report ambiguities for each and every nontrivial input. I can see various ways to deal with that kind of problem, but I guess we can wait and see; it might be less of a problem after all. If it turns out to be a showstopper, we can always remove grammar rules, trading a bit of compatibility with the syntax they were taken from for a reduction in ambiguity.
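To make the [a,b] case concrete, here is a rough sketch that counts how many distinct derivations a toy grammar gives for a token list. Note that this is not an Earley parser, just a memoized brute-force counter over token spans, and it assumes a grammar without empty rules and without cyclic unit rules. The grammar itself (Expr, List, Interval, Items) is invented for illustration.

```python
from functools import lru_cache

# Hypothetical toy grammar: nonterminal -> list of alternatives.
# Anything that is not a key is a terminal; "NAME" matches identifiers.
GRAMMAR = {
    "Expr": [("List",), ("Interval",), ("NAME",)],
    "List": [("[", "Items", "]")],
    "Interval": [("[", "Expr", ",", "Expr", "]")],
    "Items": [("Expr",), ("Expr", ",", "Items")],
}


def count_parses(tokens, start="Expr"):
    """Count derivations of the whole token list from the start symbol."""
    tokens = tuple(tokens)

    def matches(terminal, token):
        if terminal == "NAME":
            return token.isidentifier()
        return terminal == token

    @lru_cache(maxsize=None)
    def derive(symbol, i, j):
        # Number of ways `symbol` derives tokens[i:j].
        if symbol not in GRAMMAR:
            return int(j == i + 1 and matches(symbol, tokens[i]))
        return sum(derive_seq(rhs, i, j) for rhs in GRAMMAR[symbol])

    @lru_cache(maxsize=None)
    def derive_seq(rhs, i, j):
        # Number of ways the symbol sequence `rhs` derives tokens[i:j].
        if not rhs:
            return int(i == j)
        head, rest = rhs[0], rhs[1:]
        total = 0
        # Every symbol consumes at least one token (no empty rules).
        for k in range(i + 1, j - len(rest) + 1):
            left = derive(head, i, k)
            if left:
                total += left * derive_seq(rest, k, j)
        return total

    return derive(start, 0, len(tokens))


print(count_parses(["[", "a", ",", "b", "]"]))  # list reading + interval reading
print(count_parses(["a"]))
```

With this grammar the bracketed input has two readings while a bare name has one. Dropping the Interval rule would bring the first count down to one, which is the "remove grammar rules" fallback in miniature.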

Somebody needs to go out and collect the grammar rules for all the grammars that we want to parse.
