Re: [sympy] feedback for GSOC 2012 idea

Joachim Durchholz Tue, 13 Mar 2012 13:58:18 -0700

On 13.03.2012 21:17, Aaron Meurer wrote:

So would it help to start a wiki page where we list all the things we
want to support, in the order of importance?

> Here's a beginning of that list (in order):


- SymPy syntax:  This is probably obvious, but correct SymPy/Python
syntax should always be parsed exactly as it is given.  If the
heuristic parser has ambiguity problems that would prevent this, we
can just preparse with a call to sympify(), and only use heuristics if
that fails.

- Mathematica, Maple, Maxima, etc. syntax. Where they conflict, we
should pick the more popular variant, or if that's nontrivial, we can
just let it decide as a side-effect of the implementation (i.e., leave
the behavior undefined in that case).

- LaTeX.  The ability to parse LaTeX math into something that can be
computed would be very useful.  WolframAlpha has support for this.

It's almost guaranteed that combining syntaxes from different sources
gives an ambiguous grammar. The only techniques that can deal with that
would be those descended from the Earley parser.

I see that http://en.wikipedia.org/wiki/Earley_parser lists four
different Python implementations, one of them just 150 lines.
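
To give an idea of the size of the thing, here is a minimal sketch of an
Earley recognizer (it only answers yes/no, it does not build trees). The
toy grammar E -> E + E | E * E | n is my own illustration, chosen because
it is ambiguous; the names Item, GRAMMAR and recognize are hypothetical,
and the sketch assumes the grammar has no empty productions.

```python
from collections import namedtuple

# An Earley item: rule head, rule body, dot position, origin chart index.
Item = namedtuple("Item", "head body dot origin")

# Hypothetical toy grammar, deliberately ambiguous (no precedence rules).
GRAMMAR = {
    "E": [("E", "+", "E"), ("E", "*", "E"), ("n",)],
}


def recognize(tokens, grammar=GRAMMAR, start="E"):
    """Return True if tokens can be derived from start.

    Assumption: the grammar has no empty (epsilon) productions.
    """
    chart = [[] for _ in range(len(tokens) + 1)]

    def add(i, item):
        if item not in chart[i]:
            chart[i].append(item)

    for body in grammar[start]:
        add(0, Item(start, body, 0, 0))

    for i in range(len(tokens) + 1):
        j = 0
        # chart[i] grows while we walk it, hence the index-based loop.
        while j < len(chart[i]):
            item = chart[i][j]
            j += 1
            if item.dot < len(item.body):
                sym = item.body[item.dot]
                if sym in grammar:
                    # Predict: expand the nonterminal after the dot.
                    for body in grammar[sym]:
                        add(i, Item(sym, body, 0, i))
                elif i < len(tokens) and tokens[i] == sym:
                    # Scan: the terminal after the dot matches the input.
                    add(i + 1, item._replace(dot=item.dot + 1))
            else:
                # Complete: advance every item that was waiting for this head.
                for parent in chart[item.origin]:
                    if (parent.dot < len(parent.body)
                            and parent.body[parent.dot] == item.head):
                        add(i, parent._replace(dot=parent.dot + 1))

    return any(
        it.head == start and it.dot == len(it.body) and it.origin == 0
        for it in chart[len(tokens)]
    )
```

The point is that the ambiguous grammar is accepted as written; nothing
in the algorithm requires precedence or associativity to be resolved
first.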

- Text based math.  What I mean here is, it should support parsing
things as you would type them in plain text without trying to follow
any kind of set grammar.  Stuff like 3x^2 + 2, d/dx x^2.

That's really hard to do well. Most of the time, the user's guess of
what the parser will guess will be quite different from the parser's
actual guess.

- Special symbols: Support stuff like √x or ∫x^2 dx.  Allow, to some
degree, pasting in stuff from the SymPy pretty printer (in particular,
things that are not printed on more than one line, like 3⋅x₃).

That's simple. Just plop in the appropriate grammar rules. Make √ a
prefix operator, ∫...dx a "circumfix" one.
₃ would probably have to be lexed as <sub>3<endsub>, where <sub> and
<endsub> are synthetic lexer symbols.
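
A rough sketch of what that lexing step could look like. The token kinds
(NUMBER, NAME, OP, SUB, ENDSUB) and the function name lex are made up for
illustration, and treating every other non-space character as a
one-character operator is a simplifying assumption.

```python
# Map the Unicode subscript digits U+2080..U+2089 to ASCII digits.
SUBSCRIPT_DIGITS = {chr(0x2080 + d): str(d) for d in range(10)}
ASCII_DIGITS = "0123456789"


def lex(text):
    """Split text into (kind, value) tokens.

    A run of subscript digits becomes SUB, NUMBER, ENDSUB, so the grammar
    only ever sees ordinary digits between two synthetic bracket tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch in SUBSCRIPT_DIGITS:
            # Checked before anything else: str.isdigit() is also true
            # for subscript digits, so they must not reach a generic test.
            j = i
            while j < len(text) and text[j] in SUBSCRIPT_DIGITS:
                j += 1
            digits = "".join(SUBSCRIPT_DIGITS[c] for c in text[i:j])
            tokens += [("SUB", ""), ("NUMBER", digits), ("ENDSUB", "")]
            i = j
        elif ch in ASCII_DIGITS:
            j = i
            while j < len(text) and text[j] in ASCII_DIGITS:
                j += 1
            tokens.append(("NUMBER", text[i:j]))
            i = j
        elif ch.isalpha():
            j = i
            while j < len(text) and text[j].isalpha():
                j += 1
            tokens.append(("NAME", text[i:j]))
            i = j
        else:
            # Everything else (√, ∫, ⋅, +, ^, ...) is a one-character operator.
            tokens.append(("OP", ch))
            i += 1
    return tokens
```

With this, lex("3⋅x₃") yields NUMBER 3, OP ⋅, NAME x, SUB, NUMBER 3,
ENDSUB, and the grammar can treat SUB ... ENDSUB like any other pair of
brackets.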

- Text based functions:  Stuff like "integrate x^2 dx", "limit x^2
with respect to x as x approaches infinity".

- Natural language processing:  There is a vague boundary between this and the
last bullet point.  What I mean here is that it tries to guess what is
meant from a plain text description without using a set grammar.  This
could even support stuff like "the integral of x squared with respect
to x".


The same caveats as with "text-based math" apply.

Shall I start a wiki page?  I know there have been other things
discussed here, like unary minus and bra-ket, that can be problems
that are important to consider.


I see two things that need a decision:

1) Whether supporting such a wide array of syntaxes is such an important
goal.

If yes, Earley parsing it is.

If no, it would mean defining our own syntax, possibly similar to existing
symbolic math languages, but still a separate syntax.

From the user's perspective, the consequence of an Earley parser would be
an additional error mode: the input text might have multiple valid
parses.

(How to best present that to the user might be one or more GSoC projects.)
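
To make the "multiple valid parses" case concrete, here is a small
sketch that enumerates every parse of a token list under the ambiguous
rule E -> E op E | operand. This is a brute-force enumerator of my own
for illustration, not an Earley parser, and the name all_parses and the
operator set are assumptions.

```python
from functools import lru_cache


def all_parses(tokens):
    """List every parse of tokens under E -> E op E | operand.

    Each parse is returned as a fully bracketed string.
    """
    tokens = tuple(tokens)
    operators = {"+", "-", "*", "/", "^"}

    @lru_cache(maxsize=None)
    def parses(lo, hi):
        # All parses of the slice tokens[lo:hi].
        if hi - lo == 1:
            return (tokens[lo],) if tokens[lo] not in operators else ()
        results = []
        # Try every operator position as the top-level split point.
        for k in range(lo + 1, hi - 1):
            if tokens[k] in operators:
                for left in parses(lo, k):
                    for right in parses(k + 1, hi):
                        results.append(f"({left} {tokens[k]} {right})")
        return tuple(results)

    return list(parses(0, len(tokens))) if tokens else []


# Two readings of the same input, with different values:
for tree in all_parses("1 - 2 - 3".split()):
    print(tree)
```

The input "1 - 2 - 3" comes back as both (1 - (2 - 3)) and ((1 - 2) - 3).
A parser that accepts the ambiguous grammar has to report or rank these
somehow; a grammar with associativity built in never produces the first
one.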

The consequence of a non-Earley parser, regardless of technology, would
be that we'd have to drastically cut down on the allowed syntax.
Essentially, we'd have to resolve all potential syntactic ambiguities
when writing the grammar.


I think this decision does not benefit from a Wiki.

2) Which grammar rules to support.

This is a bit tedious: look up what the various syntaxes have, write it
down.


A wiki page would be useful for that.
