On Tue, Mar 13, 2012 at 1:46 PM, Joachim Durchholz <j...@durchholz.org> wrote:
> On 13.03.2012 09:22, Aaron Meurer wrote:
>
>> Thanks for the helpful advice. So which method would you recommend
>> using for this?
>
> Hard to tell right now. We need to see what grammars we want to support.
>
> So the first step would probably be to draw up all the grammar rules for
> features present (and future, as far as we can fathom them).
> Make that as complete as possible - it's sometimes the innocent rule that
> will make or break a parsing technique.

So would it help to start a wiki page where we list all the things we
want to support, in order of importance? Here's a beginning of that
list (in order):

- SymPy syntax: This is probably obvious, but correct SymPy/Python
  syntax should always be parsed exactly as it is given. If the
  heuristic parser has ambiguity problems that would prevent this, we
  can just preparse with a call to sympify(), and only use heuristics
  if that fails.

- Mathematica, Maple, Maxima, etc. syntax: Where they conflict, we
  should pick the more popular variant, or if that's nontrivial, we can
  just let the implementation decide as a side effect (i.e., leave the
  behavior undefined in that case).

- LaTeX: The ability to parse LaTeX math into something that can be
  computed would be very useful. WolframAlpha has support for this.

- Text based math: What I mean here is that it should support parsing
  things as you would type them in plain text, without trying to follow
  any kind of set grammar. Stuff like 3x^2 + 2, d/dx x^2.

- Special symbols: Support stuff like √x or ∫x^2 dx. Allow, to some
  degree, pasting in stuff from the SymPy pretty printer (in
  particular, things that are not printed on more than one line, like
  3⋅x₃).

- Text based functions: Stuff like "integrate x^2 dx", "limit x^2 with
  respect to x as x approaches infinity".

- Natural language processing: The line between this and the previous
  bullet point is blurry. What I mean here is that it tries to guess
  what is meant from a plain text description without using a set
  grammar. This could even support stuff like "the integral of x
  squared with respect to x".

I think we should support at least the first three bullet points.
Supporting the last one or two bullet points will be the most
difficult, and can be left out, at least in the initial implementation.
We should also consider whether a scheme will be extensible to such
things in the long term.

Shall I start a wiki page?
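
To make the first bullet concrete, here is a rough sketch of the
"strict parse first, heuristics only on failure" idea. It is only an
illustration under some assumptions: ast.parse stands in for sympify()
so that the snippet runs without SymPy, and strict_parse,
heuristic_rewrite and parse are hypothetical names, not existing SymPy
functions. The two rewrite rules are just examples of what a heuristic
might do.

```python
import ast
import re


def strict_parse(text):
    """Stand-in for sympify(): accept only valid Python expression syntax."""
    return ast.parse(text, mode="eval")


def heuristic_rewrite(text):
    """Hypothetical heuristic: turn plain-text math into Python syntax."""
    # Treat '^' as exponentiation, as in plain-text math.
    text = text.replace("^", "**")
    # Insert '*' between a digit and a following letter: 3x -> 3*x
    return re.sub(r"(\d)\s*([A-Za-z])", r"\1*\2", text)


def parse(text):
    """Return (expression_text, method), trying the strict parser first."""
    try:
        strict_parse(text)
        return text, "strict"
    except SyntaxError:
        rewritten = heuristic_rewrite(text)
        # Still raises SyntaxError if the heuristics could not rescue the input.
        strict_parse(rewritten)
        return rewritten, "heuristic"


if __name__ == "__main__":
    print(parse("3*x**2 + 2"))
    print(parse("3x^2 + 2"))
```

With this ordering, input that is already valid syntax never reaches
the heuristics, so they cannot change its meaning. In SymPy itself the
strict step would presumably be the sympify() call, catching
SympifyError rather than SyntaxError.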

I know other things have been discussed here, like unary minus and
bra-ket notation, that can cause problems and are important to
consider.

Aaron Meurer

>
> Unless we're extremely lucky, that initial grammar will not work (except for
> Earley, which will accept anything).
> The next round will then be to see what modifications to the grammar are
> needed to make it acceptable for the various parsing techniques.
> For the ambiguity checking, any parser generator that uses a specific
> parsing technique will do. Prefer one with a record of good error messages,
> we'll need them.
>
> Then it's time for the judgement call: Which techniques require what
> adaptations of the grammar, and which of the adaptations are acceptable and
> which aren't.
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "sympy" group.
> To post to this group, send email to sympy@googlegroups.com.
> To unsubscribe from this group, send email to
> sympy+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/sympy?hl=en.