> I've never had any serious experience with operator precedence
> parsing, but I have the intuition that this technique is going to be
> quite unwieldy if we would like to go beyond simple expressions like
> the ones you have shown.
>
> I'd advocate a more general approach to this problem, and namely
> something like LL(1), SLR or LALR(1). When writing from scratch, this
> is obviously going to require more effort than operator precedence but
> instead we will have a much wider class of languages covered.
> However, I'd recommend using a parser generator instead of writing a
> parser from scratch yourself. It will introduce some slowdown, but
> will instead make the whole thing a lot easier to write, maintain and,
> more importantly, to extend. I'm not sure about the state of parser
> generators for Python, but this page
> http://wiki.python.org/moin/LanguageParsing may provide some
> information.

Yeah, I think I should use a parser generator. But doesn't that introduce a dependency on the parser generator? Is it OK to have this dependency?

> I can see a different problem here though: choosing a parsing method
> and producing the grammar of the language (or what information the
> parser would require) may not be enough to get the desired
> functionality. What we are trying to get at should incorporate
> something similar to natural language processing. Therefore, before
> actually starting the parsing process, I think the input should be
> brought to some kind of canonical form. I am strongly inclined to
> believe that Wolfram|Alpha takes exactly this way of doing things (but
> obviously with a lot more details).
> One simple thing I can think of is detecting synonyms. For example,
> "roots x**2 == 1" and "solutions x**2 == 1" and "solve x**2 == 1" are
> one and the same thing. Therefore, it may be useful to start with
> substituting each of these synonyms with some canonical word ("solve",
> for example). It is possible to argue that this can be solved by
> introducing additional rules into the grammar. However, I am inclined
> to believe that this will make the grammar unnecessarily large and,
> even worse, slow down the parser.

I was thinking the same thing. We have to substitute the synonyms before we start parsing.

> Another simple thing is detecting spelling errors. Suppose I type
> "intgrate" or any of the other (numerous) wrong possibilities. I
> think it would be very nice to have such mistakes detected and
> corrected automatically. This http://packages.python.org/pyenchant/
> seems on topic.
>
> I'd also recommend showing the substitutions and the resulting
> canonical form. In this way the user will be able to see how their
> input was interpreted and, maybe, change their formulation to get
> nearer to what they wanted.

This will be implemented just as in Wolfram|Alpha: we will show users how their input was interpreted.

> The list of actions the preprocessor should perform can be extended
> arbitrarily, I guess. For example, it could try to fix wrongly
> balanced parentheses.
> It might also try to cope with a different
> order of keywords in the string, like in "sin(x) integral". It would
> be nice to parse single-argument functions written without parentheses
> ("sin x" instead of "sin(x)"). The preprocessor could also drop
> incomprehensible (and thus supposedly meaningless) words, like in
> "find the integral of x^2".
>
> Apparently, the elements in this list should also be given priorities,
> because some of them are essential (synonyms, for example, as I see
> it), others are less critical.

Thanks for your help. I will look into all these ideas, prioritize them, and come up with a plan to implement them.

> Sergiu
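
For reference, the preprocessing ideas quoted above (synonym substitution, spelling correction, dropping filler words, and reporting how the input was interpreted) could be sketched roughly as follows. This is only a minimal illustration under several assumptions: the names SYNONYMS, FILLER and canonicalize are hypothetical and not part of SymPy, the word lists are made up, and the standard library's difflib stands in for the pyenchant spell checker mentioned in the quoted text.

```python
import difflib
import re

# Hypothetical synonym table: every accepted keyword maps to a canonical one.
SYNONYMS = {
    "solve": "solve",
    "roots": "solve",
    "solutions": "solve",
    "integrate": "integrate",
    "integral": "integrate",
}

# Illustrative list of filler words that carry no meaning for the parser.
FILLER = {"find", "the", "of"}

WORD = re.compile(r"[A-Za-z]+")


def canonicalize(text):
    """Return (canonical_text, notes) for a raw query string.

    notes is a list of (original, replacement) pairs; replacement is None
    when the word was dropped. It can be shown to the user so they see how
    their input was interpreted.
    """
    out = []
    notes = []
    for token in text.split():
        # Anything that is not a plain word (x**2, ==, sin(x), 1) passes through.
        if not WORD.fullmatch(token):
            out.append(token)
            continue
        word = token.lower()
        if word in SYNONYMS:
            canonical = SYNONYMS[word]
            if canonical != word:
                notes.append((token, canonical))
            out.append(canonical)
        elif word in FILLER:
            notes.append((token, None))
        else:
            # Assumed spelling correction: closest known keyword, if any
            # is similar enough (cutoff chosen arbitrarily for this sketch).
            close = difflib.get_close_matches(word, list(SYNONYMS), n=1, cutoff=0.8)
            if close:
                canonical = SYNONYMS[close[0]]
                notes.append((token, canonical))
                out.append(canonical)
            else:
                out.append(token)  # probably a variable or function name
    return " ".join(out), notes
```

With these tables, canonicalize("roots x**2 == 1") returns ("solve x**2 == 1", [("roots", "solve")]), and canonicalize("find the integral of x^2") returns "integrate x^2" together with notes for the three dropped words and the one substitution. Keeping this step separate from the grammar means the parser itself only ever sees the canonical keywords.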