On Tue, Mar 13, 2012 at 10:18 PM, Joachim Durchholz <j...@durchholz.org> wrote:
> On 13.03.2012 17:07, Sergiu Ivanov wrote:
>>> A parser generator is required for LL and LR variants.
>>> The problem with these is that they usually come with their own
>>> syntax, so using them requires learning new skills.
>>
>> I think the effort required to acquire these new skills is still
>> less than that of mastering a certain parsing technique and then
>> implementing it.  (Just for reference, I remember once learning the
>> basics of JavaCC in a day.)
>
> You learnt the syntax of the tool.
> The hard part is dealing with conflicts, and to deal with conflicts,
> you need to have mastered the parsing technique, in the sense of
> "what does it actually do internally".
> At which point you're already fully equipped to roll your own parser.

Yes, you're right, I didn't initially take that into consideration.
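(Just to check that I understand what rolling one's own means in the
simplest case: a toy recursive-descent parser for sums and products
could look like the sketch below.  This is a hypothetical
illustration, not the grammar we would actually need.)

    # A toy recursive-descent parser for sums and products over
    # integers, implementing the grammar
    #
    #   expr := term ('+' term)*
    #   term := atom ('*' atom)*
    #   atom := NUMBER | '(' expr ')'

    import re

    def tokenize(text):
        return re.findall(r"\d+|[+*()]", text)

    def parse_expr(tokens):
        left = parse_term(tokens)
        while tokens and tokens[0] == "+":
            tokens.pop(0)
            left = ("+", left, parse_term(tokens))
        return left

    def parse_term(tokens):
        left = parse_atom(tokens)
        while tokens and tokens[0] == "*":
            tokens.pop(0)
            left = ("*", left, parse_atom(tokens))
        return left

    def parse_atom(tokens):
        token = tokens.pop(0)
        if token == "(":
            subtree = parse_expr(tokens)
            tokens.pop(0)  # discard the closing ")"
            return subtree
        return int(token)

    print(parse_expr(tokenize("1+2*(3+4)")))
    # -> ('+', 1, ('*', 2, ('+', 3, 4)))

The precedence of * over + is encoded in which function calls which,
and getting that right is exactly the part that requires having
mastered the technique, as you say.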
>> On the other hand, you can switch the parsing techniques much more
>> easily if you are using parser generators.
>
> No, not at all.
> All parser generators have a different input syntax, different
> mechanisms to attach semantic actions, etc.
> That's a decision that cannot be reversed easily.

I see.  I thought the differences in syntax were not that significant.

>>>> I'm not sure about the state of parser generators for Python, but
>>>> this page http://wiki.python.org/moin/LanguageParsing may provide
>>>> some information.
>>>
>>> A parser that's embedded into Python as a DSL would be an option.
>>
>> I'm not sure I can properly understand your idea; why do we need to
>> *embed* the DSL into Python?
>
> Wikipedia:
> "embedded (or internal) domain-specific languages, implemented as
> libraries which exploit the syntax of their host general purpose
> language or a subset thereof, while adding domain-specific language
> elements (data types, routines, methods, macros etc.)."
>
> That is, have a few functions, and the grammar becomes a set of
> function calls that either encode the grammar directly or generate
> the data to drive the parser engine.

Hm, I see.  I vaguely had a similar idea in mind, but didn't know the
term; thank you :-)
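If I understand the idea correctly, the grammar could then look
something like this in plain Python (a hypothetical sketch with
made-up combinators, not an existing library):

    # Each combinator returns a parser: a callable that takes
    # (tokens, pos) and returns (result, new_pos), or None on failure.

    def lit(word):
        def parse(tokens, pos):
            if pos < len(tokens) and tokens[pos] == word:
                return word, pos + 1
            return None
        return parse

    def seq(*parsers):
        def parse(tokens, pos):
            results = []
            for p in parsers:
                r = p(tokens, pos)
                if r is None:
                    return None
                value, pos = r
                results.append(value)
            return results, pos
        return parse

    def alt(*parsers):
        def parse(tokens, pos):
            for p in parsers:
                r = p(tokens, pos)
                if r is not None:
                    return r
            return None
        return parse

    # The grammar itself is then just a set of function calls:
    command = alt(
        seq(lit("integrate"), lit("f"), lit("by"), lit("x")),
        seq(lit("limit"), lit("f"), lit("when"), lit("x")),
    )

    print(command("integrate f by x".split(), 0))
    # -> (['integrate', 'f', 'by', 'x'], 4)

The grammar stays ordinary Python code, at the price of writing the
engine ourselves.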
>> and http://pypi.python.org/pypi/modgrammar
>
> - Cannot handle left recursion (BLOCKER)

Hm, thanks for pointing that out.  I tried to check that yesterday
and I still cannot find the information in their docs (at least
searching directly for "recursion" turns up nothing for me).

>>> It's easier to build alternate definitions into the semantic
>>> processing that happens after the parse.
>>
>> Hm, I think I can see your point now.  It is indeed more correct to
>> handle alternate definitions in the after-parse semantic
>> processing.  However, I'm not sure the resulting code is going to
>> be simpler.
>
> If you parse things, you need a place to link up a subtree with some
> semantics.
> Providing alternate syntaxes for the same semantics just means you
> have to assign the same semantics to some different kinds of
> subtrees.  You just reuse the existing semantic object (callable,
> class, whatever), maybe with a small adapter if the semantic object
> assumes a specific structure in the tree (most parser toolkits
> abstract away even from that).

I see; agreed.

>>> Having a larger grammar isn't so much a maintenance problem either
>>> if each rule follows a pattern from a small set of standard
>>> patterns.
>>
>> I think I don't understand this.  Could you please give more
>> details?
>
> It's no problem if you have a+b, a-b, a*b, a^b, a&&b, etc. etc.
> It *is* a problem if you have a-b, a*b, and -a, and want -a to bind
> tighter than a*b.
>
> It is not a problem to have a gazillion control structures like
> while...end, switch, etc.
> It *is* a problem if you have if...then...else... (without a closing
> endif).  Google for "dangling else".

Ah, I see what you mean now.

>>> Spell checking, again, is something that is easily done on the
>>> parse tree after it has been created.
>>
>> I'm not sure I agree.  Consider the (supposedly valid) sentences
>> "integrate sin(x) by x" and "limit sin(x) when x goes to zero".  I
>> don't think I'd recommend parsing these two sentences with one
>> (general) rule, which means that the words "integrate" and "limit"
>> actually determine which of the rules to use.  If spell checking
>> doesn't happen before lexing, the necessary difference between
>> "integrate" and "limit" may not be detected.
>
> Where does spell checking factor in here?

I wanted to say that if "integrate" and "limit" belong to different
classes of symbols, then, should the lexer encounter "intgrate", it
wouldn't be able to correctly classify it as "integrate".

>>>> The preprocessor could also drop incomprehensible (and thus
>>>> supposedly meaningless) words, like in "find the integral of
>>>> x^2".
>>>
>>> Experience with this kind of approach is that it tends to reduce
>>> predictability.  In this case, the user might consider a word
>>> meaningless, but the program has some obscure definition for it
>>> and suddenly spits out error messages that refer to stuff that the
>>> user doesn't know about.
>>
>> I don't think this is a problem because the user is not supposed to
>> purposefully input meaningless words in normal scenarios.
>
> Then I don't understand what purpose the mechanism serves.

I was thinking about the situation when the user intuitively enters
some text which includes elements of a natural language (like "find"
and "the" in "find the limit sin(x)/x when x goes to 0").  In this
case the user thinks that all words are meaningful, but, for the
application, "find" and "the" bear no meaning.

Sergiu
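P.S.  To make the preprocessing idea concrete, something like the
hypothetical sketch below is what I have in mind (the keyword list is
made up): it corrects near-misses like "intgrate" against the keyword
list before lexing, and drops the words the grammar does not know,
like "find" and "the".

    import difflib

    # Hypothetical keyword list; the real one would come from the
    # grammar.
    KEYWORDS = ["integrate", "limit", "by", "when", "goes", "to"]

    def looks_mathematical(word):
        # Single letters, numbers, and anything containing operators
        # or parentheses are passed through to the lexer untouched.
        return len(word) == 1 or not word.isalpha()

    def preprocess(text):
        result = []
        for word in text.split():
            if word in KEYWORDS or looks_mathematical(word):
                result.append(word)
                continue
            # Spell-check against the keyword list before lexing, so
            # that "intgrate" is still classified as "integrate".
            close = difflib.get_close_matches(word, KEYWORDS,
                                              n=1, cutoff=0.8)
            if close:
                result.append(close[0])
            # Otherwise drop the word as (supposedly) meaningless
            # filler, like "find" and "the".
        return result

    print(preprocess("find the limit sin(x)/x when x goes to 0"))
    # -> ['limit', 'sin(x)/x', 'when', 'x', 'goes', 'to', '0']

Whether this stays predictable enough is, of course, exactly the
concern you raise above.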