On Tue, Mar 13, 2012 at 10:18 PM, Joachim Durchholz <j...@durchholz.org> wrote:
> Am 13.03.2012 17:07, schrieb Sergiu Ivanov:
>
>>> A parser generator is required for LL and LR variants.
>>> The problem with these is that they usually come with their own syntax,
>>> so using them requires learning new skills.
>>
>>
>> I think the effort required to acquire these new skills is still less
>> than that of mastering a certain parsing technique and then
>> implementing it.  (Just for reference, I remember once learning the
>> basics of JavaCC in a day.)
>
>
> You learnt the syntax of the tool.
> The hard part is dealing with conflicts.
> And to deal with conflicts, you need to have mastered the parsing technique.
> In the sense of "what does it actually do internally".
> At which point you're already fully equipped to roll your own parser.

Yes, you're right, I didn't initially take that into consideration.

>> On the other hand, you can switch parsing techniques much more easily
>> if you are using parser generators.
>
>
> No, not at all.
> All parser generators have a different input syntax, different mechanisms to
> attach semantic actions, etc.
> That's a decision that cannot be reversed easily.

I see.  I thought the differences in syntax were not that significant.

>>>>  I'm not sure about the state of parser
>>>> generators for Python, but this page
>>>> http://wiki.python.org/moin/LanguageParsing may provide some
>>>> information.
>>>
>>>
>>>
>>> A parser that's embedded into Python as a DSL would be an option.
>>
>>
>> I'm not sure I can properly understand your idea; why do we need to
>> *embed* the DSL into Python?
>
>
> Wikipedia:
> "embedded (or internal) domain-specific languages, implemented as libraries
> which exploit the syntax of their host general purpose language or a subset
> thereof, while adding domain-specific language elements (data types,
> routines, methods, macros etc.)."
>
> That is, have a few functions, and the grammar becomes a set of function
> calls that will contain the grammar, or generate the data to drive the
> parser engine.

Hm, I see.  I vaguely had a similar idea in my mind, but didn't know
the term; thank you :-)
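
If I understand the idea correctly, a minimal sketch of such an embedded
DSL might look roughly like this (the helper names literal, sequence and
choice are made up for illustration, not taken from any concrete toolkit):

    # Toy parser combinators: the "grammar" is just Python function calls.
    def literal(text):
        """Match an exact string; return (value, rest) or None."""
        def parse(s):
            return (text, s[len(text):]) if s.startswith(text) else None
        return parse

    def sequence(*parsers):
        """Match parsers one after another, collecting their results."""
        def parse(s):
            results = []
            for p in parsers:
                r = p(s)
                if r is None:
                    return None
                value, s = r
                results.append(value)
            return results, s
        return parse

    def choice(*parsers):
        """Try parsers in order and return the first successful match."""
        def parse(s):
            for p in parsers:
                r = p(s)
                if r is not None:
                    return r
            return None
        return parse

    # The grammar is then ordinary Python code, no separate input syntax:
    greeting = sequence(choice(literal("hi"), literal("hello")), literal(" world"))
    print(greeting("hello world"))   # (['hello', ' world'], '')

The attractive part, if I got it right, is that the semantic actions then
become ordinary Python callables attached to these rules.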

>> and http://pypi.python.org/pypi/modgrammar
>
> - Cannot handle left recursion (BLOCKER)

Hm, thanks for pointing that out.  I tried to check that yesterday, and
I still cannot find that information in their docs (at least a direct
search for "recursion" turns up nothing for me).
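
For my own clarity (and as a toy illustration only, this is not
modgrammar code), my understanding of why this style of parser chokes
on left recursion:

    import re

    # A left-recursive rule such as   expr := expr "-" num | num
    # translated literally into recursive descent would call expr() again
    # before consuming any input, so it never terminates.  The usual
    # workaround is to rewrite the rule iteratively:  expr := num ("-" num)*

    def num(s):
        m = re.match(r"\d+", s)
        return (int(m.group()), s[m.end():]) if m else None

    def expr(s):
        r = num(s)
        if r is None:
            return None
        value, s = r
        while s.startswith("-"):
            r = num(s[1:])
            if r is None:
                break
            right, s = r
            value -= right          # the loop keeps left associativity
        return value, s

    print(expr("10-2-3"))   # (5, '') -- parsed as (10-2)-3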

>>> It's easier to build alternate definitions into the semantic processing
>>> that happens after the parse.
>>
>>
>> Hm, I think I can see your point now.  It does indeed seem cleaner to
>> handle alternate definitions in the semantic processing that happens
>> after the parse.  However, I'm not sure the resulting code is going to
>> be simpler.
>
>
> If you parse things, you need a place to link up a subtree with some
> semantics.
> Providing alternate syntaxes for the same semantics just means you have to
> assign the same semantics to some different kinds of subtrees. You just
> reuse the existing semantic object (callable, class, whatever). Maybe with a
> small adapter if the semantic object assumes a specific structure in the
> tree (most parser toolkits abstract away even from that).

I see; agreed.
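
To make sure I follow, something along these lines, I suppose (the node
shapes and handler names are invented just for the example):

    # One semantic object shared by several surface syntaxes.
    def make_integral(expr, var):
        return ("Integral", expr, var)

    # Imagine the parser produced differently shaped subtrees for
    #   "integrate sin(x) by x"        -> ("integrate", "sin(x)", "x")
    #   "the integral of sin(x) by x"  -> ("integral_of", "x", "sin(x)")
    # Both are routed to the same constructor; the second one goes
    # through a small adapter that merely reorders the arguments.
    HANDLERS = {
        "integrate":   lambda node: make_integral(node[1], node[2]),
        "integral_of": lambda node: make_integral(node[2], node[1]),
    }

    def to_semantics(node):
        return HANDLERS[node[0]](node)

    print(to_semantics(("integrate", "sin(x)", "x")))     # ('Integral', 'sin(x)', 'x')
    print(to_semantics(("integral_of", "x", "sin(x)")))   # ('Integral', 'sin(x)', 'x')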

>>> Having a larger grammar isn't so much a maintenance problem either if
>>> each rule follows a pattern from a small set of standard patterns.
>>
>>
>> I think I don't understand this.  Could you please give more details?
>
>
> It's no problem if you have a+b, a-b, a*b, a^b, a&&b, etc. etc.
> It *is* a problem if you have a-b, a*b, and -a, and want -a to bind tighter
> than a*b.
>
> It is not a problem to have a gazillion of control structures like
> while...end, switch, etc.
> It *is* a problem if you have if...then...else... (without a closing endif).
> Google for "dangling else".

Ah, I see what you mean now.
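
(Writing the classic example down for myself: without a closing keyword,
the same string admits two parse trees, and nothing in the grammar says
which one is intended.)

    # "if a then if b then x else y" read two ways; tuples are an ad-hoc
    # AST notation: ("if", condition, then_branch, else_branch).
    inner_else = ("if", "a", ("if", "b", "x", "y"), None)   # else binds to the inner if
    outer_else = ("if", "a", ("if", "b", "x", None), "y")   # else binds to the outer if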

>>> Spell checking, again, is something that is easily done on the parse tree
>>> after it has been created.
>>
>>
>> I'm not sure I agree.  Consider the (supposedly valid) sentences
>> "integrate sin(x) by x" and "limit sin(x) when x goes to zero".  I
>> don't think I'd recommend parsing these two sentences with one
>> (general) rule, which means that the words "integrate" and "limit"
>> actually determine which of the rules to use.  If spell checking
>> doesn't happen before lexing, the necessary difference between
>> "integrate" and "limit" may not be detected.
>
>
> Where does spell checking factor in here?

I wanted to say that if "integrate" and "limit" belong to different
classes of symbols, then, should the lexer encounter "intgrate", it
wouldn't be able to correctly classify it as "integrate".
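
One way I could imagine doing the spell check before lexing, just as a
sketch (the keyword list is invented, and Python's difflib is only one
possible tool for this):

    import difflib

    KEYWORDS = ["integrate", "limit", "differentiate", "solve"]

    def correct_keyword(word, cutoff=0.8):
        """Snap a token to the closest known keyword, or leave it alone."""
        matches = difflib.get_close_matches(word, KEYWORDS, n=1, cutoff=cutoff)
        return matches[0] if matches else word

    print(correct_keyword("intgrate"))   # 'integrate'
    print(correct_keyword("limt"))       # 'limit'
    print(correct_keyword("sin"))        # 'sin' (left unchanged)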

>>>> The preprocessor could also drop
>>>> incomprehensible (and thus supposedly meaningless) words, like in
>>>> "find the integral of x^2".
>>>
>>>
>>> Experience with this kind of approach is that it tends to reduce
>>> predictability. In this case, the user might consider a word
>>> meaningless, but the program has some obscure definition for it and
>>> suddenly spits out error messages that refer to stuff that the user
>>> doesn't know about.
>>
>>
>> I don't think this is a problem because the user is not supposed to
>> purposefully input meaningless words in normal scenarios.
>
>
> Then I don't understand what purpose the mechanism serves.

I was thinking about the situation in which the user intuitively enters
some text that includes elements of a natural language (like "find" and
"the" in "find the limit sin(x)/x when x goes to 0").  In this case the
user considers all the words meaningful, but, for the application,
"find" and "the" bear no meaning.
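
Concretely, I had something as simple as this in mind (the stop-word
list is of course just an example):

    # A sketch of the preprocessing step that drops filler words.
    STOP_WORDS = {"find", "the", "please"}

    def drop_fillers(text):
        return " ".join(w for w in text.split() if w.lower() not in STOP_WORDS)

    print(drop_fillers("find the limit sin(x)/x when x goes to 0"))
    # -> 'limit sin(x)/x when x goes to 0'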

Sergiu

-- 
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To post to this group, send email to sympy@googlegroups.com.
To unsubscribe from this group, send email to 
sympy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sympy?hl=en.

Reply via email to