[sage-devel] Re: Wolfram|Alpha appears to understand some Sage inputs

rjf Wed, 23 Feb 2011 09:37:57 -0800


On Feb 23, 9:17 am, "Dr. David Kirkby" <david.kir...@onetel.net>
wrote:
> On 02/22/11 10:57 PM, Dr. David Kirkby wrote:
>
>
>
> > On 02/22/11 03:49 PM, rjf wrote:
> >> A parser for the maxima language is not only easier to write,
> >> it is available in source form. It is also based on a well known
> >> technique which is also used by Reduce. The real difficulty is
> >> to implement a Mathematica language parser, since the language
> >> fails to fit the standard expectations for computer languages.
>
> > I know you said that, but I've heard differently from another source. See
>
> >http://groups.google.com/group/comp.compilers/msg/8c4e6ccad3c40599
>
> > The person there, who is the CTO of a company producing this
>
> >http://www.semanticdesigns.com/Products/DMS/DMSToolkit.html
>
> > which has an option for a Mathematica parser (I assume the Mathematica
> > parser costs extra too).
>
> > He says Mathematica is not a particularly difficult language to parse,
> > and a GLR parser is a bit over the top.
>
> Here you can see a Mathematica parser is listed for the DMS toolkit
>
> http://www.semanticdesigns.com/Products/FrontEnds/index.html?Home=DMS...
>
> So I don't know what to believe, Richard. You are saying the Mathematica
> language can't be parsed with a conventional parser, so you had to hand-write
> the parser for MockMMA, yet someone from a commercial company selling this DMS
> toolkit claims the language is not particularly difficult to parse, and they
> have a front end for their toolkit (a GLR parser) able to parse Mathematica.
>
> Clearly Wolfram|Alpha is a bit more clever, as it parses written English and
> tries (sometimes not very successfully) to work with that.
>
> --
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
> A: Top-posting.
> Q: What is the most annoying thing in e-mail?
>
> Dave


Here are my suggestions:

1. The guy is lying. He doesn't really have a Mathematica parser that works.
2. The company has a really neat parser-generating tool and a lot of
   engineering to go with it, and Mathematica can be easily parsed with it.
3. The company has nothing much beyond a good term project in a
   compiler-technology course (perhaps at a graduate level) plus a bunch of
   engineering and marketing.

My guess is 1 + 3.

A VERY simple example:
r[s[]]

is legal in Mathematica.
A traditional lexical analyzer, including the one apparently used by
Mathematica, typically looks for the longest string of characters that makes
a token. Hence a===b has a token ===, which is "SameQ", even though there are
also tokens = and ==. So the longest one is found, in general.

Now, in Mathematica, s[[4]] means take the 4th part of the list or
structure s. s[4] means (essentially) call the function s on the argument 4.
{Really it has to do with pattern matching too, but that's a nuance not
needed here.}

Anyway, how does one do lexical analysis or scanning on
r[s[]] ?

The correct tokenization is  r, [, s, [, ], ] ,  but the maximal-token
rule returns  r, [, s, [, ]] .
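
To make the failure concrete, here is a minimal sketch of a longest-match
("maximal munch") scanner in Python. The token list and the function name
maximal_munch are assumptions made up for this illustration; this is not
Mathematica's actual lexer, and it handles only alphanumeric names and a
handful of operators.

```python
# Hypothetical operator set for illustration only, tried longest first.
OPERATORS = sorted(["===", "==", "=", "[[", "]]", "[", "]"], key=len, reverse=True)

def maximal_munch(text):
    """Tokenize by always taking the longest operator that matches."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
            continue
        if ch.isalnum():
            # Names and numbers: consume the whole alphanumeric run.
            j = i
            while j < len(text) and text[j].isalnum():
                j += 1
            tokens.append(text[i:j])
            i = j
            continue
        for op in OPERATORS:
            if text.startswith(op, i):
                tokens.append(op)
                i += len(op)
                break
        else:
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    return tokens

print(maximal_munch("a===b"))   # ['a', '===', 'b']
print(maximal_munch("s[[4]]"))  # ['s', '[[', '4', ']]']
print(maximal_munch("r[s[]]"))  # ['r', '[', 's', '[', ']]']
```

The first two inputs come out as intended, but the last one ends in a single
]] token even though no [[ was ever opened.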

What does this mean? It means that the conventional separation of
lexical analysis and parsing cannot be maintained: the two must be
intermixed in parsing Mathematica.
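
One way to picture the intermixing is a scanner that keeps a little parser
state, namely a stack of the brackets currently open, and consults it before
deciding whether ]] is one token or two. The sketch below is only an
assumption about how this could be done (the name bracket_aware_tokens is
hypothetical); it ignores strings, comments and every operator other than
brackets, and it is not how Mathematica or MockMMA actually does it.

```python
def bracket_aware_tokens(text):
    """Tokenize names and brackets, using a stack of open brackets
    to decide whether a closing ']]' is one token or two."""
    tokens, open_brackets, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch.isalnum():
            j = i
            while j < len(text) and text[j].isalnum():
                j += 1
            tokens.append(text[i:j])
            i = j
        elif ch == "[":
            # Opening side: longest match is assumed safe here.
            tok = "[[" if text.startswith("[[", i) else "["
            open_brackets.append(tok)
            tokens.append(tok)
            i += len(tok)
        elif ch == "]":
            if not open_brackets:
                raise ValueError(f"unmatched ']' at position {i}")
            opener = open_brackets.pop()
            if opener == "[[":
                # Only a pending '[[' may be closed by ']]'.
                if not text.startswith("]]", i):
                    raise ValueError(f"expected ']]' at position {i}")
                tok = "]]"
            else:
                tok = "]"
            tokens.append(tok)
            i += len(tok)
        else:
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    return tokens

print(bracket_aware_tokens("r[s[]]"))     # ['r', '[', 's', '[', ']', ']']
print(bracket_aware_tokens("s[[4]]"))     # ['s', '[[', '4', ']]']
print(bracket_aware_tokens("s[[a[1]]]"))  # ['s', '[[', 'a', '[', '1', ']', ']]']
```

The point is that the scanner can no longer run as an independent first
pass: what a ] means depends on nesting, which is normally the parser's
business.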

I know of no other programming language that requires this.

Oh, there are also other glitches in Mathematica of this sort.

