Re: [sympy] Future plans for pattern matching

Ronan Lamy Tue, 04 May 2010 08:08:33 -0700

Le lundi 03 mai 2010 à 22:11 +0200, Sebastian a écrit :
> Thanks for all your comments, it's very helpful to get so thorough feedback.
> 
> On 05/03/2010 06:21 PM, Ronan Lamy wrote:
> > Le dimanche 02 mai 2010 à 09:46 -0700, basti a écrit :
> >   
> >> The last few days I spent quite some time on understanding the pattern
> >> matching and substitution logic in sympy and trying out ideas to
> >> improve them. Now I feel able and willing to redesign most of the
> >> stuff and will in the following give an overview about my plans.
> >>
> >>     
> > Interesting read, which means I have a lot of comments. I don't agree
> > with everything you've written, but I hope we can integrate something
> > like this in the master docs.
> >
> > Matching
> > --------
> >
> > The way you handle functions seems inconsistent, which is probably
> > caused by the pervasive confusion in the current code between functions
> > (e.g. sin) and their results (e.g. sin(x)). Specifically, here are some
> > results I'd like to see:
> > match(x, f) -> {f: x}    (because Symbols are callable)
> >   
> Yes in the moment symbols are python callables but does this really make
> them functions? What's the sense of having both if they describe
> mathematically the same? So I disagree with this.


Symbols aren't functions, they *can be* anything, including functions.
We'll probably need to develop a real notion of mathematical types at
some point to model this properly, but in any case having objects that
can be anything seems inevitable in order to parse strings into
expressions.

> > match(sin(x), f) -> None (because sin(x) is not a function)
> >   
> You are right - this makes more sense.
> > match(sin(x), f(u)) -> {f: sin, u: x}
> >   
> Of course ...
> > You don't tackle the major cause of unreliability in matching which is
> > finding all possible matches and ranking them. This is required not only
> > to get useful results (match(1+x**2, u+x**v) -> {u: x**2, v:0} is
> > correct but probably not what was expected) but also to get merely
> > correct ones when there are recursive calls to match.
> >   
> Yeah the implementation section is by far not completed. I've already
> more details written on paper which I'll add as soon as I find time.
> But to your concerns: I think it's not necessary to first find all
> possible solutions and then afterward rank them. It is possible to try
> the different combinations in such an order that as you say the
> "highest" ranked are tried first and if something matches it can be
> returned immediately.
> 
> > but also to get merely
> > correct ones when there are recursive calls to match.
> Sorry I don't understand what you mean by this. Can you maybe give an example?

I'm thinking of something like match(sin(a+b)*cos(b), sin(u+v)*cos(u)):
suppose matching goes from left to right, you get {u:a, b:v} from the
first term, plug it into the second, failure!
> 
> 
> > The Hints idea suffers from combinatorial explosion.
> Why combinatorial explosion? The implementation complexity will grow
> linearly with the amount of hints as long as the extended matching rules
> don't have to be permutated (for all cases I considered until now there
> seemed to be only one sane ordering). Or do you mean something else?

I was thinking of (identity, commutative, ...) * (Add, Mul, ...), which
is only O(m*n) so perhaps not really an explosion. The other problem is
what do you do if you want to apply a different set of hints to
different parts of the expression.

> >  You've only
> > mentioned Add, Mul and Pow, what about And, Or, Xor, Equivalent? What
> > about more complex objects like Integral or Sum? 
> >   
> There will be a fallback matching function that will be called if two
> expressions are of the same type. They match only if all it's args are
> matching. Of course And, Or, Xor are special in that sense that they are
> symmetric functions. They can be handled like Sum. Maybe there should be
> an attribute is_symmetric or something similar so that this can handled
> more general?
> > How can this be
> > extended for user-defined classes?
> I guess similarly to the printing case. But you are right this is maybe
> the strongest point of having match and subs directly implemented as
> methods all around sympy.
> > Substitution
> > ------------
> >
> > I think that structural substitution should be a separate function, not
> > just a subroutine handling a special case in subs. This would be useful
> > for common subexpression elimination, at least.
> >   
> What do you understand under structural substitution? Do you mean what I
> called atomic substitution?

I mean substitution based on structural matching, so it's similar to
atomic substitution, but without the restriction that it only applies to
atoms.
 
> > General remarks
> > ---------------
> >
> > You should take smaller steps but set more ambitious goals. Your design
> > isn't much different from the current one, so it'll suffer from the same
> > kind of problems. Try to forget the existing implementation and imagine
> > what the API should be and what results you expect.
> >   
> But I think the api is basically fine as it is now. The reason I started
> doing this is simply that I *didn't* get the results I was expecting.
> Therefore I wrote down the matching rules to clearly state what my
> expectations are and to find a common consent.
> > Don't base any work on commits that break tests. Good commits are easy
> > to understand, and therefore have a limited scope, and do not break any
> > tests. To merge the implementation of two functions, identify invariants
> > that link them, correct the situations where they break down, and then
> > factor out the duplications. Rinse and repeat as necessary. It might
> > seem more fastidious than doing the change you have in mind right away
> > and play whack-a-mole with the bugs that crop up, but it's usually
> > faster and easier that way.
> >   
> The implementation of subs and match by itself will be much easier to do
> without worrying about the old code and having to ensure after every
> commit that the old and the new algorithm interact perfectly. The
> question is if the final merging is then that much harder and I think it
> won't be. I succeeded in my "subs-first-try" branch to exactly do this
> and fixed most things in under a day. The problem was just that the
> matching algorithm was not good enough yet and I wanted to do it
> thoroughly from the beginning.

Define "succeeded", because under my definition you did not:
 tests finished: 1988 passed, 22 failed, 1 skipped, 51 expected to
fail, 
2 expected to fail but passed, 27 exceptions, in 452.17 seconds 
DO *NOT* COMMIT!


> > You haven't made the case for implementing match and subs as functions
> > living in a separate module. The problems I see with this are that this
> > is a crucial piece of functionality that, intuitively, belongs in the
> > core and that extending the system seems more difficult with functions
> > than with methods.
> >   
> It's fine to me that subs stays in Basic but I'm strongly for putting
> match into some extra module (but also in the directory sympy/core!).

OK. I thought you wanted to create a new subpackage. If match becomes a
function, this is reasonable.

> Here are my reasons:
>  
> (1) spreading complex code into many different places makes it much
> harder to understand th implementation.

OK, but spreading the implementation of a class into many different
places also makes it much harder to understand.
> 
> (2) it's clear where to put common functionality, e.g. all symmetric
> functions will use (nearly) identical algorithms.

... which could fit nicely in operations.py, next to each other.
> 
> (3) if I use hints in the implementation - and I plan to - then
> scattering them around will make it nearly impossible for the user to
> find out what hints there are and what they do. There are some possible
> solutions I can think of, but I don't like them too much.

Ordinary users don't read the source, so I don't think this matters
either way.
> 
> Maybe some other people can comment on this?

I guess this comes down to a matter of taste (grouping by class vs by
functionality) and implementation practicality...

> > Many of the existing methods are in need of a good cleanup. I think you
> > should start with this. It'll simplify further work on the subject and
> > you'll gain a better understanding of the details of the system.
> >   
> I think it's easier to rewrite the code than changing the current
> implementation.

I guess the customary answer to this is "Famous last words..."

-- 
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To post to this group, send email to sy...@googlegroups.com.
To unsubscribe from this group, send email to 
sympy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sympy?hl=en.

Re: [sympy] Future plans for pattern matching

Reply via email to