The last few days I spent quite some time on understanding the pattern
matching and substitution logic in sympy and trying out ideas to
improve them. Now I feel able and willing to redesign most of the
stuff and will in the following give an overview about my plans.

The ultimate goal is to have a matching algorithm which is mostly
described here:
http://github.com/bastikr/sympy/blob/match/doc/src/modules/matching.txt

And a substitution algorithm described here:
http://github.com/bastikr/sympy/blob/subs/doc/src/modules/substitution.txt

The best way to read these documents might be to fetch the
corresponding branches and build the docs to get nice html output.
(But since it's restructured text you can also read them directly)

Maybe a short summary of the algorithms:

 * matching algorithm:
The matching rules will be controllable by hints which can be given to
the main match function. This function will check that all input
arguments are valid and pass on to a dispatching function which will
call for different input arguments specialized functions. (E.g there
are specialized functions for generic functions, Add, Mul, Pow, ...).
These specialized functions will try to match subexpressions to
corresponding subpatterns using special rules according to the given
hints. If something is matched a dictionary with the matched wilds is
returned (it's empty if there are no wilds) or None if the expression
does not match the pattern.

* substitution algorithm
Substitution distinguishes between matching patterns which consist of
a single atomic object and complex patterns. Atomic substitution does
not care if the atom is a wild symbol or not - it stupidly just
replaces any occurrence by the substitution expression. This means
it's pretty fast but also less powerful than pattern matching. Pattern
matching tries to match subexpressions to the given pattern (using the
match algorithm) and if something matches it gets atomic substitution
rules for the wilds in the pattern. These rules are applied to the
substitution expression and the result is used as substitution for the
matched part in the expression.
E.g. this allows the following (x is a symbol, u is a Wild):

>>> subs(2+x**4, c+x**u, u*x**(u-1))
4*x**3


Now what will be the steps to achieve this aim:

(1) Implement a generic "Hint"s model which can be used to store
values of different hints and also be used to find out interactively
what hints are possible. (This is already done in the hints branch and
might also be useful for other parts in sympy)

(2) Implement the atomic substitution method described in
substitution.txt (This can either be done directly in the Basic class
or alternatively in some separate module)

(3) Complete the description of the matching algorithm in matching.txt

(4) Implement the matching algorithm described in matching.txt in a
separate module. (This relies on the Hints model and the atomic
substitution method)

(5) Implement the pattern substitution method described in
substitution.txt (Again either in a separate module together with the
atomic substitution method or directly in Basic)

(6) Write the wrapping "subs" function.

(7) Plug this new "subs" function in instead of the currently used
subs function. Some tests are expected to fail: Those which test
directly the substitution function have to be reviewed and might have
to be changed simply because other substitution behavior is expected
of the new substitution method.
Other test will fail because some algorithms will have to be adapted
to use the new substitution.

(8) Remove old substitution code.

(9) Do (7) and (8) with the "match" function.

I'm not completely sure when it is a good point to merge back into
master. Maybe only after (7) and after (9).

This is a pretty ambitious undertaking and might take some month to
accomplish. But before I start spending more time on this I want hear
the opinion of other developers. Do you think it's worth doing? And if
yes, do you have any tips, wishes or comments? Also help is of course
very much welcomed, just tell me what do you want to work on so that
we do not double the work!

Sebastian

-- 
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To post to this group, send email to sy...@googlegroups.com.
To unsubscribe from this group, send email to 
sympy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sympy?hl=en.

Reply via email to