On 04.09.2012 21:21, Aaron Meurer wrote:
On Tue, Sep 4, 2012 at 1:02 AM, Juha Jeronen <juha.jero...@jyu.fi> wrote:
On 09/03/12 19:39, Aaron Meurer wrote:
On Sep 3, 2012, at 1:37 AM, Juha Jeronen<juha.jero...@jyu.fi>  wrote:

Once these are settled, I think I have enough information to prepare a
proper patch (except for practical details, such as the preferred patch
format).

Use a pull request on Github. We'll help you if you're new to git.

Ok.

I have used git, but not Github. I suppose I'll need to create an account
and then do something to create the pull request...

In any case, I'll prepare the patch locally first. I'll ping on this list
when I have a first version ready for review.
We've got a (slightly outdated in some areas) guide at
https://github.com/sympy/sympy/wiki/Development-workflow.  GitHub also
has great guides of its own.  You basically need to create an account
on GitHub, fork SymPy, add your ssh keys to GitHub, push your work to
a branch in your fork, and press the pull request button.  The SSH key
part tends to be the trickiest, but fortunately you only have to do
that once per computer, and GitHub has a pretty good guide on how to
do it.

Ok. Thanks for the info.


I cloned and installed the latest SymPy yesterday from git. It seems quite a few places call collect(). Just by a quick look at the source, it's hard to predict whether a recursive or non-recursive call would be better at each place. I guess I'll look at each instance more closely :)

And by the way, the tests for collect() look very extensive. I'll add some of my own for the new recursive version when I'm done.


There are two more substance details, which came up when I was thinking about this yesterday: automatically reordering given syms and handling number atoms.

About the first one, consider the example

a*(d*c*b + c*b + b + 1) + d*(a*b*c + b*c + c + 1)

If we collect this by ["b","c"] (specifically in that order!), we will get

a*(b*(c*(d + 1) + 1) + 1) + d*(b*c*(a + 1) + c + 1)

leaving an extra "c" in the second parenthetical expression. Collecting by ["c","b"] similarly leaves an extra "b" in the first one.

This is an actual example of what I vaguely mentioned earlier - that there are cases in which there does not exist a global optimal ordering for the syms to produce optimal collection.

Hence, let's allow re-ordering. If we reorder the given syms list independently in each subexpression, and collect by ["b","c"] (now in automatically determined, appropriate order for each subexpression) we obtain

a*(b*(c*(d + 1) + 1) + 1) + d*(c*(b*(a + 1) + 1) + 1)

which gets rid of the extra "c". The first parenthetical expression gets collected by ["b","c"] and the second by ["c","b"].


Now, it seems to me that this feature would be useful... so I extended my recursive_collect() accordingly. The new one has a "reorder" flag which affects the operation when the syms are given. When reordering is enabled, it basically runs the automatic syms generation, and filters the list to include only those syms that were given by the user (while preserving the ordering from the automatically generated list).

What would be the sane default for this option? Intuitively, at least I would expect the non-recursive version to do *exactly as it is told* (i.e. no reordering unless explicitly requested), whereas in the recursive version, it would make more sense to have reordering enabled by default in order for collect() to "do what I mean" in cases like the above.

Is this fine?

Why I think this functionality should be included is, that by using this it is possible to produce efficient recursive collection w.r.t. the given syms only. Something like this is needed, if the user wants to (somewhat optimally) collect e.g. only in variables, ignoring constants.


Moving on to the other topic, about the number atoms, is collect() intended to ignore numbers? To me it seems it does:

import sympy as sy
sy.collect( sy.sympify("2*a + 2*b + 2*c"), sy.sympify("2") )
=> 2*a + 2*b + 2*c

Also, this test

assert collect(-x/8 + x*y, -x) == x*(y - S(1)/8)

in simplify/tests/test_simplify.py, which produces the same result as collecting w.r.t. just "x" - and indeed, according to the assert, is expected to do so - seems to suggest that collect() is meant to ignore numbers.

NumberSymbols, OTOH, seem to be handled the same way as generic Symbols:

sy.collect( sy.sympify("pi*a + pi*b + 2*pi*c"), sy.sympify("pi") )
=> pi*(a + b + 2*c)

I'm asking because my automatic syms generator also currently ignores numbers. It would be possible to handle numbers, too, but that would require complicating the ordering a bit (symbols first, numbers last). If it's not needed, I won't add the extra functionality (and the extra bugs to go with it).


Finally, reading the latest source, I noticed some things about collect():

- it descends into a top-level Mul even when not recursive. Is this behaviour desired? (I think it's ok like this. It's just a bit surprising - maybe needs to be mentioned in the docstring.)

- expr is sympified, but only after some processing (conditional on the "if evaluate:") is already run on it. This processing assumes that expr is already an object tree; hence collect() errors out on string input if evaluate=True. Doing the sympification a few lines earlier would fix this and should have no negative side effects. If that's ok, I'll change this.


 -J

--
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To post to this group, send email to sympy@googlegroups.com.
To unsubscribe from this group, send email to 
sympy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sympy?hl=en.

Reply via email to