Oh I should say I profiled the following functions:

AssocOp.__new__
Mul.flatten
simplify
powsimp
_together (nested function in `together`)

On 22.04.2013 10:27, Tom Bachmann wrote:
Hi,

I used line_profiler on the following script:

---------------------------------------------------------------
from __future__ import division
from sympy import *
from sympy.abc import s, t, x, y, z, a, b

q = symbols('q')
z = (q*(sqrt((2*q - S(7)/4)/sqrt(-2*(-(q - S(7)/8)**S(2)/8 -
S(2197)/13824)**(S(1)/3) - S(13)/12) + 2*(-(q - S(7)/8)**S(2)/8 -
S(2197)/13824)**(S(1)/3) - S(13)/6)/2 - S(1)/4) + q/4)

assert simplify(z) != 0
---------------------------------------------------------------

I attach the output. Here is my reading:

- simplify spends 70% in together and powsimp
- together spends 99% in object creation
- powsimp spends 95% in object creation
- AssocOp.__new__ spends 90% in flatten
- flatten has no obvious hotspot

What I can see is that flatten spends considerable amounts of time in
things which recursively call flatten *again* (i.e. things like a *= b).
This seems to indicate that it could probably benefit from memoization.
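For illustration, here is a toy sketch of that kind of memoization, using
functools.lru_cache on a flatten over nested tuples (a stand-in, not SymPy's
actual Mul.flatten):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def flatten(expr):
    """Recursively flatten nested tuples into a single flat tuple.

    A toy stand-in for AssocOp.flatten: because the recursive calls go
    through the cache, a repeated subexpression is only flattened once.
    """
    if not isinstance(expr, tuple):
        return (expr,)
    result = ()
    for item in expr:
        # recursive call -- served from the cache when `item` repeats
        result += flatten(item)
    return result

shared = (1, (2, 3))
print(flatten((shared, shared, 4)))  # (1, 2, 3, 1, 2, 3, 4)
```

The second occurrence of `shared` is a cache hit, which is exactly the
pattern described above: flatten calling flatten *again* on the same args.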

Best,
Tom


On 20.04.2013 19:04, Aaron Meurer wrote:
On Fri, Apr 19, 2013 at 12:41 AM, Tom Bachmann <e_mc...@web.de> wrote:
On 19.04.2013 00:36, Aaron Meurer wrote:

On Tue, Apr 16, 2013 at 2:09 PM, Tom Bachmann <e_mc...@web.de> wrote:

Hi guys,

I did some investigations on issues with disabling caching. Basically, I
ran the tests in all our 323 test files, both with and without the cache,
and timed each file separately. In each case, I used the --timeout=60
parameter (which I believe should run all tests on my machine when caching
is enabled).

First, about correctness. In 23 of the test files, the test summary
(pass/fail/skip/xfail) with and without caching was different; this is
just above 10%. Here are some (more or less arbitrary) examples:

sympy/core/tests/test_function.py - with caching an XFAIL test passes,
without it does not. It boils down to Symbol('f')(x) == Symbol('f')(x)
holding with caching but not without. In particular, in line 364 of
basic.py, we have an equality test of the form

if type(self) is not type(other):  # after sympification
    return False

[And this does not work unless we cache Symbol('f'), so that f(x) has
literally the same type, not just an equal type.]
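To make the identity issue concrete: calling an undefined function symbol
creates a class on the fly, and only a cache makes two such classes the
very same object. A toy sketch (illustrative only, not SymPy's actual
implementation):

```python
# Without a cache, every call creates a brand-new class, so instances
# fail the `type(self) is not type(other)` check even though the two
# classes "look" identical.

_class_cache = {}

def make_function(name, cached=True):
    """Create (or fetch from the cache) a class named `name`."""
    if cached and name in _class_cache:
        return _class_cache[name]
    cls = type(name, (), {})  # a freshly created class each time
    if cached:
        _class_cache[name] = cls
    return cls

f1 = make_function('f')
f2 = make_function('f')
print(f1 is f2)  # True: the cache returns the same class object

g1 = make_function('g', cached=False)
g2 = make_function('g', cached=False)
print(g1 is g2)  # False: two distinct classes with the same name
```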

sympy/external/tests/test_numpy.py - new failure without caching.
This is probably a bug in the test, which essentially asserts
Symbol('x') is Symbol('x').


Yes, this issue is probably the source of most of the caching
failures. There are also some subtleties with cmp still being used
for class comparison in the core that can give different behavior in
Python 3. If your project involves refactoring the core, you should
get rid of all this stuff (namely, core.py).


sympy/assumptions/tests/test_query.py - the tests time out without the
cache. Running with the timeout disabled shows that everything is
working; it turns out that compute_known_facts is just much slower
without caching. I tried to figure out why, but am not quite sure. I
attach some profiling output. What I can see is that, without caching,
20% of the time is spent in __new__ (my guess would be that the
aggressive forward chaining of the old assumptions system is at fault,
but I have no evidence for this). I don't see anything else obvious, so
clearly more detailed investigation is needed.


It looks like a lot of time is spent sorting the args of And and Or.
We should very carefully check the algorithms. If they don't actually
need the args to be sorted, they should just use the internal _argset.

And regardless, this really is the sort of thing that *should* be
cached. In fact, it probably should just be stored on the object
itself, like


Exactly. In any case, this is a *valid* use of caching, even in the
presence of assumptions (as long as we define the meaning of "ordered"
correctly, i.e. based on hashes, not mathematical order).

Sure. It will obviously be even faster if we don't sort the args at
all, though.
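A toy sketch of what "stored on the object itself" could look like, with a
hash-based sort computed lazily and the unsorted frozenset kept for
membership tests (illustrative class, not SymPy's actual And/Or):

```python
class Or:
    """Toy boolean Or showing args sorted once, lazily, and cached on
    the instance; algorithms that don't need an order use _argset."""

    def __init__(self, *args):
        self._argset = frozenset(args)   # order-free storage, O(1) membership
        self._sorted_args = None         # computed at most once

    @property
    def args(self):
        if self._sorted_args is None:
            # sort by hash, not by mathematical order, as suggested above
            self._sorted_args = tuple(sorted(self._argset, key=hash))
        return self._sorted_args

expr = Or('x', 'y', 'z')
print('y' in expr._argset)     # membership needs no sorting at all
print(expr.args is expr.args)  # True: sorted on first access, then reused
```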




I can't reason about these diagrams very well. What was the input?


Neither can I. They are somehow exceptionally unhelpful (I have used
them with good effect before, to identify and improve hotspots, but here
there just doesn't seem to be anything obvious. Which is also to be
expected, since obvious things would already have been fixed.)

The test case is just

q = symbols('q')
z = (q*(-sqrt(-2*(-(q - S(7)/8)**S(2)/8 - S(2197)/13824)**(S(1)/3) -
S(13)/12)/2 - sqrt((2*q - S(7)/4)/sqrt(-2*(-(q - S(7)/8)**S(2)/8 -
S(2197)/13824)**(S(1)/3) - S(13)/12) + 2*(-(q - S(7)/8)**S(2)/8 -
S(2197)/13824)**(S(1)/3) - S(13)/6)/2 - S(1)/4) + q/4 + (-sqrt(-2*(-(q
- S(7)/8)**S(2)/8))))
z.equals(0)
simplify(z)

Why do you test both equals and simplify? These are completely
different code paths.

Also, how did you make those charts, by the way? You might give
line_profiler a try as well. I've found its output can be orthogonally
useful to the output of cProfile. cProfile is good at finding functions
that serve as bottlenecks or inner loops, for which it makes sense to
micro-optimize. line_profiler won't do this if the function is called
from many places, but it does tell you about specific lines of code that
are slow.
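For reference, a minimal function-level profile with the standard
library's cProfile (on a toy workload, not the script above) might look
like:

```python
import cProfile
import io
import pstats

def inner(n):
    # a deliberately hot inner function
    return sum(i * i for i in range(n))

def outer():
    return [inner(1000) for _ in range(200)]

profiler = cProfile.Profile()
profiler.enable()
outer()
profiler.disable()

# Collect the stats into a string, sorted by cumulative time, so the
# bottleneck functions float to the top of the report.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats('cumulative').print_stats(5)
print(stream.getvalue())
```

line_profiler complements this with a per-line breakdown inside a chosen
function, which is what the attached output in this thread came from.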

Aaron Meurer



Tom

--
You received this message because you are subscribed to the Google
Groups
"sympy" group.
To unsubscribe from this group and stop receiving emails from it,
send an
email to sympy+unsubscr...@googlegroups.com.
To post to this group, send email to sympy@googlegroups.com.
Visit this group at http://groups.google.com/group/sympy?hl=en-US.
For more options, visit https://groups.google.com/groups/opt_out.




