On 19.04.2013 00:36, Aaron Meurer wrote:
On Tue, Apr 16, 2013 at 2:09 PM, Tom Bachmann <e_mc...@web.de> wrote:
Hi guys,

I did some investigations on issues with disabling caching. Basically, I ran
the tests in all our 323 test files, both with and without the cache, and
timed each file separately. In each case, I used the --timeout=60 parameter
(which I believe should run all tests on my machine, when caching is
enabled).
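
A rough sketch of how such a comparison could be scripted. This is an assumption-laden illustration, not the script that was actually used: it assumes the tests are started from a SymPy checkout via bin/test with a file path argument, and that the cache is switched off with the SYMPY_USE_CACHE environment variable. The helper names run_test_file and changed_files are made up for this example.

```python
import os
import subprocess
import sys
import time


def run_test_file(path, use_cache, timeout=60):
    """Run one test file and return (seconds, return code).

    Assumes a SymPy checkout as the working directory, so that
    bin/test exists, and that SYMPY_USE_CACHE=no disables the cache.
    """
    env = dict(os.environ, SYMPY_USE_CACHE="yes" if use_cache else "no")
    start = time.perf_counter()
    proc = subprocess.run(
        [sys.executable, "bin/test", path, "--timeout=%d" % timeout],
        env=env, capture_output=True, text=True,
    )
    return time.perf_counter() - start, proc.returncode


def changed_files(with_cache, without_cache):
    """Return the files whose recorded result differs between the runs."""
    return sorted(
        path for path in with_cache
        if with_cache[path] != without_cache.get(path)
    )
```

Collecting one result per file for each mode into two dictionaries and passing them to changed_files gives the list of files whose outcome depends on the cache.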

First about correctness. In 23 of the test files, the test summary
(pass/fail/skip/xfail) with and without caching was different; this is just
above 10%. Here are some (more or less arbitrary) examples:

sympy/core/tests/test_function.py - with caching an XFAIL test passes,
without it does not. It boils down to Symbol('f')(x) == Symbol('f')(x)
working with caching and not without. In particular, in line 364 of
basic.py, we have an equality test of the form

if type(self) is not type(other): # after sympification
     return False

[And this does not work unless we cache Symbol('f'), so that f(x) has
literally the same type, not just an equal type.]
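
The effect can be reproduced without SymPy. The following is only a sketch with made-up helper names (make_class, make_class_cached), assuming that the function class is built dynamically from a name; it is not the actual code in the core.

```python
# Hypothetical stand-in for creating a function class from a name on demand.
def make_class(name):
    return type(name, (object,), {})


_class_cache = {}


def make_class_cached(name):
    # Return the same class object every time the same name is requested.
    if name not in _class_cache:
        _class_cache[name] = make_class(name)
    return _class_cache[name]


# Without a cache, every call builds a new class object.
a, b = make_class("f")(), make_class("f")()
uncached_same_type = type(a) is type(b)

# With a cache, both instances share literally the same class.
c, d = make_class_cached("f")(), make_class_cached("f")()
cached_same_type = type(c) is type(d)

print(uncached_same_type)  # False
print(cached_same_type)    # True
```

An identity check on the type can therefore only succeed for dynamically created classes if something guarantees that the class is created once.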

sympy/external/tests/test_numpy.py - new failure without caching
This is probably a bug in the test, which is essentially Symbol('x') is
Symbol('x').
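
To show why a test of this shape depends on the cache, here is a toy class (Sym is a hypothetical stand-in, not SymPy's Symbol) whose constructor optionally reuses instances. Equality holds either way, identity only with the cache on.

```python
class Sym:
    """Toy symbol with an optional instance cache (illustration only)."""

    _cache = {}
    use_cache = True

    def __new__(cls, name):
        if cls.use_cache and name in cls._cache:
            return cls._cache[name]
        obj = super().__new__(cls)
        obj.name = name
        if cls.use_cache:
            cls._cache[name] = obj
        return obj

    def __eq__(self, other):
        return type(self) is type(other) and self.name == other.name

    def __hash__(self):
        return hash((type(self).__name__, self.name))


identical_with_cache = Sym("x") is Sym("x")

Sym.use_cache = False
identical_without_cache = Sym("x") is Sym("x")
equal_without_cache = Sym("x") == Sym("x")

print(identical_with_cache)     # True
print(identical_without_cache)  # False
print(equal_without_cache)      # True
```

A test that asserts equality with == keeps passing in both modes, while one that asserts identity with `is` does not.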

Yes, this issue is probably the source of most of the caching
failures.  There are also some subtleties with cmp still being used
for class comparison in the core that can give different behavior in
Python 3. If your project involves refactoring the core, you should
get rid of all this stuff (namely, core.py).


sympy/assumptions/tests/test_query.py - without the cache, the tests time
out. Running with the timeout disabled shows that everything is working. It
turns out that compute_known_facts is just much slower without caching. I
tried to figure out why, but am not quite sure. I attach some profiling
output. What I can see is that, without caching, 20% of the time is spent in
__new__ (my guess would be that the aggressive forward chaining of the old
assumptions system is at fault, but I have no evidence for this). I don't
see anything else obvious, so clearly more detailed investigation is needed.

It looks like a lot of time is spent sorting the args of And and Or.
We should very carefully check the algorithms. If they don't actually
need the args to be sorted, they should just use the internal _argset.

And regardless, this really is the sort of thing that *should* be
cached. In fact, it probably should just be stored on the object
itself, like


Exactly. In any case, this is a *valid* use of caching, even in the presence of assumptions (as long as we define the meaning of "ordered" correctly, i.e. based on hashes, not mathematical order).
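
One way to picture storing the sorted args on the object itself is the sketch below. BoolOp is a hypothetical class, not the real And/Or, and the sort_calls counter exists only to show that the sort runs once. The order is based on hashes, so it is consistent within one process but carries no mathematical meaning.

```python
class BoolOp:
    """Toy boolean operation that sorts its args lazily and only once."""

    def __init__(self, *args):
        self._argset = frozenset(args)  # unordered, used when order is irrelevant
        self._sorted_args = None        # filled in on first access
        self.sort_calls = 0             # demonstration counter only

    @property
    def args(self):
        if self._sorted_args is None:
            self.sort_calls += 1
            # Hash-based order: arbitrary, but fixed once computed.
            self._sorted_args = tuple(sorted(self._argset, key=hash))
        return self._sorted_args


op = BoolOp("a", "b", "c")
first = op.args
second = op.args
print(op.sort_calls)  # 1
```

Because the result lives on the instance, it needs no global cache and cannot be invalidated by a change of assumptions elsewhere.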


I can't reason about these diagrams very well. What was the input?


Neither can I. They are exceptionally unhelpful somehow. (I have used them to good effect before, to identify and improve hotspots, but here there just doesn't seem to be anything obvious. That is also to be expected, since obvious things would already have been fixed.)

The test case is just

q = symbols('q')
z = (q*(-sqrt(-2*(-(q - S(7)/8)**S(2)/8 - S(2197)/13824)**(S(1)/3) -
S(13)/12)/2 - sqrt((2*q - S(7)/4)/sqrt(-2*(-(q - S(7)/8)**S(2)/8 -
S(2197)/13824)**(S(1)/3) - S(13)/12) + 2*(-(q - S(7)/8)**S(2)/8 -
S(2197)/13824)**(S(1)/3) - S(13)/6)/2 - S(1)/4) + q/4 + (-sqrt(-2*(-(q
- S(7)/8)**S(2)/8))))
z.equals(0)
simplify(z)

Tom
