[sympy] Re: Smarter cache

2015-08-21 Thread Peter Brady
As of 0.7.6 sympy uses an LRU cache by default. cachey looks interesting but I think the "nbytes" may be challenging to compute for sympy objects and operations. There was some discussion of adopting a similar caching policy a while ago: https://github.com/sympy/sympy/issues/6321 On Wednesd
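
For readers unfamiliar with the policy, here is a minimal sketch of LRU eviction using the standard library's functools.lru_cache. It only illustrates the policy and is not SymPy's own cache code; the function `square` and the list `calls` are made up for the example.

```python
from functools import lru_cache

calls = []  # records which arguments were actually computed


@lru_cache(maxsize=2)  # tiny capacity so that eviction is easy to see
def square(n):
    calls.append(n)
    return n * n


square(1)  # miss: computed and stored
square(2)  # miss: computed and stored
square(1)  # hit: 1 becomes the most recently used entry
square(3)  # miss: cache is full, so 2 (least recently used) is evicted
square(2)  # miss again, because 2 was evicted above

info = square.cache_info()
print(calls, info)
```

Note that the policy looks only at recency of use. It knows nothing about how large an entry is or how long it took to compute.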

[sympy] Re: Smarter cache

2015-08-24 Thread Denis Akhiyarov
Nbytes is very hard in Python, and getsizeof() does not work very well. People have addressed this using github.com/pympler. Not sure if anyone has tried it on sympy objects or how costly that calculation is. Cachey has a very simple nbytes calculation, mainly intended for numpy and pandas objects.
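
A small sketch of why getsizeof() is a poor fit here, assuming CPython: sys.getsizeof reports only the object itself, not what it references. The helper `deep_sizeof` below is a hypothetical, naive walker written for this example (it is not pympler's asizeof) and follows only built-in containers.

```python
import sys

payload = "x" * 1_000_000   # about 1 MB of character data
container = [payload]       # a list holding one reference to it

shallow = sys.getsizeof(container)  # size of the list object only
inner = sys.getsizeof(payload)      # size of the string itself


def deep_sizeof(obj, seen=None):
    """Naive recursive size: follows lists, tuples, sets and dicts only."""
    seen = set() if seen is None else seen
    if id(obj) in seen:  # do not count shared objects twice
        return 0
    seen.add(id(obj))
    total = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            total += deep_sizeof(key, seen) + deep_sizeof(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            total += deep_sizeof(item, seen)
    return total


print(shallow, inner, deep_sizeof(container))
```

A walker like this still misses objects that keep their data in attributes or slots, which is part of what makes a general nbytes hard to compute.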

[sympy] Re: Smarter cache

2015-08-24 Thread Denis Akhiyarov
https://gist.github.com/denfromufa/4d0e6a94f70fac155b66 On Monday, August 24, 2015 at 10:03:30 PM UTC-5, Denis Akhiyarov wrote: > > Nbytes is very hard in Python, and getsizeof() does not work very well. > People have addressed this using github.com/pympler. > Not sure if anyone tried it

[sympy] Re: Smarter cache

2015-08-24 Thread Denis Akhiyarov
It looks like pympler works pretty well on sympy symbols, here is my notebook: https://gist.github.com/denfromufa/4d0e6a94f70fac155b66 On Monday, August 24, 2015 at 10:03:30 PM UTC-5, Denis Akhiyarov wrote: > > Nbytes is very hard in Python, and getsizeof() does not work very well. > People ha

Re: [sympy] Re: Smarter cache

2015-08-25 Thread Peter Brady
Thanks for trying that out. I had never heard of pympler before. The current caching mechanism is based on hashing. By my tests, 'pympler.asizeof' is 500-1000x slower than hashing. That's a strong deficit for cachey to overcome (as far as sympy objects are concerned). In [1]: import sympy In

Re: [sympy] Re: Smarter cache

2015-08-25 Thread Denis Akhiyarov
pympler is very slow; hash is probably pure C, like fastcache. But it is understandable that it gets slow when collecting all this information in Python: asizeof(y1,stats=8) asizeof(((c/(3*a) - b**2/(9*a**2))/(sqrt((c/(3) + b**3/(27*a**3))**(1/3) - b/(3*a),), stats=8) ... 52136 bytes or

Re: [sympy] Re: Smarter cache

2015-08-25 Thread Aaron Meurer
Hashing in SymPy is done recursively (due to the nature of SymPy objects), but amounts to hashes of tuples of integers and strings, which is done in C. But it's also highly optimized: the hash is memoized and stored in __slots__. If we really cared about sizes of objects, we could probably do a si
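
The memoization idea described above can be sketched with a toy class. `Node`, its slot `_mhash` and the counter `hash_computations` are hypothetical names chosen for this illustration; this is not SymPy's Basic class, only the same pattern of computing a recursive hash once and storing it in a slot.

```python
class Node:
    """Hypothetical immutable tree node with a memoized hash."""

    __slots__ = ("name", "args", "_mhash")
    hash_computations = 0  # class-level counter, for demonstration only

    def __init__(self, name, *args):
        self.name = name
        self.args = args
        self._mhash = None  # filled in on the first hash() call

    def __hash__(self):
        if self._mhash is None:
            Node.hash_computations += 1
            # Hashing the tuple recurses into child nodes, each of which
            # memoizes its own hash as well.
            self._mhash = hash((self.name,) + self.args)
        return self._mhash

    def __eq__(self, other):
        return (
            isinstance(other, Node)
            and self.name == other.name
            and self.args == other.args
        )


x = Node("x")
y = Node("y")
expr = Node("Add", Node("Mul", x, y), x)

first = hash(expr)   # walks the tree once: Add, Mul, x, y
second = hash(expr)  # answered from the stored slot, no recomputation
print(first == second, Node.hash_computations)
```

After the first call, hashing any node in the tree is a single attribute lookup, which is the kind of speed a per-object size calculation would have to compete with.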

Re: [sympy] Re: Smarter cache

2015-08-25 Thread Peter Brady
That's a good point. I had forgotten that sympy optimized the hash implementations. On Tue, Aug 25, 2015 at 6:50 PM, Aaron Meurer wrote: > Hashing in SymPy is done recursively (due to the nature of SymPy objects), > but amounts to hashes of tuples of integers and strings, which is done in > C.

Re: [sympy] Re: Smarter cache

2015-08-26 Thread Denis Akhiyarov
what is the heuristic? number of **Basic** sympy objects? On Tuesday, August 25, 2015 at 7:50:43 PM UTC-5, Aaron Meurer wrote: > > Hashing in SymPy is done recursively (due to the nature of SymPy objects), > but amounts to hashes of tuples of integers and strings, which is done in > C. But it's

Re: [sympy] Re: Smarter cache

2015-08-26 Thread Aaron Meurer
Probably count_ops() would be a close approximation of both how expensive an object is to create and how big it is (SymPy objects really shouldn't be doing much computation at creation time). Aaron Meurer On Wed, Aug 26, 2015 at 12:51 PM, Denis Akhiyarov wrote: > what is the heuristic? number o
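
To make the heuristic concrete, here is a toy operation counter over nested tuples. It is only a stand-in for the idea: `op_count` and the tuple encoding are invented for this sketch, and sympy's count_ops() may count some expressions differently.

```python
def op_count(expr):
    """Count operator nodes in a nested-tuple expression tree.

    Leaves are plain strings or numbers. An operator node is a tuple of
    the form ("OpName", arg1, arg2, ...).
    """
    if not isinstance(expr, tuple):
        return 0  # a leaf involves no operation
    return 1 + sum(op_count(arg) for arg in expr[1:])


small = ("Add", "a", "b")                                        # a + b
large = ("Add", ("Mul", "a", ("Pow", "b", 2)), ("Mul", 3, "c"))  # a*b**2 + 3*c

print(op_count(small), op_count(large))
```

The count grows with the number of nodes in the tree, so it tracks both how much memory the expression occupies and how much work it took to build.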

Re: [sympy] Re: Smarter cache

2015-08-26 Thread Denis Akhiyarov
1. regarding count_ops, are we now jumping to computation cost? :)
2. if size of sympy objects is proportional to computation cost involving them, then cachey does not make sense for sympy at all.
3. not sure if computation cost should just be tracked using time() function?
4. i think it is po

Re: [sympy] Re: Smarter cache

2015-08-26 Thread Aaron Meurer
The whole point of the cache is to speed things up. Aaron Meurer On Wed, Aug 26, 2015 at 2:33 PM, Denis Akhiyarov wrote: > 1. regarding count_ops, are we now jumping to computation cost? :) > > 2. if size of sympy objects is proportional to computation cost involving > them, then cachey does no

Re: [sympy] Re: Smarter cache

2015-08-26 Thread Denis Akhiyarov
i agree, but cachey needs 2 main parameters as input: nbytes and computation cost. so using count_ops for both parameters is like reducing the formula down to LRU or something similar. in conclusion: * count_ops or time() can be used for computation cost. ** __sizeof__ or number of Basic sym
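
The argument can be checked with a few lines, assuming cachey ranks entries roughly by computation cost divided by nbytes (the real library also weights by recency and frequency of use). The function `score` and all numbers below are made up for the illustration.

```python
def score(cost, nbytes):
    """Value of keeping an entry: expensive to compute and small is best."""
    return cost / nbytes


# Case 1: cost (seconds) and size (bytes) are measured independently.
entries = {
    "cheap_but_big": (0.001, 1_000_000),
    "slow_but_small": (2.0, 1_000),
}
independent = {key: score(cost, size) for key, (cost, size) in entries.items()}

# Case 2: one heuristic (for example an operation count) is used for both.
op_counts = {"small_expr": 3, "medium_expr": 40, "large_expr": 500}
same_heuristic = {key: score(n, n) for key, n in op_counts.items()}

print(independent)
print(same_heuristic)
```

In the second case every entry gets the same score, so nothing but recency and frequency is left to rank entries by, which is essentially what an LRU cache already does.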

Re: [sympy] Re: Smarter cache

2015-08-26 Thread Aaron Meurer
On Wed, Aug 26, 2015 at 5:18 PM, Denis Akhiyarov wrote: > i agree, but cachey needs 2 main parameters as input: nbytes and > computation cost. > > so using count_ops for both parameters is like reducing the formula down > to LRU or something similar. > I think you're right. Cachey would not real