Hashing in SymPy is done recursively (due to the nature of SymPy objects),
but amounts to hashes of tuples of integers and strings, which is done in
C. But it's also highly optimized: the hash is memoized and stored in
__slots__.

If we really cared about sizes of objects, we could probably do a similar
thing. And it is probably sufficient to use heuristics rather than a true
sizeof.

Aaron Meurer

On Tue, Aug 25, 2015 at 6:12 PM, Denis Akhiyarov <denis.akhiya...@gmail.com>
wrote:

> pympler is very slow, hash is probably pure C, like fastcache.
>
> But it is understandable why it can get slow for collecting all this
> information in Python:
>
> asizeof(y1,stats=8)
>
> asizeof(((c/(3*a) - b**2/(9*a**2))/(sqrt((c/(3....) + b**3/(27*a**3))**(1/3) 
> - b/(3*a),), stats=8) ...
>  52136 bytes or 50.9 KiB
>      8 byte aligned
>      8 byte sizeof(void*)
>      1 object given
>    222 objects sized
>   1840 objects seen
>     24 recursion depth
>
>     15 profiles:  total (% of grand total), average, and largest flat size:  
> largest object
>     42 class sympy.core.assumptions.StdFactKB objects:  42048 or 41.1 KiB 
> (81%), 1001, 1632 or 1.6 KiB:  {'prime': False, 'infinite': False, 
> 'r....maginary': False, 'irrational': False} leng 32!
>     44 class str objects:  2632 or 2.6 KiB (5%), 59, 64:  'infinite' leng 9!
>     28 class tuple objects:  1888 or 1.8 KiB (4%), 67, 80:  (sqrt((c/(3*a) - 
> b**2/(9*a**2))**3 + (..../(2*a), b**3/(27*a**3), -b*c/(6*a**2)) leng 4
>     49 class int objects:  1848 or 1.8 KiB (4%), 37, 40:  5976377932654160047 
> leng 2!
>     12 class sympy.core.mul.Mul objects:  864 (2%), 72, 72:  -(sqrt((c/(3*a) 
> - b**2/(9*a**2))**3 + .... b*c/(6*a**2) + b**3/(27*a**3))**(1/3)
>     10 class sympy.core.power.Pow objects:  720 (1%), 72, 72:  (sqrt((c/(3*a) 
> - b**2/(9*a**2))**3 + (.... b*c/(6*a**2) + b**3/(27*a**3))**(1/3)
>      7 class sympy.core.numbers.Rational objects:  560 (1%), 80, 80:  -1/9
>      5 class sympy.core.add.Add objects:  360 (1%), 72, 72:  (c/(3*a) - 
> b**2/(9*a**2))/(sqrt((c/(3*....*2) + b**3/(27*a**3))**(1/3) - b/(3*a)
>      4 class sympy.core.numbers.Integer objects:  352 (1%), 88, 88:  -2
>      4 class pympler.asizeof._Slots objects:  336 (1%), 84, 88:  ('p', 'q', 
> '_assumptions', '_args', '_mhash') leng 4
>      4 class sympy.core.symbol.Symbol objects:  288 (1%), 72, 72:  a
>      1 class sympy.core.numbers.NegativeOne object:  88 (0%), 88, 88:  -1
>      1 class sympy.core.numbers.Half object:  80 (0%), 80, 80:  1/2
>      2 class bool objects:  56 (0%), 28, 32:  True
>      1 class NoneType object:  16 (0%), 16, 16:  None
>
>     42 static types:  basicsize, itemsize, _len_(), _refs()
>        class Exception:  88, 0, n/a, _exc_refs
>        class NoneType:  16, 0, n/a, n/a
>        class NotImplementedType:  16, 0, n/a, n/a
>        class Struct:  56, 1, _len_struct, n/a
>        class array.array:  64, 1, _len_array, n/a
>        class bool:  32, 4, n/a, n/a
>        class bytearray:  56, 1, _len_bytearray, n/a
>        class bytearray_iterator:  56, 0, _len_iter, _iter_refs
>        class callable_iterator:  56, 0, _len_iter, _iter_refs
>        class complex:  32, 0, n/a, n/a
>        class dict:  64, 24, _len_dict, _dict_refs
>        class dict_itemiterator:  80, 0, _len_iter, _iter_refs
>        class dict_keyiterator:  80, 0, _len_iter, _iter_refs
>        class dict_valueiterator:  80, 0, _len_iter, _iter_refs
>        class ellipsis:  16, 0, n/a, n/a
>        class enumerate:  72, 0, n/a, _enum_refs
>        class float:  24, 0, n/a, n/a
>        class frozenset:  224, 16, _len_set, _seq_refs
>        class getset_descriptor:  72, 0, n/a, n/a
>        class int:  24, 4, _len_int, n/a
>        class list:  64, 8, _len_list, _seq_refs
>        class list_iterator:  56, 0, _len_iter, _iter_refs
>        class list_reverseiterator:  56, 0, _len_iter, _iter_refs
>        class mappingproxy:  48, 24, _len_dict, _dict_refs
>        class member_descriptor:  72, 0, n/a, n/a
>        class module:  88, 48, _len_module, _module_refs
>        class os.stat_result:  48, 8, n/a, _stat_refs
>        class property:  80, 0, n/a, _prop_refs
>        class pympler.asizeof._Slots:  56, 8, _len_slots, n/a
>        class range:  48, 0, n/a, n/a
>        class reversed:  56, 0, n/a, _enum_refs
>        class set:  224, 16, _len_set, _seq_refs
>        class set_iterator:  72, 0, _len_iter, _iter_refs
>        class slice:  40, 8, _len_slice, n/a
>        class str:  80, 2, _len_unicode, n/a
>        class str_iterator:  56, 0, _len_iter, _iter_refs
>        class traceback:  64, 0, n/a, _tb_refs
>        class tuple:  48, 8, _len, _seq_refs
>        class tuple_iterator:  56, 0, _len_iter, _iter_refs
>        class weakproxy:  80, 0, n/a, n/a
>        class weakref:  80, 0, n/a, _weak_refs
>        class weakref.KeyedRef:  88, 0, n/a, _weak_refs
>
>      8 dynamic types:  basicsize, itemsize, _len_(), _refs()
>        class sympy.core.add.Add:  72, 0, n/a, _inst_refs
>        class sympy.core.mul.Mul:  72, 0, n/a, _inst_refs
>        class sympy.core.numbers.Half:  80, 0, n/a, _inst_refs
>        class sympy.core.numbers.Integer:  88, 0, n/a, _inst_refs
>        class sympy.core.numbers.NegativeOne:  88, 0, n/a, _inst_refs
>        class sympy.core.numbers.Rational:  80, 0, n/a, _inst_refs
>        class sympy.core.power.Pow:  72, 0, n/a, _inst_refs
>        class sympy.core.symbol.Symbol:  72, 0, n/a, _inst_refs
>
>      1 derived type:  basicsize, itemsize, _len_(), _refs()
>        class sympy.core.assumptions.StdFactKB:  64, 24, _len_dict, _dict_refs
>
>      4 dict/-like classes:
>        UserDict:  (IterableUserDict, UserDict)
>        weakref:  (WeakKeyDictionary, WeakValueDictionary)
>
>
>
> On Tuesday, August 25, 2015 at 11:38:45 AM UTC-5, Peter Brady wrote:
>
>> Thanks for trying that out.  I had never heard of pympler before.  The
>> current caching mechanism is based on hashing.  By my tests,
>> 'pympler.asizeof' is 500-1000x slower than hashing.  That's a strong
>> deficit for cachey to overcome (as far as sympy objects are concerned).
>>
>> In [1]: import sympy
>>
>> In [2]: from sympy.abc import a, b, c, d, e, x, y
>>
>> In [3]: from pympler.asizeof import asizeof
>>
>> In [4]: y=a*x**3+b*x**2+c*x+d
>>
>> In [5]: y1, y2, y3 = sympy.solve(y, x, check=False)
>>
>> In [6]: %time asizeof(y1)
>> CPU times: user 9.63 ms, sys: 0 ns, total: 9.63 ms
>> Wall time: 9.56 ms
>> Out[6]: 52608
>>
>> In [7]: %time hash(y1)
>> CPU times: user 14 µs, sys: 1 µs, total: 15 µs
>> Wall time: 19.8 µs
>> Out[7]: 5743556980832125790
>>
>> In [8]: y=a*x**4+b*x**3+c*x**2+d*x+e
>>
>> In [9]: y1,y2,y3,y4=sympy.solve(y,x,check=False)
>>
>> In [10]: %time asizeof(y4)
>> CPU times: user 16.7 ms, sys: 2.05 ms, total: 18.8 ms
>> Wall time: 18.6 ms
>> Out[10]: 85208
>>
>> In [11]: %time hash(y4)
>> CPU times: user 14 µs, sys: 1 µs, total: 15 µs
>> Wall time: 19.8 µs
>> Out[11]: 4388441583750016728
>>
>>
>> On Mon, Aug 24, 2015 at 10:49 PM, Denis Akhiyarov <denis.a...@gmail.com>
>> wrote:
>>
>>> It looks like pympler works pretty well on sympy symbols, here is my
>>> notebook:
>>>
>>> https://gist.github.com/denfromufa/4d0e6a94f70fac155b66
>>>
>>>
>>> On Monday, August 24, 2015 at 10:03:30 PM UTC-5, Denis Akhiyarov wrote:
>>>>
>>>> Nbytes is very hard in Python, and getsizeof() does not work very well.
>>>> People has addressed this using github.com/pympler.
>>>> Not sure if anyone tried it on sympy objects and how costly is that
>>>> calculation. Cachey has very simple nbytes calculation, mainly intended for
>>>> numpy and pandas objects.
>>>>
>>> --
>>> You received this message because you are subscribed to a topic in the
>>> Google Groups "sympy" group.
>>> To unsubscribe from this topic, visit
>>> https://groups.google.com/d/topic/sympy/slKi02rzXVE/unsubscribe.
>>> To unsubscribe from this group and all its topics, send an email to
>>> sympy+un...@googlegroups.com.
>>> To post to this group, send email to sy...@googlegroups.com.
>>> Visit this group at http://groups.google.com/group/sympy.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/sympy/6e7f1a64-e8b9-46b9-9dd8-28f18de3a416%40googlegroups.com
>>> <https://groups.google.com/d/msgid/sympy/6e7f1a64-e8b9-46b9-9dd8-28f18de3a416%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>> --
> You received this message because you are subscribed to the Google Groups
> "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to sympy+unsubscr...@googlegroups.com.
> To post to this group, send email to sympy@googlegroups.com.
> Visit this group at http://groups.google.com/group/sympy.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/sympy/1ad279e2-41a6-44ec-bb8b-523b59cbca3b%40googlegroups.com
> <https://groups.google.com/d/msgid/sympy/1ad279e2-41a6-44ec-bb8b-523b59cbca3b%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sympy+unsubscr...@googlegroups.com.
To post to this group, send email to sympy@googlegroups.com.
Visit this group at http://groups.google.com/group/sympy.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sympy/CAKgW%3D6KSQyW6dzgu5cwpxi3y1dM_dCVYpnQYo9oeOdGWStPL%2Bg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to