Raymond Hettinger added the comment:
> Isn't this already implemented?
No.
>>> class A:
...     pass
...
>>> d = dict.fromkeys('abcdefghi')
>>> a = A()
>>> a.__dict__.update(d)
>>> b = A()
>>> b.__dict__.update(d)
>>> import sys
>>> [sys.getsizeof(m) for m in [d, vars(a), vars(b)]]
[368, 648, 648]
>>> c = A()
>>> c.__dict__.update(d)
>>> [sys.getsizeof(m) for m in [d, vars(a), vars(b), vars(c)]]
[368, 648, 648, 648]
There is no benefit reported for key-sharing. Even if you make a thousand of
these instances, the size reported is the same. Here is the relevant code:
Py_ssize_t
_PyDict_SizeOf(PyDictObject *mp)
{
    Py_ssize_t size, usable, res;

    size = DK_SIZE(mp->ma_keys);
    usable = USABLE_FRACTION(size);

    res = _PyObject_SIZE(Py_TYPE(mp));
    if (mp->ma_values)          /* split table: per-instance values array */
        res += usable * sizeof(PyObject*);
    /* If the dictionary is split, the keys portion is accounted-for
       in the type object. */
    if (mp->ma_keys->dk_refcnt == 1)
        res += (sizeof(PyDictKeysObject)
                - Py_MEMBER_SIZE(PyDictKeysObject, dk_indices)
                + DK_IXSIZE(mp->ma_keys) * size
                + sizeof(PyDictKeyEntry) * usable);
    return res;
}
It looks like the fixed overhead is included for every instance of a split
dictionary. Instead, it might make sense to take the fixed overhead and
divide it by the number of instances sharing the keys (averaging the
overhead across the sharing instances):

    res = _PyObject_SIZE(Py_TYPE(mp)) / num_instances;

Perhaps use ceiling division, so that the per-instance figures still sum to
at least the true total:

    res = -(-_PyObject_SIZE(Py_TYPE(mp)) / num_instances);
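
For concreteness, here is a minimal sketch (untested, not a patch) of what
that adjustment might look like, assuming the number of dicts sharing the
keys can be approximated by mp->ma_keys->dk_refcnt:

    Py_ssize_t
    _PyDict_SizeOf(PyDictObject *mp)
    {
        Py_ssize_t size, usable, res, num_instances;

        size = DK_SIZE(mp->ma_keys);
        usable = USABLE_FRACTION(size);

        /* Assumption: for a split table, dk_refcnt roughly counts the
           dicts sharing the keys object; a combined table is unshared. */
        num_instances = mp->ma_values ? mp->ma_keys->dk_refcnt : 1;
        if (num_instances < 1)
            num_instances = 1;

        /* Average the fixed overhead across the sharers, rounding up. */
        res = -(-_PyObject_SIZE(Py_TYPE(mp)) / num_instances);
        if (mp->ma_values)
            res += usable * sizeof(PyObject*);
        /* If the dictionary is split, the keys portion is accounted-for
           in the type object. */
        if (mp->ma_keys->dk_refcnt == 1)
            res += (sizeof(PyDictKeysObject)
                    - Py_MEMBER_SIZE(PyDictKeysObject, dk_indices)
                    + DK_IXSIZE(mp->ma_keys) * size
                    + sizeof(PyDictKeyEntry) * usable);
        return res;
    }

Note that this only averages the fixed per-object overhead; the per-instance
values array is still charged in full to each dict, as it is today.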
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue28508>
_______________________________________