New submission from Mark Shannon <m...@hotpy.org>:
https://bugs.python.org/issue45340 and https://github.com/python/cpython/pull/28802 allowed "virtual" object dicts (see faster-cpython/ideas#72 for full details). In order for this to work, we need to keep the insertion order on the values. The initial version (https://github.com/python/cpython/pull/28802) used a 64 bit value as a vector of 16 4-bit values, which allows only 16 items per values array. Stats gathered from the standard benchmark suite and informal evidence from elsewhere suggests that this causes a significant (5% and upwards) of these dicts to be materialized due to exceeding the 16 item limit. An alternative design that would allow up to ~254 items in the values array is to make the insertion order vector an array of bytes. The capacity is 254 as we need a byte for size, and another for capacity. This will increase the size of the values a bit for sizes from 7 to 15, but save a lot of memory for sizes 17+, as keys could still be shared. Pros: No need to materialize dicts of size 16+, saving ~3/4 of the memory per dict and helping specialization. Cons: Extra memory write to store a value* 1 extra word for values of size 7 to 14, 2 extra for size 15. Some extra complexity. *In a hypothetical optimized JIT, the insertion order vector would be stored as a single write for several writes, so this would make no difference. ---------- assignee: Mark.Shannon messages: 412735 nosy: Mark.Shannon priority: normal severity: normal status: open title: Allow more than 16 items in split-keys dicts and "virtual" object dicts. versions: Python 3.11 _______________________________________ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue46675> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com