New submission from Mark Shannon <m...@hotpy.org>:

https://bugs.python.org/issue45340 and 
https://github.com/python/cpython/pull/28802 allowed "virtual" object dicts 
(see faster-cpython/ideas#72  for full details).

In order for this to work, we need to keep the insertion order on the values. 
The initial version (https://github.com/python/cpython/pull/28802) used a 64 
bit value as a vector of 16 4-bit values, which allows only 16 items per values 
array.

Stats gathered from the standard benchmark suite and informal evidence from 
elsewhere suggests that this causes a significant (5% and upwards) of these 
dicts to be materialized due to exceeding the 16 item limit.

An alternative design that would allow up to ~254 items in the values array is 
to make the insertion order vector an array of bytes. The capacity is 254 as we 
need a byte for size, and another for capacity.
This will increase the size of the values a bit for sizes from 7 to 15, but 
save a lot of memory for sizes 17+, as keys could still be shared.

Pros:
    No need to materialize dicts of size 16+, saving ~3/4 of the memory per 
dict and helping specialization.

Cons:
    Extra memory write to store a value*
    1 extra word for values of size 7 to 14, 2 extra for size 15.
    Some extra complexity.

*In a hypothetical optimized JIT, the insertion order vector would be stored as 
a single write for several writes, so this would make no difference.

----------
assignee: Mark.Shannon
messages: 412735
nosy: Mark.Shannon
priority: normal
severity: normal
status: open
title: Allow more than 16 items in split-keys dicts and "virtual" object dicts.
versions: Python 3.11

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue46675>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to