New submission from INADA Naoki <songofaca...@gmail.com>:

GROWTH_RATE is changed from (used*2) to (used*2 + dk_size/2) in #17563, at 
Python 3.4.
It was for avoid resizing dict for massive del/insert use case, by increasing 
possibility of overwriting DUMMY entry.

>From Python 3.6, there are no DUMMY entry.  While there are dummy keys, we 
>resize (repack) when entries are full.
So there are no benefit from "possibility of overwriting dummy entry".

(Of course, there are still benefit from slow repacking rate for new dict.
So I don't propose to change it back to (used*2), but (used*3).)


This GROWTH_RATE prevents dict is shrinked in insertion_resize().

For example, consider this dict:

>>> d = dict.fromkeys(range(10900))
>>> len(d)
10900
>>> sys.getsizeof(d)
295008
>>> for i in range(10900):
...     del d[i]
...
>>> len(d)
0
>>> sys.getsizeof(d)
295008

`del d[i]` doesn't shrink the dict.  This is another issue (#32623).

Current dk_size is 16384 and entries length is dk_size * 2 // 3 = 10922.
So dictresize will called when next 923 entries are added.

New dk_size is round_up_to_power_of_two(922 + 16384/2) = 16384.
So dict is not shrinked!

>>> for i in range(923):
...     d[i] = i
...
>>> sys.getsizeof(d)
295008

round_up_to_power_of_two(used + dk_size/2) means dict is shrinked only when 
used == 0.

I propose changing GROWTH_RATE again to `used*3` from `used*2 + dk_size/2`

In case of dict growing without deletion, dk_size is doubled for each resize as 
current behavior.
When there are deletion, dk_size is growing aggressively than Python 3.3 
(used*2 -> used*3).  And it allows dict shrinking after massive deletions.

>>> import sys
>>> d = dict.fromkeys(range(10900))
>>> sys.getsizeof(d)
295008
>>> for i in range(10900):
...     del d[i]
...
>>> len(d)
0
>>> sys.getsizeof(d)
295008
>>> for i in range(923):
...     d[i] = i
...
>>> sys.getsizeof(d)
36968


I want to backport this change to Python 3.7.  While it's beta3, "dict can't be 
shrinked unless empty" behavior looks resource usage bug to me.

----------
components: Interpreter Core
messages: 314806
nosy: Mark.Shannon, Yury.Selivanov, eric.snow, inada.naoki, rhettinger, vstinner
priority: normal
severity: normal
status: open
title: GROWTH_RATE prevents dict shrinking
type: resource usage
versions: Python 3.7, Python 3.8

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33205>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to