[issue32623] Resize dict on del/pop

2020-08-04 Thread STINNER Victor


Change by STINNER Victor :


--
nosy:  -vstinner

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32623] Resize dict on del/pop

2020-08-03 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
nosy: +rhettinger, tim.peters

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32623] Resize dict on del/pop

2020-08-03 Thread Guido van Rossum


Change by Guido van Rossum :


--
nosy: +gvanrossum

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32623] Resize dict on del/pop

2018-04-02 Thread INADA Naoki

Change by INADA Naoki :


--
dependencies: +GROWTH_RATE prevents dict shrinking

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32623] Resize dict on del/pop

2018-01-30 Thread Xavier G. Domingo

Change by Xavier G. Domingo :


--
nosy: +xgdomingo

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32623] Resize dict on del/pop

2018-01-24 Thread INADA Naoki

INADA Naoki  added the comment:

I think I understand #17563, and I should fix GROWTH_RATE.

Before compact-ordered dict, we can avoid resizing in
"the number of deletions is on a par with the number of insertions."
scenario, by large GROWTH_RATE.
That's because new entry can reuse dummy entries.

But in compact-ordered dict, we can't do that.
We need resizing always, and resize is much faster than legacy dict.

I think GROWTH_RATE should be ma_used*3 for now.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32623] Resize dict on del/pop

2018-01-24 Thread INADA Naoki

INADA Naoki  added the comment:

> Should we make sure that all dicts have at least MINSIZE entries?

I don't think so.  I think "allocate on first insert" is good idea for 
dict.clear()
And I think it's good idea for new dict too:
https://github.com/python/cpython/pull/1080

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32623] Resize dict on del/pop

2018-01-24 Thread STINNER Victor

STINNER Victor  added the comment:

"This will cause significant performance regression for `dict[a]=None;
del dict[a]` loop. del/pop shouldn't do clear()."

Should we make sure that all dicts have at least MINSIZE entries?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32623] Resize dict on del/pop

2018-01-24 Thread INADA Naoki

INADA Naoki  added the comment:

> * When dict size become 0, make the dict shared-empty, like dict.clear()

This will cause significant performance regression for `dict[a]=None; del 
dict[a]` loop.
del/pop shouldn't do clear().

> * When (dict size < dk_size/8), call insertion_resize()

This is bad too.
When ma_used=127 and dk_size=1024, new size will be 1024!
It's because current GROWTH_RATE is used*2 + size/2.

This GROWTH_RATE is set in issue17563.
We should understand it before changing anything.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32623] Resize dict on del/pop

2018-01-24 Thread INADA Naoki

Change by INADA Naoki :


--
pull_requests: +5147

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32623] Resize dict on del/pop

2018-01-24 Thread INADA Naoki

Change by INADA Naoki :


--
keywords: +patch
pull_requests: +5144
stage:  -> patch review

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32623] Resize dict on del/pop

2018-01-24 Thread STINNER Victor

STINNER Victor  added the comment:

This issue is a matter of CPU vs memory tradeoff. It reminds me the PEP 393: 
str type doesn't make tradeoff anymore, CPython guarantee that str always use 
the most efficient storage (least memory) for characters.

But here the tradeoff is different. A dict is mutable, whereas str is immutable 
;-)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32623] Resize dict on del/pop

2018-01-24 Thread STINNER Victor

STINNER Victor  added the comment:

I agree that an heuristic is needed to decide when a dict should be compacted.

> * When (dict size < dk_size/8), call insertion_resize()

In bpo-31179, I suggested to Yury to use 2/3 ratio... to avoid integer overflow 
:-) He first used 80%, but I dislike using the FPU in the dictobject.c. I'm not 
sure of the cost of switching from integers to floats, and more generally I 
hate rounding issues, so I prefer to use regular integers ;-)

+   (3) if 'mp' is non-compact ('del' operation does not resize dicts),
+   do fast-copy only if it has at most 1/3 non-used keys.
+
+   The last condition (3) is important to guard against a pathalogical
+   case when a large dict is almost emptied with multiple del/pop
+   operations and copied after that.  In cases like this, we defer to
+   PyDict_Merge, which produces a compacted copy.

By the way, if dict automatically compacts itself automatically, do we still 
need Yury's test "is the dict compact"?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32623] Resize dict on del/pop

2018-01-23 Thread INADA Naoki

INADA Naoki  added the comment:

For insert/pop loop, reduce table size aggressively on pop may cause
performance regression.  So reducing size should be conservative.

So my opinion is:

* When dict size become 0, make the dict shared-empty, like dict.clear()
* When (dict size < dk_size/8), call insertion_resize()

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32623] Resize dict on del/pop

2018-01-22 Thread STINNER Victor

STINNER Victor  added the comment:

Note: It was recently discussed if "del dict[key]" should keep the insertion 
order. If I understood correctly: yes, the order must be preserved on deletion.

https://mail.python.org/pipermail/python-dev/2017-November/150142.html

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32623] Resize dict on del/pop

2018-01-22 Thread Yury Selivanov

New submission from Yury Selivanov :

We should investigate whether we want dicts to compact themselves on del/pop 
operations.  Currently we have to rely on workarounds to have compactable 
dict.copy (see issue 31179) etc.

--
components: Interpreter Core
messages: 310427
nosy: inada.naoki, serhiy.storchaka, vstinner, yselivanov
priority: normal
severity: normal
status: open
title: Resize dict on del/pop
type: enhancement
versions: Python 3.8

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com