New submission from Chad Netzer <chad.net...@gmail.com>:
In Python v3.9+ there was a regression in the amount of used memory for list-literals, due to switching to using list_extend() to allocate memory for the new list to accomodate the literal elements. Example, in Python v3.8.x (and before): ``` $ python38 Python 3.8.5 (default, Sep 4 2020, 02:22:02) >>> [1].__sizeof__() 48 >>> [1,2].__sizeof__() 56 >>> [1,2,3].__sizeof__() 64 ``` whereas for v3.9 (and later): ``` $ python39 Python 3.9.2 (default, Feb 19 2021, 17:09:53) >>> [1].__sizeof__() 48 >>> [1,2].__sizeof__() 56 >>> [1,2,3].__sizeof__() 104 # a 60% increase in memory allocated ``` However, this seems like an unintented regression, and is a side-effect of the new way of building the lists from literals, using the list_extend() function (via list_resize(), which overallocates). In particular, a consequence is that making a copy of the list that's initialized from a literal can end up using less memory: ``` $ python39 Python 3.9.2 (default, Feb 19 2021, 17:09:53) >>> a = [1,2,3] >>> b = list(a) # Same behavior if list.copy() or slice copy is performed >>> a.__sizeof__() 104 >>> b.__sizeof__() 64 ``` Prior to v3.9, the byte-code for making a list from a literal had the "BUILD_LIST" opcode with an explicit length argument, allowing allocation of the exact amount of memory needed for the literal. As of v3.9, the LIST_EXTEND opcode is used, instead. I believe the simplest way of restoring the old behavior is to change list_extend() to not overallocate when the list being extended currently has 0 elements. Ie. a minimal-change patch to restore the previous behavior (though with a side-effect of removing the overallocaton of a list that is initialzed empty, and then immediately extended): diff --git a/Objects/listobject.c b/Objects/listobject.c index e7987a6d35..7820e033af 100644 --- a/Objects/listobject.c +++ b/Objects/listobject.c @@ -75,8 +75,9 @@ list_resize(PyListObject *self, Py_ssize_t newsize) if (newsize - Py_SIZE(self) > (Py_ssize_t)(new_allocated - newsize)) new_allocated = ((size_t)newsize + 3) & ~(size_t)3; - if (newsize == 0) - new_allocated = 0; + /* Don't overallocate for lists that start empty or are set to empty. */ + if (newsize == 0 || Py_SIZE(self) == 0) + new_allocated = newsize; num_allocated_bytes = new_allocated * sizeof(PyObject *); items = (PyObject **)PyMem_Realloc(self->ob_item, num_allocated_bytes); if (items == NULL) { Relevant/related bugs/PRs: # Switched to initializing list literals w/ LIST_EXTEND https://bugs.python.org/issue39320 https://github.com/python/cpython/pull/17984 # Commit where over-allocation of list literals first appeared https://bugs.python.org/issue38328 https://github.com/python/cpython/pull/17114 https://github.com/python/cpython/commit/6dd9b64770af8905bef293c81d541eaaf8d8df52 https://bugs.python.org/issue38373 https://github.com/python/cpython/pull/18952 https://github.com/python/cpython/commit/2fe815edd6778fb9deef8f8044848647659c2eb8 ---------- components: Interpreter Core messages: 389207 nosy: Chad.Netzer priority: normal severity: normal status: open title: Regression in overallocation for literal list initialization in v3.9+ type: resource usage versions: Python 3.10, Python 3.9 _______________________________________ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue43574> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com