That seems pretty clear — presumably it follows the lead of frozenset and tuple.
On Tue, Jul 21, 2020 at 21:45 Todd <toddr...@gmail.com> wrote: > What, exactly, is frozen? My understanding is that one problem with > frozen dicts in the past is deciding exactly what is mutable and what is > immutable. Can you change what object a key maps to so long as the set of > keys stay the same? Can you modify the contents of mutable object that is > a value? > > On Tue, Jul 21, 2020 at 6:30 PM Marco Sulla <marco.sulla.pyt...@gmail.com> > wrote: > >> Let me first say that the code and discussion, as per title, is also >> about possible performance improvements to the dict base type. >> >> TL;DR I implemented a frozendict using CPython 3.9 code. It seems that an >> immutable dict *could* be faster than dict in some cases. Furthermore, some >> optimization could be applied to dict too. >> >> Long explaining: >> >> Since now I have some time, I decided to experiment a little with the >> creation of an immutable dict in Python. >> >> Unluckily, I started this experiment many months ago, so the CPython code >> I used is old. Maybe some or all of my considerations are outdated. >> >> Initially, I wrote a quick and dirty implementation: >> >> https://github.com/Marco-Sulla/cpython/commit/fde4e6d236b19636063f8afedea8c50278205334 >> >> The code was very simple, and the performance was identical to dict. So >> in theory, adding a frozendict to CPython is not hard. But there's more. >> >> After the first implementation, I started to try to improve the >> performance of frozendict. The result of the improvements are here: >> >> >> https://github.com/Marco-Sulla/cpython/blob/master/frozendict/test/bench.txt >> >> For benchmarks, I used simply timeit, with autorange and repeat and, as >> suggested in the module documentation, I got the minimum of the results. >> Here is the code: >> >> >> https://github.com/Marco-Sulla/cpython/blob/master/frozendict/test/bench.py >> >> I have not tested with an optimized build, since optimization data is >> collected using the unit tests, and I didn't write tests for frozendict in >> the official CPython test framework. >> >> The tests and benchmarks were done using CPython 3.9a0. CPU and other pc >> resources were not restricted using pyperf or similar tools, to see the >> "real" speed. CPython was compiled using gcc and g++. >> >> In benchmarks, I compared methods and operators using dict and >> frozendict. The benchmark uses dicts with all integers and dicts with all >> strings. Furthermore, I tested dicts with size 8 (the most optimized size >> in CPython) and 1000 elements (maybe too much, but I wanted to see how they >> perform with a high number of elements). >> >> Every benchmark has a line in the output. The Name describes the >> benchmarked method, operator or code snippet. Size is the size of the dict, >> 8 or 1000. Keys are the keys type, str or int. Type is the dictionary type, >> dict or frozendict. >> >> In Name, the "o" represents the object itself (dict or frozendict). "d", >> in benchmark with dict, is "o"; in benchmarks with frozendict is an >> equivalent instance of type dict. >> >> Some consideration: >> >> 1. frozendict is very fast, as expected, at copying. But it's also faster >> at creation, using a (not frozen) dict, kwargs or a sequence2. Speedups >> range from 20% to 45%. >> 2. frozendict is also a bit faster when you iterate over it, especially >> over values, where is ~15% faster >> 3. hash seems really fast, but this is because it's cached the first time >> hash() is invoked >> 4. where frozendict is slower is when you unpickle it and with fromkeys >> and repr. This is because I wrote a very naif implementation of these >> methods, without optimizing them. The other methods have a comparable speed. >> >> Here is the code: >> https://github.com/Marco-Sulla/cpython >> >> Here is the diff between the CPython code and my commits: >> https://github.com/python/cpython/compare/master...Marco-Sulla:master >> >> About code >> >> I coded the implementation doing a simple copy/paste of the existing dict >> functions, modifying their code and renaming them. This way I'm sure dict >> continues to work as before, and I can compare the speed gain. >> >> Some of the optimization I adopted can be implemented in `dict` too. For >> example, instead of creating an empty dict and resizing it, I create it >> with the "maximum" size and I fill it. It seems to work, even if I did not >> explore the possibility that a mutable object can change while a frozendict >> creation is in progress. >> >> Some problems with optimizing dict and maintaining a frozendict: >> >> 1. duplication of code. To gain a significant boost, I had to copy and >> paste a lot of code. Functions can be remerged again, but maybe the speedup >> will be reduced. >> 2. split vs combined dicts. As I wrote, split dicts seem to be faster in >> reading than combined dicts. For example, iterating over values is faster >> with a split dict, as expected. >> But writing was not tested; furthermore, some of the optimizations can be >> adopted for dicts too, so the convenience of a split table can be lowered. >> dict continues to maintain both split and combined tables, so this could >> be not a problem. But the code could be less and more fast if only a table >> layout is supported >> 3. the CPython code I used is old, so some of the improvements I adopted >> could be already implemented >> >> About frozendict >> >> Apart the considerations done in the [PEP 416]( >> https://www.python.org/dev/peps/pep-0416/), that was rejected since >> there was little gain from its implementation, I think that frozendict can >> be useful as a substitute of MappingProxyType, that is really slow. >> MappingProxyType is not much used, but it's present in critical parts of >> CPython code, for example in _sre. We have to see if a mapping proxy type >> *can* be substituted with an immutable map in some critical part of CPython >> code. >> >> Furthermore, frozendicts could be used for implementing "immutable" >> classes and modules, and can be used as a faster dict if its content does >> not change. >> _______________________________________________ >> Python-ideas mailing list -- python-ideas@python.org >> To unsubscribe send an email to python-ideas-le...@python.org >> https://mail.python.org/mailman3/lists/python-ideas.python.org/ >> Message archived at >> https://mail.python.org/archives/list/python-ideas@python.org/message/K7CRVW6O7RO6DT3JIG3OAJCAVCA5CNTN/ >> Code of Conduct: http://python.org/psf/codeofconduct/ >> > _______________________________________________ > Python-ideas mailing list -- python-ideas@python.org > To unsubscribe send an email to python-ideas-le...@python.org > https://mail.python.org/mailman3/lists/python-ideas.python.org/ > Message archived at > https://mail.python.org/archives/list/python-ideas@python.org/message/GAJHZEBOU5BGYJAUNQP5V66KSS6HHSQD/ > Code of Conduct: http://python.org/psf/codeofconduct/ > -- --Guido (mobile)
_______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-le...@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/YPQRFJIHHQQAVYN5AZ53HONUJJRKEV5A/ Code of Conduct: http://python.org/psf/codeofconduct/