Re: [Python-Dev] Changing the order of iteration over a dictionary
On Fri, Jan 20, 2012 at 8:49 PM, Mark Shannon wrote:
> So, don't be afraid to change that hash function :)

Changing it for 3.3 isn't really raising major concerns: the real concern is with changing it in maintenance and security patches for earlier releases. Security patches that may break production applications aren't desirable, since it means admins have to weigh up the risk of being affected by the security vulnerability against the risk of breakage from the patch itself.

The collision counting approach was attractive because it looked like it might offer a way out that was less likely to break deployed systems. Unfortunately, I think the point Martin raised about just opening a new (even more subtle) attack vector kills that idea dead.

Cheers,
Nick.

-- 
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Changing the order of iteration over a dictionary
On Fri, Jan 20, 2012 at 5:49 AM, Mark Shannon wrote:
> So, don't be afraid to change that hash function :)

Definitely. The hash function *has* been changed in the past, and a lot of developers were schooled in not relying on the iteration order. That's a good thing, as those developers now write tests of what's actually important rather than relying on implementation details of the Python runtime.

A hash function that changes more often than during an occasional major version update will encourage more developers to write better tests. We can think of it as an educational tool.

  -Fred

-- 
Fred L. Drake, Jr.
"A person who won't read has no advantage over one who can't read."
   --Samuel Langhorne Clemens
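As an illustration of testing "what's actually important", here is a minimal sketch of a test that checks a dict's contents without pinning its iteration order. The function `count_words` and its test are hypothetical names invented for this example; they do not come from the thread.

```python
def count_words(words):
    """Hypothetical function under test: map each word to its count."""
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


def test_count_words():
    result = count_words(["b", "a", "b"])
    # Fragile: asserting on list(result) would pin the iteration order,
    # which is an implementation detail of the runtime.
    # Robust: compare the mapping itself, or a sorted view of its keys.
    assert result == {"a": 1, "b": 2}
    assert sorted(result) == ["a", "b"]


test_count_words()
```

Dict equality compares keys and values only, so the first assertion holds whatever order the runtime happens to iterate in.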
[Python-Dev] Changing the order of iteration over a dictionary
Hi,

One of the main sticking points over possible fixes for the hash-collision security issue seems to be a fear that changing the iteration order of a dictionary will break backwards compatibility.

The order of iteration has never been specified. In fact, not only is it arbitrary, it cannot be determined from the contents of a dict alone; it may depend on the insertion order.

Changing a hash function is not the only change that will change the iteration order; any of the following will also do so:

* Changing the minimum size of a dict.
* Changing the load factor of a dict.
* Changing the resizing policy of a dict.
* Sharing of keys between dicts.

By treating iteration order as part of the API we are effectively ruling out ever making any improvements to the dict.

For example, my new dictionary implementation
https://bitbucket.org/markshannon/hotpy_new_dict/
reduces memory use by 47% for gcbench, and by about 20% for the 2to3 benchmark, on my 32-bit machine.
(Nice graphs: http://tinyurl.com/7qd2nnm http://tinyurl.com/6uqvl2x )

The new dict implementation (necessarily) changes the iteration order and will break code that relies on it.

If dict iteration order is to be treated as part of the API (and I think that is a very bad idea) then it should be documented, which will be difficult since it is barely deterministic. This will also be a major problem for PyPy, Jython and IronPython, as they will have to reimplement their dicts.

So, don't be afraid to change that hash function :)

Cheers,
Mark
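The claim that iteration order "cannot be determined from the contents of a dict alone" can be sketched as follows. The variable names are made up for this example, and no output is shown because the order each list comes out in depends on the interpreter and version in use.

```python
# Two dicts with the same contents, built in a different order.
first = {}
first["spam"] = 1
first["eggs"] = 2

second = {}
second["eggs"] = 2
second["spam"] = 1

# Equality looks only at the contents, never at the iteration order.
same_contents = first == second

# The iteration order is whatever the implementation produces, so these
# two lists may differ even though the dicts compare equal.
first_order = list(first)
second_order = list(second)

# Code that needs a stable order should impose one explicitly.
canonical_first = sorted(first.items())
canonical_second = sorted(second.items())

print(same_contents, first_order, second_order)
```

Only `same_contents` and the two sorted views are safe to rely on; `first_order` and `second_order` are exactly the kind of detail that a change to the hash function or the dict layout may alter.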