On Fri, Jan 13, 2012 at 5:58 PM, Gregory P. Smith <g...@krypto.org> wrote:
>
> On Fri, Jan 13, 2012 at 5:38 PM, Guido van Rossum <gu...@python.org> wrote:
>
>> On Fri, Jan 13, 2012 at 5:17 PM, Antoine Pitrou <solip...@pitrou.net> wrote:
>>
>>> On Thu, 12 Jan 2012 18:57:42 -0800
>>> Guido van Rossum <gu...@python.org> wrote:
>>> > Hm... I started out as a big fan of the randomized hash, but thinking more
>>> > about it, I actually believe that the chances of some legitimate app having
>>> > >1000 collisions are way smaller than the chances that somebody's code will
>>> > break due to the variable hashing.
>>>
>>> Breaking due to variable hashing is deterministic: you notice it as
>>> soon as you upgrade (and then you use PYTHONHASHSEED to disable
>>> variable hashing). That seems better than unpredictable breaking when
>>> some legitimate collision chain happens.
>>
>>
>> Fair enough. But I'm now uncomfortable with turning this on for bugfix
>> releases. I'm fine with making this the default in 3.3, just not in 3.2,
>> 3.1 or 2.x -- it will break too much code and organizations will have to
>> roll back the release or do extensive testing before installing a bugfix
>> release -- exactly what we *don't* want for those.
>>
>> FWIW, I don't believe in the SafeDict solution -- you never know which
>> dicts you have to change.
>>
>>
> Agreed.
>
> Of the three options Victor listed only one is good.
>
> I don't like *SafeDict*. *-1*. It puts the onus on the coder to
> always get everything right with regards to data that came from outside the
> process never ending up hashed in a non-safe dict or set *anywhere*.
> "Safe" needs to be the default option for all hash tables.
>
> I don't like the "*too many hash collisions*" exception. *-1*. It
> provides non-deterministic application behavior for data driven
> applications with no way for them to predict when it'll happen or where and
> prepare for it. It may work in practice for many applications but is simply
> odd behavior.
>
> I do like *randomly seeding the hash*. *+1*. This is easy. It can easily
> be backported to any Python version.
>
> It is perfectly okay to break existing users who had anything depending on
> ordering of internal hash tables. Their code was already broken. We
> *will* provide a flag and/or environment variable that can be set to turn the
> feature off at their own peril which they can use in their test harnesses
> that are stupid enough to use doctests with order dependencies.
>

What an implementation looks like:

 http://pastebin.com/9ydETTag

Some stuff is still to be filled in, but this is all that is really required.
Add logic to allow a particular seed to be specified or forced to 0 from the
command line or environment. Add the logic to grab random bytes. Add the
autoconf glue to disable it. Done.

-gps

> This approach worked fine for Perl 9 years ago.
> https://rt.perl.org/rt3//Public/Bug/Display.html?id=22371
>
> -gps
>
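
For illustration only, the steps listed above (take a seed from the
environment, allow it to be forced to 0, otherwise grab random bytes) could be
sketched in Python roughly as follows. This is not the pastebin patch and not
CPython's actual hash. The function names, the 64-bit mask, and the FNV-1a-style
mixing are assumptions made for the sketch, and the mixing is a toy rather than
a collision-resistant design. PYTHONHASHSEED is the variable name used earlier
in this thread; treating its value as a plain decimal integer is also an
assumption.

```python
import os

MASK = 0xFFFFFFFFFFFFFFFF  # assumed 64-bit hash width for this sketch


def choose_hash_seed(environ=None):
    """Return the seed to use for string hashing (hypothetical helper).

    An explicit PYTHONHASHSEED wins; a value of 0 turns randomization off.
    With no setting, the seed comes from the OS random source.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("PYTHONHASHSEED")
    if raw:
        seed = int(raw)  # assumed format: decimal integer
        if seed < 0:
            raise ValueError("hash seed must be non-negative")
        return seed & MASK
    # No explicit seed: grab random bytes from the operating system.
    return int.from_bytes(os.urandom(8), "little")


def seeded_hash(data, seed):
    """Toy FNV-1a-style hash of a bytes object, perturbed by the seed."""
    h = (0xCBF29CE484222325 ^ seed) & MASK  # FNV offset basis mixed with seed
    for byte in data:
        h ^= byte
        h = (h * 0x100000001B3) & MASK  # FNV prime, truncated to 64 bits
    return h


if __name__ == "__main__":
    seed = choose_hash_seed()
    print("seed:", seed)
    print("hash of b'spam':", seeded_hash(b"spam", seed))
```

Running this twice without PYTHONHASHSEED set should normally print different
hashes for the same input, while PYTHONHASHSEED=0 makes the output repeatable,
which is the escape hatch for test suites that depend on ordering.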
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com