Gregory P. Smith <g...@krypto.org> added the comment: On Wed, Jan 18, 2012 at 1:10 PM, Guido van Rossum <rep...@bugs.python.org> wrote: > On Wed, Jan 18, 2012 at 1:05 PM, Antoine Pitrou <rep...@bugs.python.org>wrote: > > > > I would hope 3.3 only gets randomized hashing. Collision counting is a > > hack to make bugfix releases 99.999%-compatible instead of 99.9% ;) > > > > Really? I'd expect the difference to be more than 2 nines. The randomized > hashing has two problems: (a) change in dict order; (b) hash varies between > processes. I cannot imagine counterexamples to the collision counting that > weren't constructed specifically as counterexamples.
For the purposes of 3.3 I'd prefer to just have randomized hashing and not the collision counting in order to keep things from getting too complicated. But I will not object if we opt to do both. As much as the counting idea rubs me wrong, even if it were on by default I agree that most non-contrived things will never encounter it and it is easy to document how to work around it by disabling it should anyone actually be impeded by it. The concern I have with that approach from a web service point of view is that it too can be gamed in the more rare server situation of someone managing to fill a persistent data structure up with enough colliding values to be _close_ to the limit such that the application then dies while trying to process all future requests that _hit_ the limit (a persisting 500 error DOS rather than an exception raised only in one offending request that deserved that 500 error anyways). Not nearly as likely a scenario but it is one I'd keep an eye open for with an attacker hat on. MvL's suggestion of using AVL trees for hash bucket slots instead of our linear slot finding algorithm is a better way to fix the ultimate problem by never devolving into linear behavior at all. It is naturally more complicated but could likely even be done while maintaining ABI compatibility. I haven't pondered designs and performance costs for that. Possibly a memory hit and one or two extra indirect lookups in the normal case and some additional complexity in the iteration case. -gps ---------- _______________________________________ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue13703> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com