On Thu, 29 Dec 2011 14:32:21 +0100 Christian Heimes <li...@cheimes.de> wrote: > Am 29.12.2011 13:57, schrieb Armin Ronacher: > > Hi, > > > > Something I should add to this now that I thought about it a bit more: > > > > Assuming this should be fixed on a language level the solution would > > probably be to salt hashes. The most common hash to salt here is the > > PyUnicode hash for obvious reasons. > > > > - Option a: Compiled in Salt > > + Easy to implement > > - Breaks unittests most likely (those were broken in the first place > > but that's still a very annoying change to make) > > - Might cause problems with interoperability of Pythons compiled with > > different hash salts > > - You're not really solving the problem because each linux > > distribution (besides Gentoo I guess) would have just one salt > > compiled in and that would be popular enough to have the same > > issue. > > > > - Option b: Environment variable for the salt > > + Easy-ish to implement > > + Easy to synchronize over different machines > > - initialization for base types happens early and unpredictive which > > makes it hard for embedded Python interpreters (think mod_wsgi and > > other things) to specify the salt > > > > - Option c: Random salt at runtime > > + Easy to implement > > - impossible to synchronize > > - breaks unittests in the same way as a compiled in salt would do > > - Option d: Don't change __hash__ but only use randomized hash for > PyDictEntry lookup > + Easy to implement > - breaks only software to relies on a fixed order of dict keys > - breaks only a few to no unit tests > > IMHO we don't have to alter the outcome of hash("some string"), hash(1) > and all other related types. We just need to reduce the change the an > attacker can produce collisions in the dict (and set?) code that looks > up the slot (PyDictEntry). How about adding the random value in > Object/dictobject.c:lookdict() and lookdict_str() (Python 2.x) / > lookdict_unicode() (Python 3.x)? With this approach the hash of all our > objects stay the same and just the dict code needs to be altered. The > approach has also the benefit that all possible objects gain a > randomized hash.
I basically agree with your proposal. The only downside is that custom hashed containers (such as _pickle.c's memotable) don't automatically benefit. That said, I think it would be difficult to craft an attack against the aforementioned memotable (you would have to basically choose the addresses of pickled objects). Regards Antoine. _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com