Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Fred Drake
On Thu, Dec 29, 2011 at 8:19 AM, Christian Heimes li...@cheimes.de wrote: Persistence layers like ZODB and cross interpreter communication channels used by multiprocessing may (!) rely on the fact that the hash of a string is fixed. ZODB does not rely on a fixed hash function for strings; for

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Armin Ronacher
Hi, Just some extra thoughts about the whole topic in the light of web applications (since this was hinted in the talk) running on Python: Yes, you can limit the number of maximum allowed parameters for post data but really there are so many places where data is parsed into hashing

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Armin Ronacher
Hi, Something I should add to this now that I thought about it a bit more: Assuming this should be fixed on a language level the solution would probably be to salt hashes. The most common hash to salt here is the PyUnicode hash for obvious reasons. - Option a: Compiled in Salt + Easy to

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Ned Batchelder
On 12/28/2011 9:09 PM, Raymond Hettinger wrote: Also, randomizing the hash wreaks havoc on doctests, book examples not matching actual dict reprs, and on efforts by users to optimize the insertion order into dicts with frequent lookups. I don't have a strong opinion about what to do about this

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Mark Shannon
Michael Foord wrote: Hello all, A paper (well, presentation) has been published highlighting security problems with the hashing algorithm (exploiting collisions) in many programming languages Python included:

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread geremy condra
On Wed, Dec 28, 2011 at 8:49 PM, Eric Snow ericsnowcurren...@gmail.com wrote: On Wed, Dec 28, 2011 at 6:28 PM, Michael Foord fuzzy...@voidspace.org.uk wrote: Hello all, A paper (well, presentation) has been published highlighting security problems with the hashing algorithm (exploiting

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Antoine Pitrou
On Thu, 29 Dec 2011 03:55:22 +0100 Christian Heimes li...@cheimes.de wrote: I've been dealing with web stuff and security for almost a decade. I've seen far worse attack vectors. This one can easily be solved with a couple of lines of Python code. For example Application developers can limit

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Antoine Pitrou
On Thu, 29 Dec 2011 04:04:17 +0100 Christian Heimes li...@cheimes.de wrote: Am 29.12.2011 02:37, schrieb Jesse Noller: Back up link for the PDF: http://dl.dropbox.com/u/1374/2007_28C3_Effective_DoS_on_web_application_platforms.pdf Ocert disclosure:

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Mark Shannon
Raymond Hettinger wrote: FWIW, Uncle Timmy considers the non-randomized hashes to be a virtue. It is believed that they give us better-than-random results for commonly encountered datasets. A change to randomized hashes would have a negative performance impact on those cases. Tim Peter's

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Antoine Pitrou
On Thu, 29 Dec 2011 14:32:21 +0100 Christian Heimes li...@cheimes.de wrote: Am 29.12.2011 13:57, schrieb Armin Ronacher: Hi, Something I should add to this now that I thought about it a bit more: Assuming this should be fixed on a language level the solution would probably be to

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Hynek Schlawack
Hi, how about Option d: Host based salt + Easy-ish to implement – how about basing it on the hostname for example? + transparent for all processes on the same host - probably unit test breakage In fact, we could use host based as default with the option to specify own which would solve

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Antoine Pitrou
On Thu, 29 Dec 2011 11:25:03 + Mark Shannon m...@hotpy.org wrote: Also, randomizing the hash wreaks havoc on doctests, book examples not matching actual dict reprs, and on efforts by users to optimize the insertion order into dicts with frequent lookups. The docs clearly state that

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Christian Heimes
Am 29.12.2011 12:13, schrieb Mark Shannon: The attack relies on being able to predict the hash value for a given string. Randomising the string hash function is quite straightforward. There is no need to change the dictionary code. A possible (*untested*) patch is attached. I'll leave it for

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Christian Heimes
Am 29.12.2011 12:10, schrieb Antoine Pitrou: I've been dealing with web stuff and security for almost a decade. I've seen far worse attack vectors. This one can easily be solved with a couple of lines of Python code. For example Application developers can limit the maximum amount of POST

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread M.-A. Lemburg
Mark Shannon wrote: Michael Foord wrote: Hello all, A paper (well, presentation) has been published highlighting security problems with the hashing algorithm (exploiting collisions) in many programming languages Python included:

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Antoine Pitrou
On Thu, 29 Dec 2011 13:57:07 +0100 Armin Ronacher armin.ronac...@active-4.com wrote: - Option c: Random salt at runtime + Easy to implement - impossible to synchronize - breaks unittests in the same way as a compiled in salt would do This option would have my preference. I don't think

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Christian Heimes
Am 29.12.2011 11:32, schrieb Antoine Pitrou: If I remember correctly CPython uses the long for its hash function so 64bit Windows uses a 32bit hash. Not anymore, Py_hash_t is currently aligned with Py_ssize_t. Thanks for the update. Python 2.x still uses long and several large frameworks

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Christian Heimes
Am 29.12.2011 13:57, schrieb Armin Ronacher: Hi, Something I should add to this now that I thought about it a bit more: Assuming this should be fixed on a language level the solution would probably be to salt hashes. The most common hash to salt here is the PyUnicode hash for obvious

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Tim Delaney
+1 to option d (host-based salt) but would need to consistently order the hostnames/addresses to guarantee that all processes on the same machine got the same salt by default. +1 to option c (environment variable) as an override. And/or maybe an override on the command line. +1 to implementing

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Tim Delaney
+1 to option c (environment variable) as an override. And/or maybe an override on the command line. That obviously should have said option b (environment variable) ... Tim Delaney ___ Python-Dev mailing list Python-Dev@python.org

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread PJ Eby
On Thu, Dec 29, 2011 at 8:32 AM, Christian Heimes li...@cheimes.de wrote: IMHO we don't have to alter the outcome of hash(some string), hash(1) and all other related types. We just need to reduce the change the an attacker can produce collisions in the dict (and set?) code that looks up the

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Paul McMillan
It's worth pointing out that if the salt is somehow exposed to an attacker, or is guessable, much of the benefit goes away. It's likely that a timing attack could be used to discover the salt if it is fixed per machine or process over a long period of time. If a salt is generally fixed per

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Christian Heimes
Am 29.12.2011 21:07, schrieb PJ Eby: I don't understand how that helps a collision attack. If you can still generate two strings with the same (pre-randomized) hash, what difference does it make that the dict adds a random number? The post-randomized number will still be the same, no? Or

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Terry Reedy
The talk was a presentation yesterday by Alexander Klink and Julian Wälde at the Chaos Communication Congress in Germany hash...@alech.de I read the non-technical summary at http://arstechnica.com/business/news/2011/12/huge-portions-of-web-vulnerable-to-hashing-denial-of-service-attack.ars

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Terry Reedy
On 12/29/2011 4:31 PM, Christian Heimes wrote: The hash randomization idea adds a salt to throw the attacker of course. Instead of position = hash mask it's now hash = salt + hash As I understood the talk (actually, the bit of Perl interpreter C code shown), the randomization is to

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Christian Heimes
Am 29.12.2011 23:28, schrieb Terry Reedy: As I understood the talk (actually, the bit of Perl interpreter C code shown), the randomization is to change hash(s) to hash(salt+s) so that the salt is completely mixed into the hash from the beginning, rather than just tacked on at the end. Yes,

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-29 Thread Tim Delaney
On 30 December 2011 06:59, Tim Delaney timothy.c.dela...@gmail.com wrote: +0 to exposing the salt as a constant (3.3+ only) - or alternatively expose a hash function that just takes an existing hash and returns the salted hash. That would make it very easy for anything that wanted a salted