[issue13703] Hash collision security issue

2021-11-08 Thread STINNER Victor


Change by STINNER Victor :


--
nosy:  -vstinner

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13703] Hash collision security issue

2021-11-04 Thread Terry J. Reedy


Terry J. Reedy  added the comment:

Because today's spammer, whose message was removed, deleted us all.  Restoring 
the version to 3.3 is not possible.

--




[issue13703] Hash collision security issue

2021-11-04 Thread Guido van Rossum


Guido van Rossum  added the comment:

Hey Erlend, why did you add so many people to the nosy list of this old
issue?

On Thu, Nov 4, 2021 at 07:33 Erlend E. Aasland 
wrote:

>
> Change by Erlend E. Aasland :
>
>
> --
> components: +Interpreter Core -Argument Clinic
> nosy: +Arach, Arfrever, Huzaifa.Sidhpurwala, Jim.Jewett, Mark.Shannon,
> PaulMcMillan, Zhiping.Deng, alex, barry, benjamin.peterson,
> christian.heimes, cvrebert, dmalcolm, eric.araujo, eric.snow, fx5,
> georg.brandl, grahamd, gregory.p.smith, gvanrossum, gz, jcea, jsvaughan,
> lemburg, loewis, mark.dickinson, neologix, pitrou, python-dev, roger.serwy,
> skorgu, skrah, terry.reedy, tim.peters, v+python, vstinner, zbysz
> -ahmedsayeed1982, larry
>
>
-- 
--Guido (mobile)

--




[issue13703] Hash collision security issue

2021-11-04 Thread Erlend E. Aasland


Change by Erlend E. Aasland :


--
components: +Interpreter Core -Argument Clinic
nosy: +Arach, Arfrever, Huzaifa.Sidhpurwala, Jim.Jewett, Mark.Shannon, 
PaulMcMillan, Zhiping.Deng, alex, barry, benjamin.peterson, christian.heimes, 
cvrebert, dmalcolm, eric.araujo, eric.snow, fx5, georg.brandl, grahamd, 
gregory.p.smith, gvanrossum, gz, jcea, jsvaughan, lemburg, loewis, 
mark.dickinson, neologix, pitrou, python-dev, roger.serwy, skorgu, skrah, 
terry.reedy, tim.peters, v+python, vstinner, zbysz -ahmedsayeed1982, larry




[issue13703] Hash collision security issue

2021-11-04 Thread Erlend E. Aasland


Change by Erlend E. Aasland :


--
Removed message: https://bugs.python.org/msg405707




[issue13703] Hash collision security issue

2021-11-04 Thread Ahmed Sayeed


Ahmed Sayeed  added the comment:

In collect_register() function of arc-linux-tdep.c, the "eret"
(exception return) register value is not being reported correctly.

Background:
When asked for the "pc" value, we have to update the "eret" register
with GDB's STOP_PC.  The "eret" instructs the kernel code where to
jump back when an instruction has stopped due to a breakpoint.  This
is how collect_register() is doing so:

--8<--
  if (regnum == gdbarch_pc_regnum (gdbarch))
    regnum = ARC_ERET_REGNUM;
  regcache->raw_collect (regnum, buf + arc_linux_core_reg_offsets[regnum]);
-->8--

Root cause:
Although this is using the correct offset (ERET register's), it is also
changing the REGNUM itself.  Therefore, raw_collect (regnum, ...) is
not reading from "pc" anymore.

Consequence:
This bug affects the "native ARC gdb" badly and causes kernel code to jump
to addresses after the breakpoint and not executing the "breakpoint"ed
instructions at all.  That "native ARC gdb" feature is not upstream yet and
is in review at the time of writing [1].

--
components: +Argument Clinic -Interpreter Core
nosy: +ahmedsayeed1982, larry -Arach, Arfrever, Huzaifa.Sidhpurwala, 
Jim.Jewett, Mark.Shannon, PaulMcMillan, Zhiping.Deng, alex, barry, 
benjamin.peterson, christian.heimes, cvrebert, dmalcolm, eric.araujo, 
eric.snow, fx5, georg.brandl, grahamd, gregory.p.smith, gvanrossum, gz, jcea, 
jsvaughan, lemburg, loewis, mark.dickinson, neologix, pitrou, python-dev, 
roger.serwy, skorgu, skrah, terry.reedy, tim.peters, v+python, vstinner, zbysz
versions: +Python 3.11 -Python 2.6, Python 2.7, Python 3.1, Python 3.2, Python 
3.3




[issue13703] Hash collision security issue

2012-03-13 Thread Gregory P. Smith

Gregory P. Smith  added the comment:

I believe so.  This is in all of the release candidates.

The expat/xmlparse.c hash collision DoS issue is being handled on its own via 
http://bugs.python.org/issue14234.

--
resolution:  -> fixed
status: open -> closed




[issue13703] Hash collision security issue

2012-03-13 Thread STINNER Victor

STINNER Victor  added the comment:

Can we close this issue?

--




[issue13703] Hash collision security issue

2012-03-13 Thread Jon Vaughan

Jon Vaughan  added the comment:

Victor - yes that was it; a mixture of a 2.7.2 virtual env and 2.7.3.  
Apologies for any nuisance caused.

--




[issue13703] Hash collision security issue

2012-03-12 Thread STINNER Victor

STINNER Victor  added the comment:

> FWIW I upgraded to ubuntu pangolin beta over the weekend,
> which includes 2.7.3rc1, ...
>
>  File "/usr/lib/python2.7/random.py", line 47, in <module>
>    from os import urandom as _urandom
> ImportError: cannot import name urandom

It looks like you are using random.py from Python 2.7.3 with a Python 2.7.2 
interpreter, because os.urandom() is now always available in Python 2.7.3.
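
A mismatch like this (a standard library from one release picked up by an interpreter from another, as can happen with a stale virtualenv) can usually be confirmed by printing where the interpreter and the modules actually come from. The following is only a minimal diagnostic sketch; the helper name describe_runtime is made up for illustration and is not part of any patch in this issue.

```python
import os
import random
import sys


def describe_runtime():
    """Collect the facts needed to spot an interpreter/stdlib mismatch."""
    return {
        # Which binary is running, and which version it reports.
        "interpreter": sys.executable,
        "version": sys.version.split()[0],
        # Where the pure-Python modules were loaded from; a path belonging
        # to a different release than "version" points at a mixed install.
        "os_module": getattr(os, "__file__", None),
        "random_module": getattr(random, "__file__", None),
        # random.py in 2.7.3 expects this to exist unconditionally.
        "has_urandom": hasattr(os, "urandom"),
    }


if __name__ == "__main__":
    for key, value in sorted(describe_runtime().items()):
        print("%s: %s" % (key, value))
```

If the version and the module paths disagree, rebuilding the interpreter or recreating the virtualenv is the likely fix.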

--




[issue13703] Hash collision security issue

2012-03-12 Thread Jon Vaughan

Jon Vaughan  added the comment:

FWIW I upgraded to ubuntu pangolin beta over the weekend, which includes 
2.7.3rc1, and I'm also experiencing a problem with urandom.

  File "/usr/lib/python2.7/email/utils.py", line 27, in <module>
    import random
  File "/usr/lib/python2.7/random.py", line 47, in <module>
    from os import urandom as _urandom
ImportError: cannot import name urandom

Given Roger Serwy's comment it sounds like a beta ubuntu problem, but thought 
it worth mentioning.

--
nosy: +jsvaughan




[issue13703] Hash collision security issue

2012-03-09 Thread Kurt Seifried

Changes by Kurt Seifried :


--
nosy:  -kseifr...@redhat.com




[issue13703] Hash collision security issue

2012-03-09 Thread Kurt Seifried

Kurt Seifried  added the comment:

I have assigned CVE-2012-1150 for this issue as per 
http://www.openwall.com/lists/oss-security/2012/03/10/3

--
nosy: +kseifr...@redhat.com




[issue13703] Hash collision security issue

2012-03-03 Thread Chris Rebert

Chris Rebert  added the comment:

The Design and History FAQ (will) need a minor corresponding update:
http://docs.python.org/dev/faq/design.html#how-are-dictionaries-implemented

--




[issue13703] Hash collision security issue

2012-02-26 Thread Roger Serwy

Roger Serwy  added the comment:

It was a false alarm. I didn't recompile python before running it with the 
latest /Lib files. My apologies.

--




[issue13703] Hash collision security issue

2012-02-26 Thread Benjamin Peterson

Benjamin Peterson  added the comment:

Can you paste the error you're getting?

2012/2/26 Roger Serwy :
>
> Roger Serwy  added the comment:
>
> After pulling the latest code, random.py no longer works since it tries to 
> import urandom from os on both 3.3 and 2.7.
>
> --
> nosy: +serwy
>

--




[issue13703] Hash collision security issue

2012-02-26 Thread Roger Serwy

Roger Serwy  added the comment:

After pulling the latest code, random.py no longer works since it tries to 
import urandom from os on both 3.3 and 2.7.

--
nosy: +serwy




[issue13703] Hash collision security issue

2012-02-23 Thread Chris Rebert

Changes by Chris Rebert :


--
nosy: +cvrebert




[issue13703] Hash collision security issue

2012-02-22 Thread Barry A. Warsaw

Barry A. Warsaw  added the comment:

Never mind about sys.hash_seed.  See my follow-up in python-dev.  I consider 
this issue closed wrt the 2.6 branch.

--




[issue13703] Hash collision security issue

2012-02-22 Thread Barry A. Warsaw

Barry A. Warsaw  added the comment:

I have to amend my suggestion about sys.flags.hash_randomization.  It needs to 
be non-zero even if $PYTHONHASHSEED is given instead of -R.  Many other flags 
that also have envars work the same way, e.g. -O and $PYTHONOPTIMIZE.  So 
hash_randomization has to work the same way.

I'll still work on a patch for exposing the seed in sys.

--




[issue13703] Hash collision security issue

2012-02-21 Thread Gregory P. Smith

Gregory P. Smith  added the comment:

+1 to what barry and __ap__ discussed and settled on.

--




[issue13703] Hash collision security issue

2012-02-21 Thread Barry A. Warsaw

Barry A. Warsaw  added the comment:

On Feb 21, 2012, at 09:48 AM, Marc-Andre Lemburg wrote:

>The flag should probably be removed - simply because
>the env var is not a flag, it's a configuration parameter.
>
>Exposing the seed value as sys.hashseed would be better and more useful
>to applications.

Okay, after chatting with __ap__ on irc, here's what I think the behavior
should be:

sys.flags.hash_randomization should contain just the value given by the -R
flag.  It should only be True if the flag is present, False otherwise.

sys.hash_seed contains the hash seed, set by virtue of the flag or envar.  It
should contain the *actual* seed value used.  E.g. it might be zero, the
explicitly set integer, or the randomly selected seed value in use during this
Python execution if a random seed was requested.

If you really need the envar value, getenv('PYTHONHASHSEED') is good enough
for that.

--




[issue13703] Hash collision security issue

2012-02-21 Thread Barry A. Warsaw

Barry A. Warsaw  added the comment:

On Feb 21, 2012, at 09:48 AM, Marc-Andre Lemburg wrote:

>Exposing the seed value as sys.hashseed would be better and more useful
>to applications.

That makes the most sense to me.

--




[issue13703] Hash collision security issue

2012-02-21 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

> That is a good question.  I don't really care either way, but let's
> say +0 for turning it off when seed == 0.

+1

--




[issue13703] Hash collision security issue

2012-02-21 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

STINNER Victor wrote:
> 
> STINNER Victor  added the comment:
> 
>> Question: Should sys.flags.hash_randomization be True (1) when 
>> PYTHONHASHSEED=0?  It is now.
>>
>> Saying yes "working as intended" is fine by me.
> 
> It is documented that PYTHONHASHSEED=0 disables the randomization, so
> sys.flags.hash_randomization must be False (0).

PYTHONHASHSEED=1 will disable randomization as well :-)

Only setting PYTHONHASHSEED=random actually enables randomization.

--





[issue13703] Hash collision security issue

2012-02-21 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

Gregory P. Smith wrote:
> 
> Gregory P. Smith  added the comment:
> 
> Question: Should sys.flags.hash_randomization be True (1) when 
> PYTHONHASHSEED=0?  It is now.

The flag should probably be removed - simply because
the env var is not a flag, it's a configuration parameter.

Exposing the seed value as sys.hashseed would be better and more useful
to applications.

--




[issue13703] Hash collision security issue

2012-02-21 Thread STINNER Victor

STINNER Victor  added the comment:

> Question: Should sys.flags.hash_randomization be True (1) when 
> PYTHONHASHSEED=0?  It is now.
>
> Saying yes "working as intended" is fine by me.

It is documented that PYTHONHASHSEED=0 disables the randomization, so
sys.flags.hash_randomization must be False (0).
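
The behavior argued for here can be checked from the outside by starting a child interpreter with a given PYTHONHASHSEED and reading the flag back. This is a sketch only: it assumes it is run under a current CPython 3 interpreter (not the 2012 patches under discussion), and the helper name hash_randomization_flag is invented for the example.

```python
import os
import subprocess
import sys


def hash_randomization_flag(seed_value):
    """Return sys.flags.hash_randomization as seen by a child interpreter
    started with PYTHONHASHSEED set to seed_value (a string)."""
    # Copy the parent environment so the child still starts on Windows.
    env = os.environ.copy()
    env["PYTHONHASHSEED"] = seed_value
    out = subprocess.check_output(
        [sys.executable, "-c",
         "import sys; print(sys.flags.hash_randomization)"],
        env=env,
    )
    return int(out.decode().strip())


if __name__ == "__main__":
    for value in ("0", "random"):
        print("PYTHONHASHSEED=%s -> flag %d"
              % (value, hash_randomization_flag(value)))
```

On such an interpreter, a seed of 0 reports the flag as off and "random" reports it as on.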

--




[issue13703] Hash collision security issue

2012-02-20 Thread Georg Brandl

Georg Brandl  added the comment:

That is a good question.  I don't really care either way, but let's say +0 for 
turning it off when seed == 0.

-R still needs to be made default in 3.3 - that's one reason this issue is 
still open.

--




[issue13703] Hash collision security issue

2012-02-20 Thread Gregory P. Smith

Gregory P. Smith  added the comment:

Question: Should sys.flags.hash_randomization be True (1) when 
PYTHONHASHSEED=0?  It is now.

Saying yes "working as intended" is fine by me.

sys.flags.hash_randomization seems to simply indicate that doing something with 
the hash seed was explicitly specified as opposed to defaulting to off, not 
that the hash seed was actually chosen randomly.

What this implies for 3.3 after we make hash randomization default to on is 
that sys.flags.hash_randomization will always be 1.

--




[issue13703] Hash collision security issue

2012-02-20 Thread Gregory P. Smith

Gregory P. Smith  added the comment:

The bug report is the easiest thing to search for and follow when checking when 
something is resolved so it is nice to have a link to the relevant patch(es) 
for each branch.  I just wanted to note the major commit here so that all 
planned branches had a note recorded.  I don't care that it wasn't automatic. :)

For observers: There have been several more commits related to fixing this 
(test dict/set order fixes, bug/typo/merge oops fixes for the linked-to 
patches, etc). Anyone interested in seeing the full list of diffs should look 
at their specific branch on or around the time of the linked-to changelists.  
Too many to list here.

--




[issue13703] Hash collision security issue

2012-02-20 Thread Georg Brandl

Georg Brandl  added the comment:

But since our workflow is such that commits in X.Y branches always show up in 
X.Y+1, it doesn't really matter.

--




[issue13703] Hash collision security issue

2012-02-20 Thread Éric Araujo

Éric Araujo  added the comment:

Yep, the bot only looks at commit messages, it does not inspect merges or other 
topographical information.  That’s why some of us make sure to repeat bug 
numbers in our merge commit messages.

--




[issue13703] Hash collision security issue

2012-02-20 Thread Gregory P. Smith

Gregory P. Smith  added the comment:

Roundup Robot didn't seem to notice it, but this has also been committed in 2.7:

http://hg.python.org/cpython/rev/a0f43f4481e0

--




[issue13703] Hash collision security issue

2012-02-20 Thread Roundup Robot

Roundup Robot  added the comment:

New changeset 6b7704fe1be1 by Barry Warsaw in branch '2.6':
- Issue #13703: oCERT-2011-003: add -R command-line option and PYTHONHASHSEED
http://hg.python.org/cpython/rev/6b7704fe1be1

--




[issue13703] Hash collision security issue

2012-02-20 Thread Roundup Robot

Roundup Robot  added the comment:

New changeset ed76dc34b39d by Georg Brandl in branch 'default':
Merge 3.2: Issue #13703 plus some related test suite fixes.
http://hg.python.org/cpython/rev/ed76dc34b39d

--




[issue13703] Hash collision security issue

2012-02-20 Thread Roundup Robot

Roundup Robot  added the comment:

New changeset 4a31f6b11e7a by Georg Brandl in branch '3.2':
Merge from 3.1: Issue #13703: add a way to randomize the hash values of basic 
types (str, bytes, datetime)
http://hg.python.org/cpython/rev/4a31f6b11e7a

--




[issue13703] Hash collision security issue

2012-02-20 Thread Roundup Robot

Roundup Robot  added the comment:

New changeset f4b7ecf8a5f8 by Georg Brandl in branch '3.1':
Issue #13703: add a way to randomize the hash values of basic types (str, 
bytes, datetime)
http://hg.python.org/cpython/rev/f4b7ecf8a5f8

--
nosy: +python-dev




[issue13703] Hash collision security issue

2012-02-19 Thread Benjamin Peterson

Benjamin Peterson  added the comment:

+1 for fixing all tests.

--




[issue13703] Hash collision security issue

2012-02-19 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

> With PYTHONHASHSEED=random, at least those tests still fail:
> test_descr test_json test_set test_ttk_textonly test_urllib
> 
> Do we want to fix them in 3.1?

I don't know, but we'll have to fix them in 3.2 to avoid breaking the
buildbots. So we might also fix them in 3.1.

--




[issue13703] Hash collision security issue

2012-02-19 Thread Éric Araujo

Éric Araujo  added the comment:

> With PYTHONHASHSEED=random, at least those tests still fail:
> test_descr test_json test_set test_ttk_textonly test_urllib
>
> Do we want to fix them in 3.1?

If the failures are caused by the tests depending on dict order (i.e. not real 
bugs, not changed behavior), then I think we can live with them.
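
To illustrate the kind of test failure meant here: a test that compares against the textual form of a hash-based container depends on iteration order, which changes with the hash seed, while a test that sorts first (or compares containers directly) does not. The failing tests themselves are not shown in this thread, so the following is a generic sketch with a made-up helper name, tag_summary.

```python
def tag_summary(tags):
    """Join tags in a stable order so the result does not depend on how
    the set happens to iterate under the current hash seed."""
    return ", ".join(sorted(tags))


tags = {"beta", "alpha", "gamma"}

# Fragile: ", ".join(tags) follows set iteration order, which may differ
# from run to run once string hashes are randomized.
fragile = ", ".join(tags)

# Robust: sort before formatting, or compare the containers themselves.
robust = tag_summary(tags)
same_members = set(fragile.split(", ")) == tags
```

Fixing a test of this kind means changing only the assertion, not the code under test.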

--




[issue13703] Hash collision security issue

2012-02-19 Thread Georg Brandl

Georg Brandl  added the comment:

New patch fixes failures due to sys.flags backwards compatibility.

With PYTHONHASHSEED=random, at least those tests still fail:
test_descr test_json test_set test_ttk_textonly test_urllib

Do we want to fix them in 3.1?

--
Added file: http://bugs.python.org/file24563/hash-patch-3.1-gb-03.patch




[issue13703] Hash collision security issue

2012-02-19 Thread Georg Brandl

Changes by Georg Brandl :


Removed file: http://bugs.python.org/file24562/hash-patch-3.1-gb.patch




[issue13703] Hash collision security issue

2012-02-19 Thread Georg Brandl

Georg Brandl  added the comment:

New version, with the hope that it gets a "review" link.

--
Added file: http://bugs.python.org/file24562/hash-patch-3.1-gb.patch




[issue13703] Hash collision security issue

2012-02-19 Thread Georg Brandl

Changes by Georg Brandl :


Removed file: http://bugs.python.org/file24561/hash-patch-3.1-gb.diff




[issue13703] Hash collision security issue

2012-02-19 Thread Georg Brandl

Georg Brandl  added the comment:

Attaching reviewed version for 3.1 with unified env var PYTHONHASHSEED and 
encompassing Antoine's and Greg's review comments.

--
Added file: http://bugs.python.org/file24561/hash-patch-3.1-gb.diff




[issue13703] Hash collision security issue

2012-02-15 Thread Martin v . Löwis

Martin v. Löwis  added the comment:

> Frankly, other short strings may give away even more, because you can
> put several into the same dict.

Please don't make such claims without some reasonable security analysis:
how *exactly* would you derive the hash seed when you have the hash
values of all 256 one-byte strings (or all 2**20 one-char Unicode
strings)?

> I would prefer that the randomization not kick in until strings are at
> least 8 characters, but I think excluding length 1 is a pretty obvious
> win.

-1. It is very easy to create a good number of hash collisions already
with 6-character strings. You are opening the security hole again that
we are attempting to close.

--




[issue13703] Hash collision security issue

2012-02-14 Thread Jim Jewett

Jim Jewett  added the comment:

On Mon, Feb 13, 2012 at 3:37 PM,  Dave Malcolm
 added the comment:

>  * added comments about the specialcasing of length 0:
>    /*
>      We make the hash of the empty string be 0, rather than using
>      (prefix ^ suffix), since this slightly obfuscates the hash secret
>    */

Frankly, other short strings may give away even more, because you can
put several into the same dict.

I would prefer that the randomization not kick in until strings are at
least 8 characters, but I think excluding length 1 is a pretty obvious
win.
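
For readers following the special case quoted above, here is a rough model of why the empty string would otherwise expose (prefix ^ suffix). It is a sketch only: the multiply-and-XOR loop is modeled on the style of string hash CPython used at the time, and the multiplier, the 64-bit mask, and the function name salted_hash are assumptions for illustration, not code copied from the patch.

```python
MASK = (1 << 64) - 1      # assume a 64-bit hash for this sketch
MULTIPLIER = 1000003      # assumed multiplier, for illustration only


def salted_hash(data, prefix, suffix, special_case_empty=True):
    """Hash a bytes object with a secret prefix and suffix mixed in."""
    if not data and special_case_empty:
        # The special case under discussion: hide the secret entirely.
        return 0
    x = prefix & MASK
    if data:
        x ^= data[0] << 7
    for byte in data:
        x = ((MULTIPLIER * x) ^ byte) & MASK
    x ^= len(data)
    x ^= suffix & MASK
    return x


# Without the special case the loop never runs for empty input, so the
# result collapses to prefix ^ suffix.
leaked = salted_hash(b"", prefix=5, suffix=9, special_case_empty=False)
hidden = salted_hash(b"", prefix=5, suffix=9)
```

Whether hashes of other short strings reveal anything useful about the secret is the open question in this exchange; the sketch does not settle it.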

--




[issue13703] Hash collision security issue

2012-02-13 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

Dave Malcolm wrote:
> [new patch]

Please change how the env vars work as discussed earlier on this ticket.

Quick summary:

We only need one env var for the randomization logic: PYTHONHASHSEED.
If not set, 0 is used as seed. If set to a number, a fixed seed
is used. If set to "random", a random seed is generated at
interpreter startup.

Same for the -R cmd line option.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com




--




[issue13703] Hash collision security issue

2012-02-11 Thread Gregory P. Smith

Gregory P. Smith  added the comment:

Comments to be addressed added on the code review.

--




[issue13703] Hash collision security issue

2012-02-11 Thread Gregory P. Smith

Gregory P. Smith  added the comment:

Should -R be required to take a parameter specifying "on" or "off" so
that code using a #! line continues to work as specified across a
change in default behavior when upgrading from 3.2 to 3.3?

#!/usr/bin/python3 -R on
#!/usr/bin/python3 -R off

In 3.3 it would be a good idea to have a command line flag to turn
this off.  Rather than introducing a new flag in 3.3, a parameter that
is explicit regardless of the default avoids that entirely.

Before anyone suggests it: I do *not* think -R should accept a value
to use as the seed.  That is unnecessary.

--




[issue13703] Hash collision security issue

2012-02-11 Thread Dave Malcolm

Dave Malcolm  added the comment:

I'm not quite sure how that would interact with the -R command-line
option for enabling randomization.

The changes to the docs in the latest patch clarify the meaning of
what I've implemented (I hope).

My view is that we should simply enable hash randomization by default in
3.3.

At that point, PYTHONHASHRANDOMIZATION and the -R option become
meaningless (and could be either removed, or silently ignored), and you
have to set PYTHONHASHSEED=0 to get back the old behavior.

--




[issue13703] Hash collision security issue

2012-02-10 Thread Jim Jewett

Jim Jewett  added the comment:

On Fri, Feb 10, 2012 at 6:02 PM, STINNER Victor

>  - PYTHONHASHSEED doc is not clear: it should be mentioned
> that the variable is ignored if PYTHONHASHRANDOMIZATION
> is not set

*That* is why this two-envvar solution bothers me.

PYTHONHASHSEED has to be a string anyhow, so why not just get rid of
PYTHONHASHRANDOMIZATION?

Use PYTHONHASHSEED=random to use randomization.

Other values that cannot be turned into an integer will be (currently)
undefined.  (You may want to raise a fatal error, on the assumption
that errors should not pass silently.)

A missing PYTHONHASHSEED then has the pleasant interpretation of
defaulting to "0" for backwards compatibility.
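
The single-variable scheme proposed above can be written down as a small decision function. This is a sketch of the proposal as stated in this message, not of any shipped implementation: the function name resolve_hash_seed is invented, the upper bound is an assumption (the range CPython documents for PYTHONHASHSEED), and raising on unparsable values follows the "errors should not pass silently" suggestion.

```python
import secrets

MAX_SEED = 4294967295  # assumed upper bound for an explicit seed


def resolve_hash_seed(value):
    """Map the raw PYTHONHASHSEED value to (randomized, seed).

    value is the environment string, or None if the variable is unset.
    """
    if value is None:
        # Unset: behave like seed 0, i.e. the old, unrandomized hashes.
        return (False, 0)
    if value == "random":
        # Pick a fresh non-zero seed for this interpreter run.
        return (True, secrets.randbelow(MAX_SEED) + 1)
    try:
        seed = int(value)
    except ValueError:
        raise ValueError("PYTHONHASHSEED must be 'random' or an integer")
    if not 0 <= seed <= MAX_SEED:
        raise ValueError("PYTHONHASHSEED out of range")
    # A fixed seed is reproducible, so it does not count as randomized.
    return (False, seed)
```

For example, resolve_hash_seed(os.environ.get("PYTHONHASHSEED")) would cover the unset case without a second variable.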

--




[issue13703] Hash collision security issue

2012-02-10 Thread STINNER Victor

STINNER Victor  added the comment:

Review of add-randomization-(...).patch:
 - there is a missing ")" in the doc, near "the types covered by the 
:option:`-R` option (or its equivalent, :envvar:`PYTHONHASHRANDOMIZATION`."
 - get_hash() in test_hash.py fails completely on Windows: Windows requires some 
environment variables. Just use env=os.environ.copy() instead of env={}.
 - PYTHONHASHSEED doc is not clear: it should be mentioned that the variable 
is ignored if PYTHONHASHRANDOMIZATION is not set
 - (Python 2.6) test_hash fails because of "[xxx refs]" in stderr if Python is 
compiled in debug mode. Add strip_python_stderr() to test_support.py and use it 
in get_hash().

def strip_python_stderr(stderr):
    """Strip the stderr of a Python process from potential debug output
    emitted by the interpreter.

    This will typically be run on the result of the communicate() method
    of a subprocess.Popen object.
    """
    stderr = re.sub(br"\[\d+ refs\]\r?\n?$", b"", stderr).strip()
    return stderr

Except for these minor nits, the patches (2.6 and 3.1) look good. I didn't read 
the test patches: just run the tests to test them :-) (Or our buildbots will 
do the work for you.)
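The two test-related review points can be sketched together as follows. This is an illustration written for Python 3, not the code from the patch: get_hash here is a hypothetical stand-in for the helper in test_hash.py, and its seed parameter is an assumption.

```python
import os
import re
import subprocess
import sys


def strip_python_stderr(stderr):
    """Drop the trailing "[NNN refs]" line that debug builds print."""
    return re.sub(br"\[\d+ refs\]\r?\n?$", b"", stderr).strip()


def get_hash(obj, seed=None):
    """Return hash(obj) as computed by a freshly started interpreter."""
    # Start from a copy of the current environment: Windows needs some
    # variables to start a process at all, so env={} can break the child.
    env = os.environ.copy()
    if seed is None:
        env.pop("PYTHONHASHSEED", None)
    else:
        env["PYTHONHASHSEED"] = str(seed)
    proc = subprocess.Popen(
        [sys.executable, "-c", "print(hash(%r))" % (obj,)],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE, env=env)
    out, err = proc.communicate()
    if proc.returncode != 0:
        raise RuntimeError(strip_python_stderr(err).decode("ascii", "replace"))
    return int(out.strip())
```

A test for a fixed seed would then call get_hash('abc', seed=0) twice and expect the same value both times.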




[issue13703] Hash collision security issue

2012-02-10 Thread Gregory P. Smith

Gregory P. Smith  added the comment:

Thanks for reviewing Benjamin.  I'm also reviewing this today.  Sorry
for the delay!

BTW, like Schadenfreude?  A hash collision DoS issue "fix" patch for
PHP5 was done poorly and introduced a new security vulnerability that
was just used to let script kiddies root many servers all around the
web:  http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2012-0830




[issue13703] Hash collision security issue

2012-02-10 Thread Benjamin Peterson

Benjamin Peterson  added the comment:

So modulo my (small) review comments, David's patches are ready to go in.




[issue13703] Hash collision security issue

2012-02-08 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

Dave Malcolm wrote:
> 
> If anyone is aware of an attack via numeric hashing that's actually
> possible, please let me know (privately).  I believe only specific apps
> could be affected, and I'm not aware of any such specific apps.

I'm not sure what you'd like to see.

Any application reading user provided data from a file, database,
web, etc. is vulnerable to the attack, if it uses the read numeric
data as keys in a dictionary.

The most common use case for this is a dictionary mapping codes or
IDs to strings or objects, e.g. for caching purposes, to find a list
of unique IDs, checking for duplicates, etc.

This also works indirectly on 32-bit platforms, e.g. via date/time
or IP address values that get converted to key integers.
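As an illustration of how user-supplied numeric IDs can collide, here is a small sketch. It assumes CPython 3.2 or later, where hash() of a non-negative int is the value reduced modulo sys.hash_info.modulus; the Python 2 interpreters discussed in this thread hash integers differently, so the exact keys there differ.

```python
import sys

# Assumption: CPython 3.2+, where hash(n) == n % sys.hash_info.modulus
# for non-negative integers (the modulus is 2**61 - 1 on 64-bit builds).
M = sys.hash_info.modulus


def colliding_ids(count, base=1):
    """Return `count` distinct non-negative integers sharing one hash value."""
    return [base + i * M for i in range(count)]


ids = colliding_ids(1000)
# A cache keyed by these "IDs": every insert has to walk the same
# collision chain, so filling the dict takes quadratic time.
cache = {key: "record" for key in ids}
```

No string hashing is involved, so a seed applied only to str and bytes does not change which keys collide here.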




[issue13703] Hash collision security issue

2012-02-07 Thread Dave Malcolm

Dave Malcolm  added the comment:

On Mon, 2012-02-06 at 23:00 +, Marc-Andre Lemburg wrote:
> Marc-Andre Lemburg  added the comment:
> 
> Alex Gaynor wrote:
> > There's no need to cover any container types, because if their constituent
> > types are securely hashable then they will be as well.  And of course if
> > the constituent types are insecure then they're directly vulnerable.
> 
> I wouldn't necessarily take that for granted: since container
> types usually calculate their hash based on the hashes of their
> elements, it's possible that a clever combination of elements
> could lead to a neutralization of the hash seed used by
> the elements, thereby reenabling the original attack on the
> unprotected interpreter.
> 
> Still, because we have far more vulnerable hashable types out there,
> trying to find such an attack doesn't really make practical
> sense, so protecting containers is indeed not as urgent :-)

FWIW, I'm still awaiting review of my patches.  I don't believe
Marc-Andre's concerns are a sufficient rebuttal to the approach I've
taken.

If anyone is aware of an attack via numeric hashing that's actually
possible, please let me know (privately).  I believe only specific apps
could be affected, and I'm not aware of any such specific apps.




[issue13703] Hash collision security issue

2012-02-06 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

Alex Gaynor wrote:
> There's no need to cover any container types, because if their constituent
> types are securely hashable then they will be as well.  And of course if
> the constituent types are insecure then they're directly vulnerable.

I wouldn't necessarily take that for granted: since container
types usually calculate their hash based on the hashes of their
elements, it's possible that a clever combination of elements
could lead to a neutralization of the hash seed used by
the elements, thereby reenabling the original attack on the
unprotected interpreter.

Still, because we have far more vulnerable hashable types out there,
trying to find such an attack doesn't really make practical
sense, so protecting containers is indeed not as urgent :-)




[issue13703] Hash collision security issue

2012-02-06 Thread Alex Gaynor

Alex Gaynor  added the comment:

On Mon, Feb 6, 2012 at 5:04 PM, Marc-Andre Lemburg
wrote:

>
> Marc-Andre Lemburg  added the comment:
>
> Alex Gaynor wrote:
> > Can't randomization just be applied to integers as well?
>
> A simple seed xor'ed with the hash won't work, since the attacks
> I posted will continue to work (just colliding on a different hash
> value).
>
> Using a more elaborate hash algorithm would slow down uses of
> numbers as dictionary keys and also be difficult to implement for
> non-integer types such as float, longs and complex numbers. The
> reason is that Python applications expect x == y => hash(x) == hash(y),
> e.g. hash(3) == hash(3L) == hash(3.0) == hash(3+0j).
>
> AFAIK, the randomization patch also doesn't cover tuples, which are
> rather common as dictionary keys as well, nor any of the other
> more esoteric Python built-in hashable data types (e.g. frozenset)
> or hashable data types defined by 3rd party extensions or
> applications (simply because it can't).

There's no need to cover any container types, because if their constituent
types are securely hashable then they will be as well.  And of course if
the constituent types are insecure then they're directly vulnerable.
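A small sketch of the second half of that argument, assuming CPython 3.2+ integer hashing (hash(n) == n % sys.hash_info.modulus) and the fact that a tuple's hash is computed only from the hashes of its items:

```python
import sys

M = sys.hash_info.modulus   # assumption: CPython 3.2+ integer hashing

a, b = 7, 7 + M             # distinct integers with hash(a) == hash(b)

# The tuples differ, but their hashes are built from identical item
# hashes, so the collision in the elements carries over to the container.
t1 = (a, "user")
t2 = (b, "user")
same_hash = hash(t1) == hash(t2)
```

This only shows that an unprotected element type makes the container collide as well; it says nothing about whether element seeds could be neutralized by a clever combination of elements.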

Alex




[issue13703] Hash collision security issue

2012-02-06 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

Alex Gaynor wrote:
> Can't randomization just be applied to integers as well?

A simple seed xor'ed with the hash won't work, since the attacks
I posted will continue to work (just colliding on a different hash
value).
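To make that point concrete, here is a toy model (not the patch): xor_seeded_hash is a hypothetical function that xors a seed into the ordinary hash, and the colliding keys assume CPython 3.2+ integer hashing.

```python
import sys

M = sys.hash_info.modulus                # assumption: CPython 3.2+
keys = [i * M for i in range(1, 6)]      # distinct ints, all with hash 0


def xor_seeded_hash(obj, seed):
    """Toy 'randomized' hash: the ordinary hash xor'ed with a seed."""
    return hash(obj) ^ seed


# Keys that collided before still collide, only on a different value.
seeded = {xor_seeded_hash(k, 0x5EED1234) for k in keys}
```

Because xor with a constant is a bijection on hash values, it can never separate two keys whose unseeded hashes are equal.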

Using a more elaborate hash algorithm would slow down uses of
numbers as dictionary keys and also be difficult to implement for
non-integer types such as float, longs and complex numbers. The
reason is that Python applications expect x == y => hash(x) == hash(y),
e.g. hash(3) == hash(3L) == hash(3.0) == hash(3+0j).
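The invariant can be checked directly. This sketch uses Python 3 syntax, so the 3L literal from the example above is dropped and Fraction is added as one more numeric type:

```python
from fractions import Fraction

values = [3, 3.0, 3 + 0j, Fraction(3)]

# All of these compare equal to the integer 3 ...
all_equal = all(v == values[0] for v in values)
# ... so they must all produce the same hash value.
hashes = {hash(v) for v in values}
```

Any seeded numeric hash would have to preserve this property across every numeric type, including third-party ones, which is the difficulty described above.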

AFAIK, the randomization patch also doesn't cover tuples, which are
rather common as dictionary keys as well, nor any of the other
more esoteric Python built-in hashable data types (e.g. frozenset)
or hashable data types defined by 3rd party extensions or
applications (simply because it can't).




[issue13703] Hash collision security issue

2012-02-06 Thread Dave Malcolm

Dave Malcolm  added the comment:

> Can't randomization just be applied to integers as well?
> 

It could, but see http://bugs.python.org/issue13703#msg151847

Would my patches be more or less likely to get reviewed with vs without
an extension of randomization to integers?




[issue13703] Hash collision security issue

2012-02-06 Thread Alex Gaynor

Alex Gaynor  added the comment:

On Mon, Feb 6, 2012 at 4:41 PM, Marc-Andre Lemburg
wrote:

>
> Marc-Andre Lemburg  added the comment:
>
> Gregory P. Smith wrote:
> >
> > Gregory P. Smith  added the comment:
> >
> >>
> >>> The release managers have pronounced:
> >>> http://mail.python.org/pipermail/python-dev/2012-January/115892.html
> >>> Quoting that email:
> >>>> 1. Simple hash randomization is the way to go. We think this has the
> >>>> best chance of actually fixing the problem while being fairly
> >>>> straightforward such that we're comfortable putting it in a stable
> >>>> release.
> >>>> 2. It will be off by default in stable releases and enabled by an
> >>>> envar at runtime. This will prevent code breakage from dictionary
> >>>> order changing as well as people depending on the hash stability.
> >>
> >> Right, but that doesn't contradict what I wrote about adding
> >> env vars to fix a seed and optionally enable using a random
> >> seed, or adding collision counting as extra protection for
> >> cases that are not addressed by the hash seeding, such as
> >> e.g. collisions caused by 3rd types or numbers.
> >
> > We won't be back-porting anything more than the hash randomization for
> > 2.6/2.7/3.1/3.2 but we are free to do more in 3.3 if someone can
> > demonstrate it working well and a need for it.
> >
> > For me, things like collision counting and tree based collision
> > buckets when the types are all the same and known comparable make
> > sense but are really sounding like a lot of additional complexity. I'd
> > *like* to see active black-box design attack code produced that goes
> > after something like a wsgi web app written in Python with hash
> > randomization *enabled* to demonstrate the need before we accept
> > additional protections like this  for 3.3+.
>
> I posted several examples for the integer collision attack on this
> ticket. The current randomization patch does not address this at all,
> the collision counting patch does, which is why I think both are
> needed.
>
> Note that my comment was more about the desire to *not* recommend
> using random hash seeds per default, but instead advocate using
> a random but fixed seed, or at least document that using random
> seeds that are set during interpreter startup will cause
> problems with repeatability of application runs.

Can't randomization just be applied to integers as well?

Alex




[issue13703] Hash collision security issue

2012-02-06 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

Gregory P. Smith wrote:
> 
> Gregory P. Smith  added the comment:
> 
>>
>>> The release managers have pronounced:
>>> http://mail.python.org/pipermail/python-dev/2012-January/115892.html
>>> Quoting that email:
>>>> 1. Simple hash randomization is the way to go. We think this has the
>>>> best chance of actually fixing the problem while being fairly
>>>> straightforward such that we're comfortable putting it in a stable
>>>> release.
>>>> 2. It will be off by default in stable releases and enabled by an
>>>> envar at runtime. This will prevent code breakage from dictionary
>>>> order changing as well as people depending on the hash stability.
>>
>> Right, but that doesn't contradict what I wrote about adding
>> env vars to fix a seed and optionally enable using a random
>> seed, or adding collision counting as extra protection for
>> cases that are not addressed by the hash seeding, such as
>> e.g. collisions caused by 3rd types or numbers.
> 
> We won't be back-porting anything more than the hash randomization for
> 2.6/2.7/3.1/3.2 but we are free to do more in 3.3 if someone can
> demonstrate it working well and a need for it.
> 
> For me, things like collision counting and tree based collision
> buckets when the types are all the same and known comparable make
> sense but are really sounding like a lot of additional complexity. I'd
> *like* to see active black-box design attack code produced that goes
> after something like a wsgi web app written in Python with hash
> randomization *enabled* to demonstrate the need before we accept
> additional protections like this  for 3.3+.

I posted several examples for the integer collision attack on this
ticket. The current randomization patch does not address this at all,
the collision counting patch does, which is why I think both are
needed.

Note that my comment was more about the desire to *not* recommend
using random hash seeds per default, but instead advocate using
a random but fixed seed, or at least document that using random
seeds that are set during interpreter startup will cause
problems with repeatability of application runs.




[issue13703] Hash collision security issue

2012-02-06 Thread Gregory P. Smith

Gregory P. Smith  added the comment:

>
> > The release managers have pronounced:
> > http://mail.python.org/pipermail/python-dev/2012-January/115892.html
> > Quoting that email:
> >> 1. Simple hash randomization is the way to go. We think this has the
> >> best chance of actually fixing the problem while being fairly
> >> straightforward such that we're comfortable putting it in a stable
> >> release.
> >> 2. It will be off by default in stable releases and enabled by an
> >> envar at runtime. This will prevent code breakage from dictionary
> >> order changing as well as people depending on the hash stability.
>
> Right, but that doesn't contradict what I wrote about adding
> env vars to fix a seed and optionally enable using a random
> seed, or adding collision counting as extra protection for
> cases that are not addressed by the hash seeding, such as
> e.g. collisions caused by 3rd types or numbers.

We won't be back-porting anything more than the hash randomization for
2.6/2.7/3.1/3.2 but we are free to do more in 3.3 if someone can
demonstrate it working well and a need for it.

For me, things like collision counting and tree based collision
buckets when the types are all the same and known comparable make
sense but are really sounding like a lot of additional complexity. I'd
*like* to see active black-box design attack code produced that goes
after something like a wsgi web app written in Python with hash
randomization *enabled* to demonstrate the need before we accept
additional protections like this  for 3.3+.

-gps




[issue13703] Hash collision security issue

2012-02-06 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

Antoine Pitrou wrote:
> 
> Antoine Pitrou  added the comment:
> 
>>> Right, but that doesn't contradict what I wrote about adding
>>> env vars to fix a seed and optionally enable using a random
>>> seed, or adding collision counting as extra protection for
>>> cases that are not addressed by the hash seeding, such as
>>> e.g. collisions caused by 3rd types or numbers.
>>
>> ... at least I hope not :-)
> 
> I think the env var part is a good idea (except that -1 as a magic value
> to enable randomization isn't great).

Agreed. Since it's an env var, using "random" would be a better choice.




[issue13703] Hash collision security issue

2012-02-06 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

> > Right, but that doesn't contradict what I wrote about adding
> > env vars to fix a seed and optionally enable using a random
> > seed, or adding collision counting as extra protection for
> > cases that are not addressed by the hash seeding, such as
> > e.g. collisions caused by 3rd types or numbers.
> 
> ... at least I hope not :-)

I think the env var part is a good idea (except that -1 as a magic value
to enable randomization isn't great).




[issue13703] Hash collision security issue

2012-02-06 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

Marc-Andre Lemburg wrote:
> Dave Malcolm wrote:
>> The release managers have pronounced:
>> http://mail.python.org/pipermail/python-dev/2012-January/115892.html
>> Quoting that email:
>>> 1. Simple hash randomization is the way to go. We think this has the
>>> best chance of actually fixing the problem while being fairly
>>> straightforward such that we're comfortable putting it in a stable
>>> release.
>>> 2. It will be off by default in stable releases and enabled by an
>>> envar at runtime. This will prevent code breakage from dictionary
>>> order changing as well as people depending on the hash stability.
> 
> Right, but that doesn't contradict what I wrote about adding
> env vars to fix a seed and optionally enable using a random
> seed, or adding collision counting as extra protection for
> cases that are not addressed by the hash seeding, such as
> e.g. collisions caused by 3rd types or numbers.

... at least I hope not :-)




[issue13703] Hash collision security issue

2012-02-06 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

Dave Malcolm wrote:
> 
>>> So the overhead in startup time is not an issue?
>>
>> It is an issue. Not only in terms of startup time, but also
>... 
>> because randomization per default makes Python behave in
>> non-deterministic ways - which is not what you want from a
>> programming language or interpreter (unless you explicitly
>> tell it to behave like that).
> 
> The release managers have pronounced:
> http://mail.python.org/pipermail/python-dev/2012-January/115892.html
> Quoting that email:
>> 1. Simple hash randomization is the way to go. We think this has the
>> best chance of actually fixing the problem while being fairly
>> straightforward such that we're comfortable putting it in a stable
>> release.
>> 2. It will be off by default in stable releases and enabled by an
>> envar at runtime. This will prevent code breakage from dictionary
>> order changing as well as people depending on the hash stability.

Right, but that doesn't contradict what I wrote about adding
env vars to fix a seed and optionally enable using a random
seed, or adding collision counting as extra protection for
cases that are not addressed by the hash seeding, such as
e.g. collisions caused by 3rd types or numbers.




[issue13703] Hash collision security issue

2012-02-06 Thread Jim Jewett

Jim Jewett  added the comment:

On Mon, Feb 6, 2012 at 1:53 PM, Frank Sievertsen  wrote:

>>> BTW: If you set the limit N to e.g. 100 (which is reasonable given
>>> Victor's and my tests),

>> So it would take around 3Mb to cause a minute's delay...

> How did you calculate that?

16 bytes/entry * 3300 entries/second * 60 seconds/minute
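Spelled out as a quick check (the variable names are only for illustration):

```python
bytes_per_entry = 16         # input needed to force one more collision run
entries_per_second = 3300    # roughly one run per 0.3 ms
seconds = 60

total_bytes = bytes_per_entry * entries_per_second * seconds
total_mb = total_bytes / 1e6   # about 3.2 million bytes
```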

But if there is indeed a way to cut that 16 bytes/entry, that is worse.

Switching dict implementations at 5 collisions is still acceptable,
except from a complexity standpoint.

-jJ




[issue13703] Hash collision security issue

2012-02-06 Thread Dave Malcolm

Dave Malcolm  added the comment:

On Mon, 2012-02-06 at 10:20 +, Marc-Andre Lemburg wrote:
> Marc-Andre Lemburg  added the comment:
> 
> STINNER Victor wrote:
> > 
> > STINNER Victor  added the comment:
> > 
> >> In a security fix release, we shouldn't change the linkage procedures,
> >> so I recommend that the LoadLibrary dance remains.
> > 
> > So the overhead in startup time is not an issue?
> 
> It is an issue. Not only in terms of startup time, but also

msg152362 indicated that there was negligible impact on startup time
when randomization is disabled.  The impact when it *is* enabled is
unclear, but reported there as "isn't crippling".

> because randomization per default makes Python behave in
> non-deterministic ways - which is not what you want from a
> programming language or interpreter (unless you explicitly
> tell it to behave like that).

The release managers have pronounced:
http://mail.python.org/pipermail/python-dev/2012-January/115892.html
Quoting that email:
> 1. Simple hash randomization is the way to go. We think this has the
> best chance of actually fixing the problem while being fairly
> straightforward such that we're comfortable putting it in a stable
> release.
> 2. It will be off by default in stable releases and enabled by an
> envar at runtime. This will prevent code breakage from dictionary
> order changing as well as people depending on the hash stability.




[issue13703] Hash collision security issue

2012-02-06 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

Jim Jewett wrote:
> 
>> BTW: If you set the limit N to e.g. 100 (which is reasonable given
>> Victor's and my tests),
> 
> Agreed.  Frankly, I think 5 would be more than reasonable so long as
> there is a fallback.
> 
>> the time it takes to process one of those
>> sets only takes 0.3 ms on my machine. That's hardly usable as basis
>> for an effective DoS attack.
> 
> So it would take around 3Mb to cause a minute's delay...

I'm not sure how you calculated that number.

Here's what I get: take a dictionary with 100 integer collisions:
d = dict((x*(2**64 - 1), 1) for x in xrange(1, 100))

The repr(d) has 2713 bytes, which is a good approximation of how
much (string) data you have to send in order to trigger the
problem case.

If you can create 3333 distinct integer sequences, you'll get a
processing time of about 1 second on my slow dev machine. The
resulting dict will likely have a repr() of around
60*3333*2713 = 517MB.

So you need to send 517MB to cause my slow dev machine to consume
1 minute of CPU time. Today's servers are at least 10 times as fast as
my aging machine.
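The 517MB figure is consistent with about 3333 sets per second (one set per 0.3 ms) at 2713 bytes per set, sustained for 60 seconds. A quick check of that reading, with illustrative variable names:

```python
seconds_per_set = 0.3e-3                      # 0.3 ms per 100-collision set
sets_per_second = int(1 / seconds_per_set)    # 3333
bytes_per_set = 2713                          # len(repr(d)) reported above

total_bytes = 60 * sets_per_second * bytes_per_set
total_mib = total_bytes / 2**20               # a little over 517
```

The gap to the earlier 3MB estimate comes entirely from the bytes needed per collision run: 2713 here versus 16 there.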

If you then take into account that the integer collision dictionary
is a very efficient collision example (size vs. effect), the attack
doesn't really sound practical anymore.




[issue13703] Hash collision security issue

2012-02-06 Thread Frank Sievertsen

Frank Sievertsen  added the comment:

> Agreed; it tops out with a constant, but if it takes only 16 bytes of
> input to force another run through a 1000-long collision, that may
> still be too much leverage.

You should prepare the dict so that the collision run is triggered by a one-byte 
string, or better still by an empty string, not by a 16-byte string.

> BTW: If you set the limit N to e.g. 100 (which is reasonable given
> Victor's and my tests),

100 is probably hard to exploit for a DoS attack. However
it makes it much easier to cause unwanted (future?) exceptions in
other apps.

> So it would take around 3Mb to cause a minute's delay...

How did you calculate that?




[issue13703] Hash collision security issue

2012-02-06 Thread Jim Jewett

Jim Jewett  added the comment:

On Mon, Feb 6, 2012 at 12:07 PM, Marc-Andre Lemburg
 wrote:
>
> Marc-Andre Lemburg  added the comment:
>
> Jim Jewett wrote:

>> The problematic case is, roughly,

>> (1)  Find out what N will trigger collision-counting countermeasures.
>> (2)  Insert N-1 colliding entries, to make it as slow as possible.
>> (3)  Keep looking up (or updating) the N-1th entry, so that the
>> slow-as-possible-without-countermeasures path keeps getting rerun.

> Since N is constant, I don't see how such an "attack" could be used
> to trigger the O(n^2) worst-case behavior.

Agreed; it tops out with a constant, but if it takes only 16 bytes of
input to force another run through a 1000-long collision, that may
still be too much leverage.

> BTW: If you set the limit N to e.g. 100 (which is reasonable given
> Victor's and my tests),

Agreed.  Frankly, I think 5 would be more than reasonable so long as
there is a fallback.

> the time it takes to process one of those
> sets only takes 0.3 ms on my machine. That's hardly usable as basis
> for an effective DoS attack.

So it would take around 3Mb to cause a minute's delay...




[issue13703] Hash collision security issue

2012-02-06 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

Jim Jewett wrote:
> 
> Jim Jewett  added the comment:
> 
> On Mon, Feb 6, 2012 at 8:12 AM, Marc-Andre Lemburg
>  wrote:
>>
>> Marc-Andre Lemburg  added the comment:
>>
>> Antoine Pitrou wrote:
>>>
>>> The simple collision counting approach leaves a gaping hole open, as
>>> demonstrated by Frank.
> 
>> Could you elaborate on this ?
> 
>> Note that I've updated the collision counting patch to cover both
>> possible attack cases I mentioned in 
>> http://bugs.python.org/issue13703#msg150724.
>> If there's another case I'm unaware of, please let me know.
> 
> The problematic case is, roughly,
> 
> (1)  Find out what N will trigger collision-counting countermeasures.
> (2)  Insert N-1 colliding entries, to make it as slow as possible.
> (3)  Keep looking up (or updating) the N-1th entry, so that the
> slow-as-possible-without-countermeasures path keeps getting rerun.

Since N is constant, I don't see how such an "attack" could be used
to trigger the O(n^2) worst-case behavior. Even if you can create n sets
of entries that each fill up N-1 positions, the overall performance
will still be O(n*N*(N-1)/2) = O(n).
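The bound can be illustrated with a simplified cost model, assuming each insert into a collision chain compares against every key already in that chain (the function names are hypothetical):

```python
def chain_comparisons(entries):
    """Comparisons needed to insert `entries` keys into one collision chain."""
    # The k-th insert (k = 0, 1, ...) walks past the k keys already present.
    return sum(range(entries))


def attack_cost(n_sets, limit):
    """Total comparisons for n_sets chains, each holding limit - 1 entries."""
    return n_sets * chain_comparisons(limit - 1)
```

For a fixed limit N the per-set cost is a constant below N*(N-1)/2, so doubling the number of sets only doubles the work.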

So in the end, we're talking about a regular brute force DoS attack,
which requires different measures than dictionary implementation
tricks :-)

BTW: If you set the limit N to e.g. 100 (which is reasonable given
Victor's and my tests), the time it takes to process one of those
sets only takes 0.3 ms on my machine. That's hardly usable as basis
for an effective DoS attack.




[issue13703] Hash collision security issue

2012-02-06 Thread Jim Jewett

Jim Jewett  added the comment:

On Mon, Feb 6, 2012 at 8:12 AM, Marc-Andre Lemburg
 wrote:
>
> Marc-Andre Lemburg  added the comment:
>
> Antoine Pitrou wrote:
>>
>> The simple collision counting approach leaves a gaping hole open, as
>> demonstrated by Frank.

> Could you elaborate on this ?

> Note that I've updated the collision counting patch to cover both
> possible attack cases I mentioned in 
> http://bugs.python.org/issue13703#msg150724.
> If there's another case I'm unaware of, please let me know.

The problematic case is, roughly,

(1)  Find out what N will trigger collision-counting countermeasures.
(2)  Insert N-1 colliding entries, to make it as slow as possible.
(3)  Keep looking up (or updating) the N-1th entry, so that the
slow-as-possible-without-countermeasures path keeps getting rerun.
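The three steps can be sketched against a toy model of a single collision chain. CountingChain and TooManyCollisions are hypothetical names, and the model is not the collision counting patch; it only counts key comparisons.

```python
class TooManyCollisions(Exception):
    """Raised when a single chain would reach the collision limit."""


class CountingChain:
    """Toy model of one hash chain with a collision limit."""

    def __init__(self, limit):
        self.limit = limit
        self.keys = []
        self.comparisons = 0

    def _find(self, key):
        for index, existing in enumerate(self.keys):
            self.comparisons += 1
            if existing == key:
                return index
        return -1

    def insert(self, key):
        if self._find(key) < 0:
            if len(self.keys) + 1 >= self.limit:
                raise TooManyCollisions(key)
            self.keys.append(key)

    def lookup(self, key):
        return self._find(key) >= 0


chain = CountingChain(limit=100)   # step (1): the attacker learns N = 100
for key in range(99):              # step (2): insert N-1 colliding entries
    chain.insert(key)
chain.comparisons = 0
for _ in range(1000):              # step (3): keep looking up the last entry
    chain.lookup(98)
```

In this model each lookup of the last entry costs N-1 comparisons and never triggers the countermeasure, which is the leverage being described.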




[issue13703] Hash collision security issue

2012-02-06 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

Antoine Pitrou wrote:
> 
> The simple collision counting approach leaves a gaping hole open, as
> demonstrated by Frank.

Could you elaborate on this ?

Note that I've updated the collision counting patch to cover both
possible attack cases I mentioned in 
http://bugs.python.org/issue13703#msg150724.
If there's another case I'm unaware of, please let me know.




[issue13703] Hash collision security issue

2012-02-06 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

> It is an issue. Not only in terms of startup time, but also
> because randomization per default makes Python behave in
> non-deterministic ways - which is not what you want from a
> programming language or interpreter (unless you explicitly
> tell it to behave like that).

That's debatable. For example id() is fairly unpredictable across runs
(except for statically-allocated instances).

> I think it would be much better to just let the user
> define a hash seed using environment variables for Python
> to use and then forget about how this variable value is
> determined. If it's not set, Python uses 0 as seed, thereby
> disabling the seeding logic.
> 
> This approach would have Python behave in a deterministic way
> per default and still allow users who wish to use a different
> seed, set this to a different value - even on a case by case
> basis.
> 
> If you absolutely want to add a feature to have the seed set
> randomly, you could make a seed value of -1 trigger the use
> of a random number source as seed.

Having both may indeed be a good idea.

> I also still firmly believe that the collision counting scheme
> should be made available via an environment variable as well.
> The user could then set the variable to e.g. 1000 to have it
> enabled with limit 1000, or leave it undefined to disable the
> collision counting.
> 
> With those two tools, users could then choose the method they
> find most attractive for their purposes.

It's not about being attractive, it's about fixing the security problem.
The simple collision counting approach leaves a gaping hole open, as
demonstrated by Frank.




[issue13703] Hash collision security issue

2012-02-06 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

STINNER Victor wrote:
> 
> STINNER Victor  added the comment:
> 
>> In a security fix release, we shouldn't change the linkage procedures,
>> so I recommend that the LoadLibrary dance remains.
> 
> So the overhead in startup time is not an issue?

It is an issue. Not only in terms of startup time, but also
because randomization per default makes Python behave in
non-deterministic ways - which is not what you want from a
programming language or interpreter (unless you explicitly
tell it to behave like that).

I think it would be much better to just let the user
define a hash seed using environment variables for Python
to use and then forget about how this variable value is
determined. If it's not set, Python uses 0 as seed, thereby
disabling the seeding logic.

This approach would have Python behave in a deterministic way
per default and still allow users who wish to use a different
seed, set this to a different value - even on a case by case
basis.

If you absolutely want to add a feature to have the seed set
randomly, you could make a seed value of -1 trigger the use
of a random number source as seed.
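
A minimal sketch of the seed-selection rule described above, written in
Python rather than in the interpreter's C startup code. The variable name
HYPOTHETICAL_HASH_SEED and the helper resolve_hash_seed are assumptions made
for illustration only; the 32-bit bound is likewise assumed, borrowed from the
range given in the documentation patch elsewhere in this thread.

```python
import secrets

MAX_SEED = 0xFFFFFFFF  # assumed 32-bit seed range


def resolve_hash_seed(environ, var="HYPOTHETICAL_HASH_SEED"):
    """Pick a hash seed following the rule sketched in this message.

    unset or empty -> 0 (seeding disabled, deterministic hashes)
    "-1"           -> a seed drawn from a random source
    any other int  -> used as-is
    """
    raw = environ.get(var, "")
    if raw == "":
        return 0
    value = int(raw)
    if value == -1:
        # Draw from 1..MAX_SEED so a random seed never equals the
        # "disabled" value 0.
        return secrets.randbelow(MAX_SEED) + 1
    if not 0 <= value <= MAX_SEED:
        raise ValueError(f"{var} must be -1 or in [0, {MAX_SEED}]")
    return value
```

Passing the environment in as a mapping (for example os.environ) keeps the
rule easy to exercise without touching the real process environment.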

I also still firmly believe that the collision counting scheme
should be made available via an environment variable as well.
The user could then set the variable to e.g. 1000 to have it
enabled with limit 1000, or leave it undefined to disable the
collision counting.

With those two tools, users could then choose the method they
find most attractive for their purposes.

By default, they would be disabled, but applications which are
exposed to untrusted user data and use dictionaries for managing
such data could check whether the protections are enabled and
trigger a startup error if needed.
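
The startup check mentioned above might look roughly like the following
sketch. It assumes the interpreter exposes a hash_randomization field on
sys.flags, as the patches in this thread propose; the function name
require_hash_randomization is hypothetical, and the sketch covers only the
randomization setting, not the collision-counting limit.

```python
import sys


def require_hash_randomization(flags=None):
    """Raise at startup if hash randomization is not active."""
    flags = sys.flags if flags is None else flags
    # getattr with a default: interpreters that predate the flag
    # simply count as unprotected.
    if not getattr(flags, "hash_randomization", 0):
        raise RuntimeError(
            "hash randomization is disabled; "
            "refusing to handle untrusted input"
        )
```

An application exposed to untrusted data would call
require_hash_randomization() once, before it starts accepting input.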

--




[issue13703] Hash collision security issue

2012-02-06 Thread STINNER Victor

STINNER Victor  added the comment:

> In a security fix release, we shouldn't change the linkage procedures,
> so I recommend that the LoadLibrary dance remains.

So the overhead in startup time is not an issue?

--




[issue13703] Hash collision security issue

2012-02-05 Thread Martin v . Löwis

Martin v. Löwis  added the comment:

IIUC, Win9x and NT4 are not supported anymore in any of the target releases of 
the patch, so calling CryptGenRandom should be fine.

In a security fix release, we shouldn't change the linkage procedures, so I 
recommend that the LoadLibrary dance remains.

--




[issue13703] Hash collision security issue

2012-02-01 Thread STINNER Victor

STINNER Victor  added the comment:

It looks like it has not yet been decided whether the CryptGenRandom API or a
weak LCG should be used on Windows. Extract from
add-randomization-to-3.1-dmalcolm-2012-02-01-001.patch:

+#ifdef MS_WINDOWS
+#if 1
+(void)win32_urandom((unsigned char *)secret, secret_size, 0);
+#else
+/* fast but weak RNG (fast initialization, weak seed) */

Does someone know how to link Python to advapi32.dll (on Windows) to avoid 
GetModuleHandle("advapi32.dll")?
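
To make the trade-off between the two branches concrete, here is a rough
Python sketch. It is not the code from the patch: lcg_bytes and strong_bytes
are hypothetical names, and the LCG constants are the widely used
214013/2531011 pair, chosen here only as an assumption for illustration.

```python
import os


def lcg_bytes(seed, size):
    """Fast but weak: every byte is fully determined by the seed."""
    out = bytearray()
    x = seed & 0xFFFFFFFF
    for _ in range(size):
        # One linear congruential step, kept to 32 bits.
        x = (x * 214013 + 2531011) & 0xFFFFFFFF
        out.append((x >> 16) & 0xFF)
    return bytes(out)


def strong_bytes(size):
    """Slower to initialize: bytes come from the OS random source."""
    return os.urandom(size)
```

Anyone who learns or guesses the seed can reproduce the whole lcg_bytes
output, which is why it only qualifies as a weak source for the hash secret.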

--




[issue13703] Hash collision security issue

2012-01-30 Thread Dave Malcolm

Dave Malcolm  added the comment:

Am attaching a backport of 
optin-hash-randomization-for-3.1-dmalcolm-2012-01-30-002.patch to 2.6

Randomization covers the str, unicode and buffer types; equality of hashes is 
preserved for these types.

--
Added file: 
http://bugs.python.org/file24375/optin-hash-randomization-for-2.6-dmalcolm-2012-01-30-001.patch




[issue13703] Hash collision security issue

2012-01-30 Thread Martin

Martin  added the comment:

> Has anyone had a chance to try this patch on Windows?  Martin?  I'm
> hoping that it doesn't impose a startup cost in the default
> no-randomization case, and that any startup cost in the -R case is
> acceptable.

Just tested as requested. Is the patch against 3.1 for a reason? The
numbers can't really be compared to earlier results, but I get enough
weird outliers that a comparison may not be useful anyway. I also needed
the following change:

-+chunk = Py_MIN(size, INT_MAX);
++chunk = size > INT_MAX ? INT_MAX : size;

In summary, it looks like extra work in the default case is avoided and
isn't crippling with -R, though there were a couple of very odd runs
(not presented), probably due to other disk access.

Vanilla:

>timeit PCbuild\python.exe -c "import sys;print(sys.version)"
3.1.4+ (default, Jan 30 2012, 22:38:52) [MSC v.1500 32 bit (Intel)]

Version Number:   Windows NT 5.1 (Build 2600)
Exit Time:        10:42 pm, Monday, January 30 2012
Elapsed Time:     0:00:00.218
Process Time:     0:00:00.187
System Calls:     3974
Context Switches: 574
Page Faults:      1696
Bytes Read:       480331
Bytes Written:    0
Bytes Other:      190860


Patched:

>timeit PCbuild\python.exe -c "import sys;print(sys.version)"
3.1.4+ (default, Jan 30 2012, 22:55:06) [MSC v.1500 32 bit (Intel)]

Version Number:   Windows NT 5.1 (Build 2600)
Exit Time:        10:55 pm, Monday, January 30 2012
Elapsed Time:     0:00:00.218
Process Time:     0:00:00.187
System Calls:     3560
Context Switches: 441
Page Faults:      1660
Bytes Read:       461956
Bytes Written:    0
Bytes Other:      24926


>timeit PCbuild\python.exe -Rc "import sys;print(sys.version)"
3.1.4+ (default, Jan 30 2012, 22:55:06) [MSC v.1500 32 bit (Intel)]

Version Number:   Windows NT 5.1 (Build 2600)
Exit Time:        11:05 pm, Monday, January 30 2012
Elapsed Time:     0:00:00.249
Process Time:     0:00:00.234
System Calls:     3959
Context Switches: 483
Page Faults:      1847
Bytes Read:       892464
Bytes Written:    0
Bytes Other:      27090

--




[issue13703] Hash collision security issue

2012-01-30 Thread Dave Malcolm

Dave Malcolm  added the comment:

I slightly messed up the test_hash.py changes.

Revised patch attached:
  optin-hash-randomization-for-3.1-dmalcolm-2012-01-30-002.patch

--
Added file: 
http://bugs.python.org/file24371/optin-hash-randomization-for-3.1-dmalcolm-2012-01-30-002.patch

diff -r e7706bdaaa0d Doc/library/sys.rst
--- a/Doc/library/sys.rst   Fri Jan 27 09:48:47 2012 +0100
+++ b/Doc/library/sys.rst   Mon Jan 30 17:21:17 2012 -0500
@@ -220,6 +220,7 @@
:const:`ignore_environment`   :option:`-E`
:const:`verbose`  :option:`-v`
:const:`bytes_warning`:option:`-b`
+   :const:`hash_randomization`   :option:`-R`
= =
 
 
diff -r e7706bdaaa0d Doc/reference/datamodel.rst
--- a/Doc/reference/datamodel.rst   Fri Jan 27 09:48:47 2012 +0100
+++ b/Doc/reference/datamodel.rst   Mon Jan 30 17:21:17 2012 -0500
@@ -1265,6 +1265,8 @@
inheritance of :meth:`__hash__` will be blocked, just as if :attr:`__hash__`
had been explicitly set to :const:`None`.
 
+   See also the :option:`-R` command-line option.
+
 
 .. method:: object.__bool__(self)
 
diff -r e7706bdaaa0d Doc/using/cmdline.rst
--- a/Doc/using/cmdline.rst Fri Jan 27 09:48:47 2012 +0100
+++ b/Doc/using/cmdline.rst Mon Jan 30 17:21:17 2012 -0500
@@ -21,7 +21,7 @@
 
 When invoking Python, you may specify any of these options::
 
-python [-bBdEhiOsSuvVWx?] [-c command | -m module-name | script | - ] [args]
+python [-bBdEhiORsSuvVWx?] [-c command | -m module-name | script | - ] [args]
 
 The most common use case is, of course, a simple invocation of a script::
 
@@ -215,6 +215,30 @@
Discard docstrings in addition to the :option:`-O` optimizations.
 
 
+.. cmdoption:: -R
+
+   Turn on "hash randomization, so that the :meth:`__hash__` values of str,
+   bytes and datetime objects are "salted" with an unpredictable random value.
+   Although they remain constant within an individual Python process, they
+   are not predictable between repeated invocations of Python.
+
+   This is intended to provide protection against a denial-of-service
+   caused by carefully-chosen inputs that exploit the worst case performance
+   of a dict lookup, O(n^2) complexity.  See:
+
+   http://www.ocert.org/advisories/ocert-2011-003.html
+
+   for details.
+
+   Changing hash values affects the order in which keys are retrieved from
+   a dict.  Although Python has never made guarantees about this ordering
+   (and it typically varies between 32-bit and 64-bit builds), enough
+   real-world code implicitly relies on this non-guaranteed behavior that
+   the randomization is disabled by default.
+
+   See also :envvar:`PYTHONHASHRANDOMIZATION`.
+
+
 .. cmdoption:: -s
 
Don't add user site directory to sys.path
@@ -435,6 +459,24 @@
import of source modules.
 
 
+.. envvar:: PYTHONHASHRANDOMIZATION
+
+   If this is set to a non-empty string it is equivalent to specifying the
+   :option:`-R` option.
+
+
+.. envvar:: PYTHONHASHSEED
+
+   If this is set, it is used as a fixed seed for generating the hash() of
+   the types covered by the :option:`-R` option (or its equivalent,
+   :envvar:`PYTHONHASHRANDOMIZATION`.
+
+   Its purpose is for use in selftests for the interpreter.
+
+   It should be a decimal number in the range [0; 4294967295].  Specifying
+   the value 0 overrides the other setting, disabling the hash random salt.
+
+
 .. envvar:: PYTHONIOENCODING
 
Overrides the encoding used for stdin/stdout/stderr, in the syntax
diff -r e7706bdaaa0d Include/object.h
--- a/Include/object.h  Fri Jan 27 09:48:47 2012 +0100
+++ b/Include/object.h  Mon Jan 30 17:21:17 2012 -0500
@@ -473,6 +473,12 @@
 PyAPI_FUNC(long) _Py_HashDouble(double);
 PyAPI_FUNC(long) _Py_HashPointer(void*);
 
+typedef struct {
+long prefix;
+long suffix;
+} _Py_HashSecret_t;
+PyAPI_DATA(_Py_HashSecret_t) _Py_HashSecret;
+
 /* Helper for passing objects to printf and the like */
 #define PyObject_REPR(obj) _PyUnicode_AsString(PyObject_Repr(obj))
 
diff -r e7706bdaaa0d Include/pydebug.h
--- a/Include/pydebug.h Fri Jan 27 09:48:47 2012 +0100
+++ b/Include/pydebug.h Mon Jan 30 17:21:17 2012 -0500
@@ -19,6 +19,7 @@
 PyAPI_DATA(int) Py_DontWriteBytecodeFlag;
 PyAPI_DATA(int) Py_NoUserSiteDirectory;
 PyAPI_DATA(int) Py_UnbufferedStdioFlag;
+PyAPI_DATA(int) Py_HashRandomizationFlag;
 
 /* this is a wrapper around getenv() that pays attention to
Py_IgnoreEnvironmentFlag.  It should be used for getting variables like
diff -r e7706bdaaa0d Include/pythonrun.h
--- a/Include/pythonrun.h   Fri Jan 27 09:48:47 2012 +0100
+++ b/Include/pythonrun.h   Mon Jan 30 17:21:17 2012 -0500
@@ -174,6 +174,8 @@
 PyAPI_FUNC(PyOS_sighandler_t) PyOS_getsig(int);
 PyAPI_FUNC(PyOS_sighandler_t) PyOS_setsig(int, PyOS_sighandler_t);
 
+/* Random */
+Py

[issue13703] Hash collision security issue

2012-01-30 Thread Jim Jewett

Jim Jewett  added the comment:

On Mon, Jan 30, 2012 at 12:31 PM,  Dave Malcolm 
added the comment:

> It's useful for the selftests, so I've kept PYTHONHASHSEED.

The reason to read PYTHONHASHSEED was so that multiple members of a
cluster could use the same hash.

It would have been nice to have fewer environment variables, but I'll
grant that it is hard to say "use something random that we have *not*
precomputed" without either a config file or a magic value for
PYTHONHASHSEED.

-jJ

--




[issue13703] Hash collision security issue

2012-01-30 Thread Dave Malcolm

Dave Malcolm  added the comment:

It's useful for the selftests, so I've kept PYTHONHASHSEED.  However,
I've removed it from the man page; the only other place it's mentioned
(in Doc/using/cmdline.rst) I now explicitly say that it exists just to
serve the interpreter's own selftests.
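
As a rough illustration of why a fixed seed helps selftests, the sketch below
runs child interpreters with PYTHONHASHSEED set and compares the resulting
str hashes. It assumes an interpreter that honours PYTHONHASHSEED on its own;
behaviour under the patch attached here may differ. The helper name
str_hash_with_seed is hypothetical.

```python
import os
import subprocess
import sys


def str_hash_with_seed(text, seed):
    """Return hash(text) as computed by a child interpreter
    started with PYTHONHASHSEED=seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c",
         "import sys; print(hash(sys.argv[1]))", text],
        env=env, capture_output=True, text=True, check=True,
    )
    return int(result.stdout)
```

Two runs with the same seed should report the same hash, which is what lets
a test pin down one specific dict ordering and reproduce it later.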

Am attaching a revised patch, which has the above change, plus some
tweaks to Lib/test/test_hash.py (adds test coverage for the datetime
hash randomization):
  optin-hash-randomization-for-3.1-dmalcolm-2012-01-30-001.patch

Has anyone had a chance to try this patch on Windows?  Martin?  I'm
hoping that it doesn't impose a startup cost in the default
no-randomization case, and that any startup cost in the -R case is
acceptable.

--
Added file: 
http://bugs.python.org/file24370/optin-hash-randomization-for-3.1-dmalcolm-2012-01-30-001.patch

diff -r 73dad4940b88 Doc/library/sys.rst
--- a/Doc/library/sys.rst   Fri Jan 20 11:23:02 2012 +
+++ b/Doc/library/sys.rst   Mon Jan 30 12:29:09 2012 -0500
@@ -220,6 +220,7 @@
:const:`ignore_environment`   :option:`-E`
:const:`verbose`  :option:`-v`
:const:`bytes_warning`:option:`-b`
+   :const:`hash_randomization`   :option:`-R`
= =
 
 
diff -r 73dad4940b88 Doc/reference/datamodel.rst
--- a/Doc/reference/datamodel.rst   Fri Jan 20 11:23:02 2012 +
+++ b/Doc/reference/datamodel.rst   Mon Jan 30 12:29:09 2012 -0500
@@ -1265,6 +1265,8 @@
inheritance of :meth:`__hash__` will be blocked, just as if :attr:`__hash__`
had been explicitly set to :const:`None`.
 
+   See also the :option:`-R` command-line option.
+
 
 .. method:: object.__bool__(self)
 
diff -r 73dad4940b88 Doc/using/cmdline.rst
--- a/Doc/using/cmdline.rst Fri Jan 20 11:23:02 2012 +
+++ b/Doc/using/cmdline.rst Mon Jan 30 12:29:09 2012 -0500
@@ -21,7 +21,7 @@
 
 When invoking Python, you may specify any of these options::
 
-python [-bBdEhiOsSuvVWx?] [-c command | -m module-name | script | - ] [args]
+python [-bBdEhiORsSuvVWx?] [-c command | -m module-name | script | - ] [args]
 
 The most common use case is, of course, a simple invocation of a script::
 
@@ -215,6 +215,30 @@
Discard docstrings in addition to the :option:`-O` optimizations.
 
 
+.. cmdoption:: -R
+
+   Turn on "hash randomization, so that the :meth:`__hash__` values of str,
+   bytes and datetime objects are "salted" with an unpredictable random value.
+   Although they remain constant within an individual Python process, they
+   are not predictable between repeated invocations of Python.
+
+   This is intended to provide protection against a denial-of-service
+   caused by carefully-chosen inputs that exploit the worst case performance
+   of a dict lookup, O(n^2) complexity.  See:
+
+   http://www.ocert.org/advisories/ocert-2011-003.html
+
+   for details.
+
+   Changing hash values affects the order in which keys are retrieved from
+   a dict.  Although Python has never made guarantees about this ordering
+   (and it typically varies between 32-bit and 64-bit builds), enough
+   real-world code implicitly relies on this non-guaranteed behavior that
+   the randomization is disabled by default.
+
+   See also :envvar:`PYTHONHASHRANDOMIZATION`.
+
+
 .. cmdoption:: -s
 
Don't add user site directory to sys.path
@@ -435,6 +459,24 @@
import of source modules.
 
 
+.. envvar:: PYTHONHASHRANDOMIZATION
+
+   If this is set to a non-empty string it is equivalent to specifying the
+   :option:`-R` option.
+
+
+.. envvar:: PYTHONHASHSEED
+
+   If this is set, it is used as a fixed seed for generating the hash() of
+   the types covered by the :option:`-R` option (or its equivalent,
+   :envvar:`PYTHONHASHRANDOMIZATION`.
+
+   Its purpose is for use in selftests for the interpreter.
+
+   It should be a decimal number in the range [0; 4294967295].  Specifying
+   the value 0 overrides the other setting, disabling the hash random salt.
+
+
 .. envvar:: PYTHONIOENCODING
 
Overrides the encoding used for stdin/stdout/stderr, in the syntax
diff -r 73dad4940b88 Include/object.h
--- a/Include/object.h  Fri Jan 20 11:23:02 2012 +
+++ b/Include/object.h  Mon Jan 30 12:29:09 2012 -0500
@@ -473,6 +473,12 @@
 PyAPI_FUNC(long) _Py_HashDouble(double);
 PyAPI_FUNC(long) _Py_HashPointer(void*);
 
+typedef struct {
+long prefix;
+long suffix;
+} _Py_HashSecret_t;
+PyAPI_DATA(_Py_HashSecret_t) _Py_HashSecret;
+
 /* Helper for passing objects to printf and the like */
 #define PyObject_REPR(obj) _PyUnicode_AsString(PyObject_Repr(obj))
 
diff -r 73dad4940b88 Include/pydebug.h
--- a/Include/pydebug.h Fri Jan 20 11:23:02 2012 +
+++ b/Include/pydebug.h Mon Jan 30 12:29:09 2012 -0500
@@ -19,6 +19,7 @@
 PyAPI_DATA(int) Py_DontWriteBytecodeFlag;
 PyAPI_DATA(int) Py_NoUserSiteDirectory;

[issue13703] Hash collision security issue

2012-01-30 Thread Martin v . Löwis

Martin v. Löwis  added the comment:

> Rather than the "" empty string for off I suggest an explicit string
> that makes it clear what the meaning is.  PYTHONHASHSEED="disabled"
> perhaps.
> 
> Agreed, if we can have a single env var that is preferred.  It is more
> obvious that the PYTHONHASHSEED env var. has no effect when it is set
> to a special value rather than when it is set to something but it is
> configured to be ignored by a _different_ env var.

I think this is bike-shedding. The requirements for environment
variables are
a) with no variable set, it must not do randomization
b) there must be a way to seed from the platform's RNG
Having an explicit seed actually is no requirement, so I'd propose
to drop PYTHONHASHSEED instead.

However, I really suggest to let the patch author (Dave Malcolm)
design the API within the constraints.

--




[issue13703] Hash collision security issue

2012-01-29 Thread Gregory P. Smith

Gregory P. Smith  added the comment:

> What about PYTHONHASHSEED= -> off, PYTHONHASHSEED=0 -> random,
> PYTHONHASHSEED=n -> n ? I agree with Jim that it's better to have one
> env. variable than two.

Rather than the "" empty string for off I suggest an explicit string
that makes it clear what the meaning is.  PYTHONHASHSEED="disabled"
perhaps.

Agreed, if we can have a single env var that is preferred.  It is more
obvious that the PYTHONHASHSEED env var. has no effect when it is set
to a special value rather than when it is set to something but it is
configured to be ignored by a _different_ env var.

--




[issue13703] Hash collision security issue

2012-01-29 Thread Zbyszek Szmek

Zbyszek Szmek  added the comment:

What about PYTHONHASHSEED= -> off, PYTHONHASHSEED=0 -> random, 
PYTHONHASHSEED=n -> n ? I agree with Jim that it's better to have one 
env. variable than two.

--




[issue13703] Hash collision security issue

2012-01-29 Thread Dave Malcolm

Dave Malcolm  added the comment:

On Sat, 2012-01-28 at 23:56 +, Terry J. Reedy wrote:
> Terry J. Reedy  added the comment:
> 
> > I think you should check with randomization enabled, if only to see the
> > nature of the failures and if they are expected.
> 
> Including the list of when-enabled expected failures in the release 
> notes would help those who compile and test.

Am attaching a patch which fixes various problems that are clearly just
assumptions about dict ordering:
  fix-unittests-broken-by-randomization-dmalcolm-2012-01-29-001.patch

 json/__init__.py                        |  4 +++-
 test/mapping_tests.py                   |  2 +-
 test/test_descr.py                      | 12 +++-
 test/test_urllib.py                     |  4 +++-
 tkinter/test/test_ttk/test_functions.py |  2 +-
 5 files changed, 19 insertions(+), 5 deletions(-)

Here are the issues that it fixes:
Lib/test/test_descr.py: fix for intermittent failure due to dict repr:
  File "Lib/test/test_descr.py", line 4304, in test_repr
self.assertEqual(repr(self.C.__dict__), 
'dict_proxy({!r})'.format(dict_))
AssertionError: "dict_proxy({'__module__': 'test.test_descr', '__dict__': 
, '__doc__': None, '__weakref__': 
, 'meth': })"
 != "dict_proxy({'__module__': 'test.test_descr', '__doc__': 
None, '__weakref__': , 'meth': 
, '__dict__': })"

Lib/json/__init__.py: fix (based on haypo's work) for intermittent failure:
Failed example:
json.dumps([1,2,3,{'4': 5, '6': 7}], separators=(',', ':'))
Expected:
'[1,2,3,{"4":5,"6":7}]'
Got:
'[1,2,3,{"6":7,"4":5}]'

Lib/test/mapping_tests.py: fix (based on haypo's work) for intermittent 
failures of test_collections, test_dict, and test_userdict seen here:
==
ERROR: test_update (__main__.GeneralMappingTests)
--
Traceback (most recent call last):
  File "Lib/test/mapping_tests.py", line 207, in test_update
i1 = sorted(d.items())
TypeError: unorderable types: str() < int()

Lib/test/test_urllib.py: fix (based on haypo's work) for intermittent failure:
==
FAIL: test_nonstring_seq_values (__main__.urlencode_Tests)
--
Traceback (most recent call last):
  File "Lib/test/test_urllib.py", line 844, in test_nonstring_seq_values
urllib.parse.urlencode({"a": {"a": 1, "b": 1}}, True))
AssertionError: 'a=a&a=b' != 'a=b&a=a'
--

Lib/tkinter/test/test_ttk/test_functions.py: fix from haypo's patch for 
intermittent failure:
Traceback (most recent call last):
  File "Lib/tkinter/test/test_ttk/test_functions.py", line 146, in 
test_format_elemcreate
('a', 'b'), a='x', b='y'), ("test a b", ("-a", "x", "-b", "y")))
AssertionError: Tuples differ: ('test a b', ('-b', 'y', '-a',... != ('test 
a b', ('-a', 'x', '-b',...
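
The failures listed above all compare output whose order follows dict
iteration order. One general way to make such checks order-independent is
sketched below; this is an alternative technique, not what the attached patch
does (the patch switches to OrderedDict and to string-only keys). The helper
names compact_json and query_pairs are hypothetical.

```python
import json
import urllib.parse


def compact_json(obj):
    # sort_keys pins the key order, so the text no longer depends
    # on the order in which the dict yields its keys.
    return json.dumps(obj, separators=(",", ":"), sort_keys=True)


def query_pairs(query):
    # Compare the parsed (key, value) pairs as a sorted list
    # instead of comparing the raw query string.
    return sorted(urllib.parse.parse_qsl(query))
```

With these helpers, 'a=a&a=b' and 'a=b&a=a' compare equal once passed through
query_pairs, and the JSON example yields the same text for either key order.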

I see two remaining issues (which this patch doesn't address):
test test_module failed -- Traceback (most recent call last):
  File "Lib/test/test_module.py", line 79, in test_clear_dict_in_ref_cycle
self.assertEqual(destroyed, [1])
AssertionError: Lists differ: [] != [1]

test_multiprocessing
Exception AssertionError: AssertionError() in  ignored

--
Added file: http://bugs.python.org/file24366/unnamed

diff -r 73dad4940b88 Lib/json/__init__.py
--- a/Lib/json/__init__.py	Fri Jan 20 11:23:02 2012 +
+++ b/Lib/json/__init__.py	Sun Jan 29 20:20:43 2012 -0500
@@ -31,7 +31,9 @@
 Compact encoding::
 
 >>> import json
->>> json.dumps([1,2,3,{'4': 5, '6': 7}], separators=(',', ':'))
+>>> from collections import OrderedDict
+>>> mydict = OrderedDict([('4', 5), ('6', 7)])
+>>> json.dumps([1,2,3,mydict], separators=(',', ':'))
 '[1,2,3,{"4":5,"6":7}]'
 
 Pretty printing::
diff -r 73dad4940b88 Lib/test/mapping_tests.py
--- a/Lib/test/mapping_tests.py	Fri Jan 20 11:23:02 2012 +
+++ b/Lib/test/mapping_tests.py	Sun Jan 29 20:20:43 2012 -0500
@@ -14,7 +14,7 @@
 def _reference(self):
 """Return a dictionary of values which are invariant by storage
 in the object under test."""
-return {1:2, "key1":"value1", "key2":(1,2,3)}
+return {"1": "2", "key1":"value1", "key2":(1,2,3)}
 def _empty_mapping(self):
 """Return an empty mapping object"""
 return self.type2test()
diff -r 73dad4940b88 Lib/test/test_descr.py
--- a/Lib/test/test_descr.py	Fri Jan 20 11:23:02 2012 +
+++ b/Lib/test/test_descr.py	Sun Jan 29 20:20:43 2012 -0500
@@ -4300,8 +4300,18 @@
 
 def test_repr(self):
 # Test

[issue13703] Hash collision security issue

2012-01-29 Thread Dave Malcolm

Dave Malcolm  added the comment:

On Sun, 2012-01-29 at 00:06 +, Dave Malcolm wrote:

I went ahead and added the flag to sys.flags, so now
  $ make test TESTPYTHONOPTS=-R
shows:
Testing with flags: sys.flags(debug=0, division_warning=0, inspect=0,
interactive=0, optimize=0, dont_write_bytecode=0, no_user_site=0,
no_site=0, ignore_environment=1, verbose=0, bytes_warning=2,
hash_randomization=1)

...note the:
  hash_randomization=1
at the end of sys.flags.  (This seems useful for making it absolutely
clear if you're getting randomization or not).  Hopefully I'm not
creating too much work for the other Python implementations.

Am attaching new version of patch for 3.1:
  optin-hash-randomization-for-3.1-dmalcolm-2012-01-29-001.patch

--
Added file: 
http://bugs.python.org/file24365/optin-hash-randomization-for-3.1-dmalcolm-2012-01-29-001.patch

diff -r 73dad4940b88 Doc/library/sys.rst
--- a/Doc/library/sys.rst   Fri Jan 20 11:23:02 2012 +
+++ b/Doc/library/sys.rst   Sun Jan 29 20:19:11 2012 -0500
@@ -220,6 +220,7 @@
:const:`ignore_environment`   :option:`-E`
:const:`verbose`  :option:`-v`
:const:`bytes_warning`:option:`-b`
+   :const:`hash_randomization`   :option:`-R`
= =
 
 
diff -r 73dad4940b88 Doc/reference/datamodel.rst
--- a/Doc/reference/datamodel.rst   Fri Jan 20 11:23:02 2012 +
+++ b/Doc/reference/datamodel.rst   Sun Jan 29 20:19:11 2012 -0500
@@ -1265,6 +1265,8 @@
inheritance of :meth:`__hash__` will be blocked, just as if :attr:`__hash__`
had been explicitly set to :const:`None`.
 
+   See also the :option:`-R` command-line option.
+
 
 .. method:: object.__bool__(self)
 
diff -r 73dad4940b88 Doc/using/cmdline.rst
--- a/Doc/using/cmdline.rst Fri Jan 20 11:23:02 2012 +
+++ b/Doc/using/cmdline.rst Sun Jan 29 20:19:11 2012 -0500
@@ -21,7 +21,7 @@
 
 When invoking Python, you may specify any of these options::
 
-python [-bBdEhiOsSuvVWx?] [-c command | -m module-name | script | - ] [args]
+python [-bBdEhiORsSuvVWx?] [-c command | -m module-name | script | - ] [args]
 
 The most common use case is, of course, a simple invocation of a script::
 
@@ -215,6 +215,30 @@
Discard docstrings in addition to the :option:`-O` optimizations.
 
 
+.. cmdoption:: -R
+
+   Turn on "hash randomization, so that the :meth:`__hash__` values of str,
+   bytes and datetime objects are "salted" with an unpredictable random value.
+   Although they remain constant within an individual Python process, they
+   are not predictable between repeated invocations of Python.
+
+   This is intended to provide protection against a denial-of-service
+   caused by carefully-chosen inputs that exploit the worst case performance
+   of a dict lookup, O(n^2) complexity.  See:
+
+   http://www.ocert.org/advisories/ocert-2011-003.html
+
+   for details.
+
+   Changing hash values affects the order in which keys are retrieved from
+   a dict.  Although Python has never made guarantees about this ordering
+   (and it typically varies between 32-bit and 64-bit builds), enough
+   real-world code implicitly relies on this non-guaranteed behavior that
+   the randomization is disabled by default.
+
+   See also :envvar:`PYTHONHASHRANDOMIZATION`.
+
+
 .. cmdoption:: -s
 
Don't add user site directory to sys.path
@@ -435,6 +459,25 @@
import of source modules.
 
 
+.. envvar:: PYTHONHASHRANDOMIZATION
+
+   If this is set to a non-empty string it is equivalent to specifying the
+   :option:`-R` option.
+
+
+.. envvar:: PYTHONHASHSEED
+
+   If this is set, it is used as a fixed seed for generating the hash() of
+   the types covered by the :option:`-R` option (or its equivalent,
+   :envvar:`PYTHONHASHRANDOMIZATION`.
+
+   It is primarily intended for use in selftests for the interpreter, but
+   may perhaps be of use for reproducing a specific dict ordering.
+
+   It should be a decimal number in the range [0; 4294967295].  Specifying
+   the value 0 overrides the other setting, disabling the hash random salt.
+
+
 .. envvar:: PYTHONIOENCODING
 
Overrides the encoding used for stdin/stdout/stderr, in the syntax
diff -r 73dad4940b88 Include/object.h
--- a/Include/object.h  Fri Jan 20 11:23:02 2012 +
+++ b/Include/object.h  Sun Jan 29 20:19:11 2012 -0500
@@ -473,6 +473,12 @@
 PyAPI_FUNC(long) _Py_HashDouble(double);
 PyAPI_FUNC(long) _Py_HashPointer(void*);
 
+typedef struct {
+long prefix;
+long suffix;
+} _Py_HashSecret_t;
+PyAPI_DATA(_Py_HashSecret_t) _Py_HashSecret;
+
 /* Helper for passing objects to printf and the like */
 #define PyObject_REPR(obj) _PyUnicode_AsString(PyObject_Repr(obj))
 
diff -r 73dad4940b88 Include/pydebug.h
--- a/Include/pydebug.h Fri Jan 20 11:23:02 2012 +
+++ b/Include/pydebug.h Sun Jan 29 20:19:11 2012

[issue13703] Hash collision security issue

2012-01-29 Thread Martin v . Löwis

Martin v. Löwis  added the comment:

> Given PYTHONHASHSEED, what is the point of PYTHONHASHRANDOMIZATION?

How would you do what it does without it? I.e. how would you indicate
that it should randomize the seed, rather than fixing the seed value?

> On startup, python reads a config file with the seed (which defaults to zero).

-1 on configuration files that Python reads at startup (let alone in a
bugfix release).

--




[issue13703] Hash collision security issue

2012-01-29 Thread Mark Shannon

Mark Shannon  added the comment:

Barry A. Warsaw wrote:
> Barry A. Warsaw  added the comment:
> 
> On Jan 28, 2012, at 07:26 PM, Dave Malcolm wrote:
> 
>> This turns out to pass without PYTHONHASHRANDOMIZATION in the
>> environment, and fail intermittently with it.
>>
>> Note that "make test" invokes the built python with "-E", so that it
>> ignores the setting of PYTHONHASHRANDOMIZATION in the environment.
>>
>> Barry, Benjamin: does fixing this bug require getting the full test
>> suite to pass with randomization enabled (and fixing the intermittent
>> failures due to ordering issues), or is it acceptable to "merely" have
>> full passes without randomizing the hashes?
> 
> I think we at least need to identify (to the best of our ability) the tests
> that fail and include them in release notes.  If they're easy to fix, we
> should fix them.  Maybe also open a bug report for each failure.

http://bugs.python.org/issue13903 causes even more tests to fail,
so I'm submitting bug reports for most of the failing tests already.

--




[issue13703] Hash collision security issue

2012-01-29 Thread Jim Jewett

Jim Jewett  added the comment:

Given PYTHONHASHSEED, what is the point of PYTHONHASHRANDOMIZATION?

Alternative:

On startup, python reads a config file with the seed (which defaults to zero).

Add a function to write a random value to that config file for the next startup.

--




[issue13703] Hash collision security issue

2012-01-29 Thread Barry A. Warsaw

Barry A. Warsaw  added the comment:

On Jan 28, 2012, at 07:26 PM, Dave Malcolm wrote:

>This turns out to pass without PYTHONHASHRANDOMIZATION in the
>environment, and fail intermittently with it.
>
>Note that "make test" invokes the built python with "-E", so that it
>ignores the setting of PYTHONHASHRANDOMIZATION in the environment.
>
>Barry, Benjamin: does fixing this bug require getting the full test
>suite to pass with randomization enabled (and fixing the intermittent
>failures due to ordering issues), or is it acceptable to "merely" have
>full passes without randomizing the hashes?

I think we at least need to identify (to the best of our ability) the tests
that fail and include them in release notes.  If they're easy to fix, we
should fix them.  Maybe also open a bug report for each failure.

I'm okay though with some tests failing in 2.6 with this environment variable
set.  We needn't go back and fix them in 2.6 (since we're in security-fix only
mode), but I'll bet you'll get almost the same set for 2.7 and there we
*should* fix them, even if it happens after the release.

>What do the buildbots do?

I'm not sure, but as long as the buildbots are green, I'm happy. :)

--




[issue13703] Hash collision security issue

2012-01-28 Thread Dave Malcolm

Dave Malcolm  added the comment:

On Sat, 2012-01-28 at 23:56 +, Terry J. Reedy wrote:
> Terry J. Reedy  added the comment:
> 
> > I think you should check with randomization enabled, if only to see the
> > nature of the failures and if they are expected.
> 
> Including the list of when-enabled expected failures in the release 
> notes would help those who compile and test.

OK, though note that because it's random, I'll have to run it a few
times, and we'll see what shakes out.

Am running with:
  $ make test TESTPYTHONOPTS=-R
leading to:
  ./python -E -bb -R ./Lib/test/regrtest.py -l

BTW, I see:
  Testing with flags: sys.flags(debug=0, division_warning=0, inspect=0,
interactive=0, optimize=0, dont_write_bytecode=0, no_user_site=0,
no_site=0, ignore_environment=1, verbose=0, bytes_warning=2)

which doesn't list the new flag.  Should I add it to sys.flags?  (or
does anyone ever do tuple-unpacking of that PyStructSequence and thus
rely on the number of elements?)
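
To make the concern concrete, a small sketch of reading sys.flags by field name rather than by position, which keeps working if a field is appended. The helper `flag` is hypothetical, and `hash_randomization` is looked up defensively because it is only an assumption that the running interpreter has such a field.

```python
import sys


def flag(name, default=None):
    """Look up a sys.flags field by name instead of by position."""
    return getattr(sys.flags, name, default)


print("ignore_environment:", flag("ignore_environment"))
# Returns None on interpreters whose sys.flags has no such field.
print("hash_randomization:", flag("hash_randomization"))
# Code that unpacks sys.flags into a fixed number of names would break
# whenever this count changes.
print("number of fields:", len(sys.flags))
```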

--




[issue13703] Hash collision security issue

2012-01-28 Thread Terry J. Reedy

Terry J. Reedy  added the comment:

> I think you should check with randomization enabled, if only to see the
> nature of the failures and if they are expected.

Including the list of when-enabled expected failures in the release 
notes would help those who compile and test.

--




[issue13703] Hash collision security issue

2012-01-28 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

> Passes "make test" on this x86_64 Fedora 15 box, --with-pydebug, though
> that's without randomization enabled (it just does it within individual
> test cases that explicitly enable it).

I think you should check with randomization enabled, if only to see the
nature of the failures and if they are expected.

--




[issue13703] Hash collision security issue

2012-01-28 Thread Dave Malcolm

Dave Malcolm  added the comment:

On Sat, 2012-01-28 at 20:05 +, Benjamin Peterson wrote:
> Benjamin Peterson  added the comment:
> 
> I think we don't need to mess with tests in 2.6/3.1, but everything should 
> pass under 2.7 and 3.2.

New version of the patch for 3.1
  optin-hash-randomization-for-3.1-dmalcolm-2012-01-28-001.patch

This version adds a command-line flag to enable hash-randomization: -R
(given that the -E flag disables env vars and thus disabled
PYTHONHASHRANDOMIZATION). See [1] below
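
A rough sketch of why a flag is needed alongside the environment variable: a child interpreter started with -E reports that it is ignoring its environment, so any PYTHON* variable set for it has no effect. The helper name `ignore_environment_in_child` is made up for illustration, and PYTHONHASHSEED is set here only as an example of a variable that -E would skip.

```python
import os
import subprocess
import sys


def ignore_environment_in_child(extra_args):
    """Start a child interpreter and report its sys.flags.ignore_environment."""
    # Copy the current environment and add a PYTHON* variable to it.
    env = dict(os.environ, PYTHONHASHSEED="0")
    code = "import sys; print(int(sys.flags.ignore_environment))"
    result = subprocess.run(
        [sys.executable, *extra_args, "-c", code],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


print("with -E:   ", ignore_environment_in_child(["-E"]))
print("without -E:", ignore_environment_in_child([]))
```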

[Is there a convenient way to check the length of the usage messages in
Modules/main.c?  I see this comment:
   /* Long usage message, split into parts < 512 bytes */ ]

I reworded the documentation somewhat based on input from Barry and
Antoine.

Also adds a NEWS item.

Passes "make test" on this x86_64 Fedora 15 box, --with-pydebug, though
that's without randomization enabled (it just does it within individual
test cases that explicitly enable it).

No performance testing done yet (though hopefully similar to that of
Victor's patch; see msg151078)

No idea of the impact on Windows users (I don't have a Windows dev box).
It still has the stuff from Victor's patch described in msg151158.

How is this looking?
Dave

[1] IRC transcript concerning "-R" follows:
<__ap__> dmalcolm: IMO it would be simpler if there was only one env var
(preferably not too clumsy to type)
<__ap__> also, despite being neither barry nor gutworth, I think the
test suite *should* pass with randomized hashes
<__ap__> :)
<dmalcolm> :)
<__ap__> also the failure you're having is a bit worrying, since
apparently it's not about dict ordering
<dmalcolm> PYTHONHASHSEED exists mostly for selftesting (also for
compat, if you absolutely need to reproduce a specific random dict
ordering)
<__ap__> ok
<__ap__> if -E suppresses hash randomization, I think we should also add
a command-line flag
<__ap__> -R seems untaken
<__ap__> also it'll make things easier for Windows users, I think
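
As an illustration of the "reproduce a specific ordering" use mentioned above, a sketch that runs a child interpreter twice with the same PYTHONHASHSEED and compares the str hash it reports. This assumes an interpreter that honours PYTHONHASHSEED on its own; the helper name `str_hash_in_child` is hypothetical.

```python
import os
import subprocess
import sys


def str_hash_in_child(text, seed):
    """Return hash(text) as computed by a child interpreter with a fixed seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    code = "import sys; print(hash(sys.argv[1]))"
    result = subprocess.run(
        [sys.executable, "-c", code, text],
        env=env, capture_output=True, text=True, check=True,
    )
    return int(result.stdout)


first = str_hash_in_child("abc", 42)
second = str_hash_in_child("abc", 42)
# Same seed, same string: the two child processes should agree.
print(first == second)
```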

--
Added file: 
http://bugs.python.org/file24353/optin-hash-randomization-for-3.1-dmalcolm-2012-01-28-001.patch

diff -r 73dad4940b88 Doc/reference/datamodel.rst
--- a/Doc/reference/datamodel.rst   Fri Jan 20 11:23:02 2012 +
+++ b/Doc/reference/datamodel.rst   Sat Jan 28 18:05:49 2012 -0500
@@ -1265,6 +1265,8 @@
inheritance of :meth:`__hash__` will be blocked, just as if :attr:`__hash__`
had been explicitly set to :const:`None`.
 
+   See also the :option:`-R` command-line option.
+
 
 .. method:: object.__bool__(self)
 
diff -r 73dad4940b88 Doc/using/cmdline.rst
--- a/Doc/using/cmdline.rst Fri Jan 20 11:23:02 2012 +
+++ b/Doc/using/cmdline.rst Sat Jan 28 18:05:49 2012 -0500
@@ -21,7 +21,7 @@
 
 When invoking Python, you may specify any of these options::
 
-python [-bBdEhiOsSuvVWx?] [-c command | -m module-name | script | - ] [args]
+python [-bBdEhiORsSuvVWx?] [-c command | -m module-name | script | - ] [args]
 
 The most common use case is, of course, a simple invocation of a script::
 
@@ -215,6 +215,30 @@
Discard docstrings in addition to the :option:`-O` optimizations.
 
 
+.. cmdoption:: -R
+
+   Turn on "hash randomization", so that the :meth:`__hash__` values of str,
+   bytes and datetime objects are "salted" with an unpredictable random value.
+   Although they remain constant within an individual Python process, they
+   are not predictable between repeated invocations of Python.
+
+   This is intended to provide protection against a denial-of-service
+   caused by carefully-chosen inputs that exploit the worst case performance
+   of a dict construction, O(n^2) complexity.  See:
+
+   http://www.ocert.org/advisories/ocert-2011-003.html
+
+   for details.
+
+   Changing hash values affects the order in which keys are retrieved from
+   a dict.  Although Python has never made guarantees about this ordering
+   (and it typically varies between 32-bit and 64-bit builds), enough
+   real-world code implicitly relies on this non-guaranteed behavior that
+   the randomization is disabled by default.
+
+   See also :envvar:`PYTHONHASHRANDOMIZATION`.
+
+
 .. cmdoption:: -s
 
Don't add user site directory to sys.path
@@ -435,6 +459,25 @@
import of source modules.
 
 
+.. envvar:: PYTHONHASHRANDOMIZATION
+
+   If this is set to a non-empty string it is equivalent to specifying the
+   :option:`-R` option.
+
+
+.. envvar:: PYTHONHASHSEED
+
+   If this is set, it is used as a fixed seed for generating the hash() of
+   the types covered by the :option:`-R` option (or its equivalent,
+   :envvar:`PYTHONHASHRANDOMIZATION`).
+
+   It is primarily intended for use in selftests for the interpreter, but
+   may perhaps be of use for reproducing a specific dict ordering.
+
+   It should be a decimal number in the range [0; 4294967295].  Specifying
+   the value 0 overrides the other setting, disabling the 

[issue13703] Hash collision security issue

2012-01-28 Thread Benjamin Peterson

Benjamin Peterson  added the comment:

I think we don't need to mess with tests in 2.6/3.1, but everything should pass 
under 2.7 and 3.2.

--



