random number generator thread safety

2005-11-08 Thread Mike Brown
I have questions about thread safety in the 'random' module.

When using the random.Random class (be it Mersenne Twister or Wichmann-Hill 
based), is it sufficiently thread-safe (preserving entropy and guarding 
against attack) to just have each thread work with its own random.Random 
instance? Or do I need to wrap my calls to each instance's methods to use 
locks? Wichmann-Hill in particular has the warning in its .random() 
vulnerability; do I need to make an exception for that case?

What about when using the random.SystemRandom class (be it based on 
/dev/urandom or CryptGenRandom based)? Should I be doing anything in my code 
to ensure that /dev/urandom, for example, is polled serially? (I thought I 
read somewhere that this was advisable). SystemRandom claims it has no state, 
which may be true at the Python class level, but is that guaranteed to be true 
for the underlying RNG?

Also, side note: The doc string for random.SystemRandom says Not available on 
all systems (see os.urandom() for details) ...but os.urandom()'s doc string 
provides no details. Actually, os.urandom() and random.SystemRandom are 
available on all systems (they exist and can be called/instantiated), but 
os.urandom() can raise a NotImplementedError (any time it is called, in 
theory), therefore random.SystemRandom's methods can fail with a bubbled-up 
NotImplementedError (on non-win32 OSes, at least). At the very least, this 
should be documented. I'm not in a position to submit a patch, but will submit 
a documentation bug report if so advised.

Thanks,
Mike
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: random number generator thread safety

2005-11-08 Thread Raymond Hettinger
Mike Brown wrote:
 I have questions about thread safety in the 'random' module.

 When using the random.Random class (be it Mersenne Twister or Wichmann-Hill
 based), is it sufficiently thread-safe (preserving entropy and guarding
 against attack) to just have each thread work with its own random.Random
 instance? Or do I need to wrap my calls to each instance's methods to use
 locks? Wichmann-Hill in particular has the warning in its .random()
 vulnerability; do I need to make an exception for that case?

Thread-safety has nothing to do with preserving entropy or guarding
against attack.  All of the entropy in an MT sequence is contained in
the seed (upto 624 bytes) and that entropy is preserved through all
subsequent calls.

Create unique instances of MT generators when you want to be able to
reproduce the sequence later.

Nothing in the random module provides cryptographic guarantees.  If you
want crypto-strength, then use real encryption.  Search SourceForge for
patches that show how to use MD5 and other cryptographic hash functions
as a basis for random number generation.

As far as I'm concerned, there is no reason other than backwards
compatability to use the Wichmann-Hill generator -- the Mersenne
Twister is a much better generator.



Raymond

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: random number generator thread safety

2005-11-08 Thread Paul Rubin
Raymond Hettinger [EMAIL PROTECTED] writes:
 Thread-safety has nothing to do with preserving entropy or guarding
 against attack.  All of the entropy in an MT sequence is contained in
 the seed (upto 624 bytes) and that entropy is preserved through all
 subsequent calls.

I think the concern is that there can be a thread switch after some
intermediate result is computed but before the state is updated.  That
would mean two threads can get random numbers that are identical or
anyway correlated.  Whether that can happen with Python's MT, I don't
know.

 Nothing in the random module provides cryptographic guarantees.  If you
 want crypto-strength, then use real encryption.  Search SourceForge for
 patches that show how to use MD5 and other cryptographic hash functions
 as a basis for random number generation.

Or use os.urandom.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: random number generator thread safety

2005-11-08 Thread Raymond Hettinger
  Thread-safety has nothing to do with preserving entropy or guarding
  against attack.  All of the entropy in an MT sequence is contained in
  the seed (upto 624 bytes) and that entropy is preserved through all
  subsequent calls.

 I think the concern is that there can be a thread switch after some
 intermediate result is computed but before the state is updated.

The concern expressed by the OP was preserving entropy or guarding
against attack.

  That
 would mean two threads can get random numbers that are identical or
 anyway correlated.  Whether that can happen with Python's MT, I don't
 know.

It can't.  That is what the docs mean when they say that MT is
thread-safe.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: random number generator thread safety

2005-11-08 Thread Mike Brown
Raymond Hettinger wrote:
 Mike Brown wrote:
  I have questions about thread safety in the 'random' module.
 
  When using the random.Random class (be it Mersenne Twister or Wichmann-Hill
  based), is it sufficiently thread-safe (preserving entropy and guarding
  against attack) to just have each thread work with its own random.Random
  instance? Or do I need to wrap my calls to each instance's methods to use
  locks? Wichmann-Hill in particular has the warning in its .random()
  vulnerability; do I need to make an exception for that case?
 
 Thread-safety has nothing to do with preserving entropy or guarding
 against attack.

Well, the Wichmann-Hill implementation claims to be not thread safe but 
really, as Paul Rubin pointed out, it's about the risk (depending on the 
implementation) that when two threads have access to the same RNG, there is a 
loss of confidence in the 'randomness' of the results that each thread sees... 
e.g., can thread 2 manipulate the state of the RNG while thread 1 is using it? 
and can thread 2 see the same result as thread 1?

I think both of these situations are alleviated just by using separate RNG 
instances, but I don't know enough to know if I should still be doing some
blocking when calling some functions  methods, such as os.urandom and
random.SystemRandom's methods.

 Nothing in the random module provides cryptographic guarantees.

I don't need cryptographic guarantees. I just need to expose some RNG 
functions in a multithreaded application that supports Python 2.2 and up 
(hence the concern about WH).

[ I also intend to use stdlib and will work around whatever bugs/issues I know 
about, such as the undesirable NotImplementedError (i.e., I will fall back on 
Mersenne Twister if os.urandom or SystemRandom methods fail); the WH 'thread 
safety' issue (which affects my Py 2.2 users); the Py 2.3.0 'no milliseconds 
in default seed' issue (I will make sure whatever RNGs are available are 
better seeded by default); the Py 2.4.0-2.4.1 posix os.urandom filehandle 
caching/hold-open issue (I will use 2.4.2's os.urandom); and anything else 
that I should be concerned about. Since I'll be reimplementing os.urandom, 
I'll be making use of it wherever I can in 2.2-2.3, including as the preferred 
source over WH  MT, and as a better seed source than system clock. ]

I am asking for advice on how to mitigate the risks related to multithreading 
-- I don't the functions to be any less 'random' or more vulnerable than they 
would be in a single-threaded app. I don't know enough about the subject to 
know for sure whether the answer to should I block when polling a PRNG is 
no, or whether the answer to should I block when polling a system RNG 
(likely a PRNG seeded often from a hash of multiple sources) would be 
different. That's why I am asking here.

The random module's docs suggest that I'll want to mitigate such risks by 
using my own instance(s) of random.Random, and I'm asking here if that alone 
is going to be sufficient -- I think that in Py 2.2 it will keep me from 
having to block when calling .random(), right? -- or if there are some risks 
that using separate random.Random instances doesn't take care of, and that 
would thus require me to use locks on more of my calls. I'm also asking 
whether the same answers would apply to the use of random.SystemRandom and 
os.urandom(); e.g., even if I don't need to block when calling random.Random 
methods, might it still be a good idea to block when accessing 
systemRandom/urandom?

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list