random number generator thread safety
I have questions about thread safety in the 'random' module. When using the random.Random class (be it Mersenne Twister or Wichmann-Hill based), is it sufficiently thread-safe (preserving entropy and guarding against attack) to just have each thread work with its own random.Random instance? Or do I need to wrap my calls to each instance's methods to use locks? Wichmann-Hill in particular has the warning in its .random() vulnerability; do I need to make an exception for that case? What about when using the random.SystemRandom class (be it based on /dev/urandom or CryptGenRandom based)? Should I be doing anything in my code to ensure that /dev/urandom, for example, is polled serially? (I thought I read somewhere that this was advisable). SystemRandom claims it has no state, which may be true at the Python class level, but is that guaranteed to be true for the underlying RNG? Also, side note: The doc string for random.SystemRandom says Not available on all systems (see os.urandom() for details) ...but os.urandom()'s doc string provides no details. Actually, os.urandom() and random.SystemRandom are available on all systems (they exist and can be called/instantiated), but os.urandom() can raise a NotImplementedError (any time it is called, in theory), therefore random.SystemRandom's methods can fail with a bubbled-up NotImplementedError (on non-win32 OSes, at least). At the very least, this should be documented. I'm not in a position to submit a patch, but will submit a documentation bug report if so advised. Thanks, Mike -- http://mail.python.org/mailman/listinfo/python-list
Re: random number generator thread safety
Mike Brown wrote: I have questions about thread safety in the 'random' module. When using the random.Random class (be it Mersenne Twister or Wichmann-Hill based), is it sufficiently thread-safe (preserving entropy and guarding against attack) to just have each thread work with its own random.Random instance? Or do I need to wrap my calls to each instance's methods to use locks? Wichmann-Hill in particular has the warning in its .random() vulnerability; do I need to make an exception for that case? Thread-safety has nothing to do with preserving entropy or guarding against attack. All of the entropy in an MT sequence is contained in the seed (upto 624 bytes) and that entropy is preserved through all subsequent calls. Create unique instances of MT generators when you want to be able to reproduce the sequence later. Nothing in the random module provides cryptographic guarantees. If you want crypto-strength, then use real encryption. Search SourceForge for patches that show how to use MD5 and other cryptographic hash functions as a basis for random number generation. As far as I'm concerned, there is no reason other than backwards compatability to use the Wichmann-Hill generator -- the Mersenne Twister is a much better generator. Raymond -- http://mail.python.org/mailman/listinfo/python-list
Re: random number generator thread safety
Raymond Hettinger [EMAIL PROTECTED] writes: Thread-safety has nothing to do with preserving entropy or guarding against attack. All of the entropy in an MT sequence is contained in the seed (upto 624 bytes) and that entropy is preserved through all subsequent calls. I think the concern is that there can be a thread switch after some intermediate result is computed but before the state is updated. That would mean two threads can get random numbers that are identical or anyway correlated. Whether that can happen with Python's MT, I don't know. Nothing in the random module provides cryptographic guarantees. If you want crypto-strength, then use real encryption. Search SourceForge for patches that show how to use MD5 and other cryptographic hash functions as a basis for random number generation. Or use os.urandom. -- http://mail.python.org/mailman/listinfo/python-list
Re: random number generator thread safety
Thread-safety has nothing to do with preserving entropy or guarding against attack. All of the entropy in an MT sequence is contained in the seed (upto 624 bytes) and that entropy is preserved through all subsequent calls. I think the concern is that there can be a thread switch after some intermediate result is computed but before the state is updated. The concern expressed by the OP was preserving entropy or guarding against attack. That would mean two threads can get random numbers that are identical or anyway correlated. Whether that can happen with Python's MT, I don't know. It can't. That is what the docs mean when they say that MT is thread-safe. -- http://mail.python.org/mailman/listinfo/python-list
Re: random number generator thread safety
Raymond Hettinger wrote: Mike Brown wrote: I have questions about thread safety in the 'random' module. When using the random.Random class (be it Mersenne Twister or Wichmann-Hill based), is it sufficiently thread-safe (preserving entropy and guarding against attack) to just have each thread work with its own random.Random instance? Or do I need to wrap my calls to each instance's methods to use locks? Wichmann-Hill in particular has the warning in its .random() vulnerability; do I need to make an exception for that case? Thread-safety has nothing to do with preserving entropy or guarding against attack. Well, the Wichmann-Hill implementation claims to be not thread safe but really, as Paul Rubin pointed out, it's about the risk (depending on the implementation) that when two threads have access to the same RNG, there is a loss of confidence in the 'randomness' of the results that each thread sees... e.g., can thread 2 manipulate the state of the RNG while thread 1 is using it? and can thread 2 see the same result as thread 1? I think both of these situations are alleviated just by using separate RNG instances, but I don't know enough to know if I should still be doing some blocking when calling some functions methods, such as os.urandom and random.SystemRandom's methods. Nothing in the random module provides cryptographic guarantees. I don't need cryptographic guarantees. I just need to expose some RNG functions in a multithreaded application that supports Python 2.2 and up (hence the concern about WH). [ I also intend to use stdlib and will work around whatever bugs/issues I know about, such as the undesirable NotImplementedError (i.e., I will fall back on Mersenne Twister if os.urandom or SystemRandom methods fail); the WH 'thread safety' issue (which affects my Py 2.2 users); the Py 2.3.0 'no milliseconds in default seed' issue (I will make sure whatever RNGs are available are better seeded by default); the Py 2.4.0-2.4.1 posix os.urandom filehandle caching/hold-open issue (I will use 2.4.2's os.urandom); and anything else that I should be concerned about. Since I'll be reimplementing os.urandom, I'll be making use of it wherever I can in 2.2-2.3, including as the preferred source over WH MT, and as a better seed source than system clock. ] I am asking for advice on how to mitigate the risks related to multithreading -- I don't the functions to be any less 'random' or more vulnerable than they would be in a single-threaded app. I don't know enough about the subject to know for sure whether the answer to should I block when polling a PRNG is no, or whether the answer to should I block when polling a system RNG (likely a PRNG seeded often from a hash of multiple sources) would be different. That's why I am asking here. The random module's docs suggest that I'll want to mitigate such risks by using my own instance(s) of random.Random, and I'm asking here if that alone is going to be sufficient -- I think that in Py 2.2 it will keep me from having to block when calling .random(), right? -- or if there are some risks that using separate random.Random instances doesn't take care of, and that would thus require me to use locks on more of my calls. I'm also asking whether the same answers would apply to the use of random.SystemRandom and os.urandom(); e.g., even if I don't need to block when calling random.Random methods, might it still be a good idea to block when accessing systemRandom/urandom? -Mike -- http://mail.python.org/mailman/listinfo/python-list