Bugs item #1177468, was opened at 2005-04-05 21:03 Message generated for change (Comment added) made by gvanrossum You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1177468&group_id=5470
Category: Python Library Group: Python 2.4 Status: Open Resolution: None Priority: 5 Submitted By: Fazal Majid (majid) Assigned to: Nobody/Anonymous (nobody) Summary: random.py/os.urandom robustness Initial Comment: Python 2.4.1 now uses os.urandom() to seed the random number generator. This is mostly an improvement, but can lead to subtle regression bugs. os.urandom() will open /dev/urandom on demand, e.g. when random.Random.seed() is called, and keep it alive as os._urandomfd. It is standard programming practice for a daemon process to close file descriptors it has inherited from its parent process, and if it closes the file descriptor corresponding to os._urandomfd, the os module is blissfully unaware and the next time os.urandom() is called, it will try to read from a closed file descriptor (or worse, a new one opened since), with unpredictable results. My recommendation would be to make os.urandom() open /dev/urandom each time and not keep a persistent file descripto. This will be slightly slower, but more robust. I am not sure how I feel about a standard library function steal a file descriptor slot forever, specially when os.urandom() is probably going to be called only once in the lifetime of a program, when the random module is seeded. ---------------------------------------------------------------------- >Comment By: Guido van Rossum (gvanrossum) Date: 2005-04-19 16:05 Message: Logged In: YES user_id=6380 I recommend to close this as invalid. The daemonization code is clearly broken. ---------------------------------------------------------------------- Comment By: Fazal Majid (majid) Date: 2005-04-19 13:49 Message: Logged In: YES user_id=110477 Modifying the system os.py is not a good idea. A better work-around is to skip the /dev/urandom fd when you are closing all fds. This is the code we use: def close_fd(): # close all inherited file descriptors start_fd = 3 # Python 2.4.1 and later use /dev/urandom to seed the random module's RNG # a file descriptor is kept internally as os._urandomfd (created on demand # the first time os.urandom() is called), and should not be closed try: os.urandom(4) urandom_fd = getattr(os, '_urandomfd', None) except AttributeError: urandom_fd = None if '-close_fd' in sys.argv: start_fd = int(sys.argv[sys.argv.index('-close_fd') + 1]) for fd in range(start_fd, 256): if fd == urandom_fd: continue try: os.close(fd) except OSError: pass ---------------------------------------------------------------------- Comment By: Luis P Caamano (lcaamano) Date: 2005-04-19 12:53 Message: Logged In: YES user_id=279987 We're facing this problem. We're thinking of patching our os.py module to always open /dev/urandom on every call. Does anybody know if this would have any bad consequences other than the obvious system call overhead? BTW, here's the traceback we get. As you probably can guess, something called os.urandom before we closed all file descriptors in the daemonizing code and it then failed when os.urandom tried to use the cached fd. Traceback (most recent call last): File "/opt/race/share/sw/common/bin/dpmd", line 27, in ? dpmd().run() File "Linux/CommandLineApp.py", line 336, in run File "Linux/daemonbase.py", line 324, in main File "Linux/server.py", line 61, in addServices File "Linux/dpmd.py", line 293, in __init__ File "Linux/threadutils.py", line 44, in start File "Linux/xmlrpcd.py", line 165, in createThread File "Linux/threadutils.py", line 126, in __init__ File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/t empfile.py", line 423, in NamedTemporaryFile dir = gettempdir() File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/t empfile.py", line 262, in gettempdir tempdir = _get_default_tempdir() File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/t empfile.py", line 185, in _get_default_tempdir namer = _RandomNameSequence() File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/t empfile.py", line 121, in __init__ self.rng = _Random() File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/r andom.py", line 96, in __init__ self.seed(x) File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/r andom.py", line 110, in seed a = long(_hexlify(_urandom(16)), 16) File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/ os.py", line 728, in urandom bytes += read(_urandomfd, n - len(bytes)) OSError : [Errno 9] Bad file descriptor ---------------------------------------------------------------------- Comment By: Sean Reifschneider (jafo) Date: 2005-04-06 05:04 Message: Logged In: YES user_id=81797 The child is a copy of the parent. Therefore, if in the parent you open a few file descriptors, those are the ones you should close in the child. That is exactly what I've done in the past when I forked a child, and it has worked very well. I suspect Stevens would make an exception to his guideline in the event that closing a file descriptor results in library routine failures. ---------------------------------------------------------------------- Comment By: Fazal Majid (majid) Date: 2005-04-06 03:27 Message: Logged In: YES user_id=110477 Unfortunately, catching exceptions is not sufficient - the file descriptor may have been reassigned. Fortunately in my case, to a socket which raised ENOSYS, but if it had been a normal file, this would have been much harder to trace because reading from it would cause weird errors for readers of the reassigned fd without triggering an exception in os.urandom() itself. As for not closing file descriptors you haven't opened yourself, if the process is the result of a vfork/exec (in my case Python processes started by a cluster manager, sort of like init), the child process has no clue what file descriptors, sockets or the like it has inherited from its parent process, and the safest course is to close them all. Indeed, that's what W. Richard Stevens recommends in "Advanced Programming for the UNIX environment". As far as I can tell, os.urandom() is used mostly to seed the RNG in random.py, and thus is not a high-drequency operation. It is going to be very hard to document this adequately for coders to defend against - in my case, the problem was being triggered by os.tempnam() from within Webware's PSP compiler. There are so many functions that depend on random (sometimes in non-obvious ways), you can't flag them all so their users know they should use urandomcleanup. One possible solution would be for os.py to offer a go_daemon() function that implements the fd closing, signal masking, process group and terminal disassociation required by true daemons. This function could take care of internal book-keeping like calling urandomcleanup. ---------------------------------------------------------------------- Comment By: Sean Reifschneider (jafo) Date: 2005-04-06 03:20 Message: Logged In: YES user_id=81797 Yeah, I was thinking the same thing. It doesn't address the consumed file handle, but it does address the "robustness" issue. It complicates the code, but should work. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2005-04-06 03:11 Message: Logged In: YES user_id=21627 To add robustness, it would be possible to catch read errors from _urandomfd, and try reopening it if it got somehow closed. ---------------------------------------------------------------------- Comment By: Sean Reifschneider (jafo) Date: 2005-04-05 23:11 Message: Logged In: YES user_id=81797 Just providing some feedback: I'm able to reproduce this. Importing random will cause this file descriptor to be called. Opening urandom on every call could lead to unacceptable syscall overhead for some. Perhaps there should be a "urandomcleanup" method that closes the file descriptor, and then random could get the bytes from urandom(), and clean up after itself? Personally, I only clean up the file descriptors I have allocated when I fork a new process. On the one hand I agree with you about sucking up a fd in the standard library, but on the other hand I'm thinking that you just shouldn't be closing file descriptors for stuff you'll be needing. That's my two cents on this bug. ---------------------------------------------------------------------- Comment By: Fazal Majid (majid) Date: 2005-04-05 21:06 Message: Logged In: YES user_id=110477 There are many modules that have a dependency on random, for instance os.tempnam(), and a program could well inadvertently use it before closing file descriptors. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1177468&group_id=5470 _______________________________________________ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com