Bugs item #1177468, was opened at 2005-04-06 01:03 Message generated for change (Settings changed) made by gbrandl You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1177468&group_id=5470
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Python Library Group: Python 2.4 Status: Closed Resolution: Accepted Priority: 5 Submitted By: Fazal Majid (majid) >Assigned to: Georg Brandl (gbrandl) Summary: random.py/os.urandom robustness Initial Comment: Python 2.4.1 now uses os.urandom() to seed the random number generator. This is mostly an improvement, but can lead to subtle regression bugs. os.urandom() will open /dev/urandom on demand, e.g. when random.Random.seed() is called, and keep it alive as os._urandomfd. It is standard programming practice for a daemon process to close file descriptors it has inherited from its parent process, and if it closes the file descriptor corresponding to os._urandomfd, the os module is blissfully unaware and the next time os.urandom() is called, it will try to read from a closed file descriptor (or worse, a new one opened since), with unpredictable results. My recommendation would be to make os.urandom() open /dev/urandom each time and not keep a persistent file descripto. This will be slightly slower, but more robust. I am not sure how I feel about a standard library function steal a file descriptor slot forever, specially when os.urandom() is probably going to be called only once in the lifetime of a program, when the random module is seeded. ---------------------------------------------------------------------- Comment By: Georg Brandl (birkenfeld) Date: 2005-07-04 17:17 Message: Logged In: YES user_id=1188172 Committed as Lib/os.py r1.87, r1.83.2.3. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2005-07-04 15:54 Message: Logged In: YES user_id=21627 The patch is fine, please apply - both to the trunk and to 2.4. ---------------------------------------------------------------------- Comment By: Raymond Hettinger (rhettinger) Date: 2005-07-04 14:56 Message: Logged In: YES user_id=80475 I'm on Windows so cannot be of much use on the patch review. It looks fine to me and I agree that something ought to be done. MvL reviewed and posted the original patch so he may be better able to comment on this patch. Alternatively, Peter A. can review/approve it. ---------------------------------------------------------------------- Comment By: Georg Brandl (birkenfeld) Date: 2005-07-04 14:26 Message: Logged In: YES user_id=1188172 Attaching a patch for Lib/os.py, fixing this on Unix. On Windows, a completely different method is used for urandom, so I think it is not relevant here. Please review. ---------------------------------------------------------------------- Comment By: Peter Åstrand (astrand) Date: 2005-07-04 13:01 Message: Logged In: YES user_id=344921 This bug is a major problem for us as well. This bug also breaks the subprocess module. Try, for example: subprocess.Popen(["ls"], close_fds=1, preexec_fn=lambda: os.urandom(4)) I agree with lcaamano; the library should NOT cache a file descriptor by default. Correctness and robustness is more important than speed. Has anyone really been able to verify that the performance with opening /dev/urandom each time is a problem? If it is, we could add a new function to the os module that activates the fd caching. If you have been calling this function, you have indicated that you are aware of the problem and will not close the cached fd. Legacy code will continue to function. ---------------------------------------------------------------------- Comment By: pocket_aces (pocket_aces) Date: 2005-06-08 22:10 Message: Logged In: YES user_id=1293510 We ran across this problem when we upgraded to 2.4. We use python embedded in a multi-threaded C++ process and use multiple subinterpreters. When a subinterpreter shuts down and the os module unloads, os._urandomfd is not closed because it is not a file object but rather just an integer. As such, after a while, our process had hundreds of dangling open file descriptors to /dev/urandom. I would think, at the very least, if this were a file object, it would be closed when the module was unloaded (the deallocator for fileobject closes the file). However, that doesn't make it any easier for those who are forking processes. Probably the best bet is to close it after reading the data. If you need a "high performance, multiple seek" urandom, just open /dev/urandom yourself. Either way, this bug is not invalid and needs to be addressed. My 2 cents.. --J ---------------------------------------------------------------------- Comment By: Luis P Caamano (lcaamano) Date: 2005-04-20 13:17 Message: Logged In: YES user_id=279987 Here's a reference: http://tinyurl.com/b8mk3 The relevant post: ============================================ On 25 Feb 2001 10:48:22 GMT Casper H.S. Dik - Network Security Engineer <[EMAIL PROTECTED]> wrote: | Solaris at various times used a cached /dev/zero fd both for mapping | thread stacks and even one for the runtime linker. | The runtime linker was mostly fine, but the thread library did have | problems with people closing fds. We since added MAP_ANON and no | longer require open("/dev/zero") . THe caaching of fds was gotten | rid of before that. | | There are valid reasons to close all fds; e.g., if you really don't | want to inherit and (you're a daemon and don't care). | | In most cases, though, the "close all" stuff performed by shells | and such at statup serves no purpose. (Other than causing more bugs ) So the dilemma is that closing fds can cause problems and leaving them open can cause problems, when a forked child does this. This seems to tell me that hiding fds in libraries and objects is a bad idea because processes need to know what is safe to close and/or what needs to be left open. ====================================== If the python library had some module or global list of opened file descriptors, then it would be OK to expect programs to keep those open across fork/exec. Something like: from os import opened_fds And then it would be no problem to skip those when closing fds. Otherwise, your nice daemon code that deals with _urandom_fd will break later on when somebody caches another fd somewhere else in the standard library. Also, the proposed os.daemonize() function that knows about its own fds would also work. Still, the most robust solution is not to cache open fds in the library or perhaps catch the EBADF exception and reopen. There are several solutions but closing this bug as invalid doesn't seem an appropriate one. ---------------------------------------------------------------------- Comment By: Luis P Caamano (lcaamano) Date: 2005-04-20 12:04 Message: Logged In: YES user_id=279987 I don't think so. One of the rules in libc, the standard C library, is that it cannot cache file descriptors for that same reason. This is not new. ---------------------------------------------------------------------- Comment By: Sean Reifschneider (jafo) Date: 2005-04-20 02:49 Message: Logged In: YES user_id=81797 Conversely, I would say that it's unreasonable to expect other things not to break if you go through and close file descriptors that the standard library has opened. ---------------------------------------------------------------------- Comment By: Luis P Caamano (lcaamano) Date: 2005-04-20 02:31 Message: Logged In: YES user_id=279987 Clearly broken? Hardly. Daemonization code is not the only place where it's recommend and standard practice to close file descriptors. It's unreasonable to expect python programs to keep track of all the possible file descriptors the python library might cache to make sure it doesn't close them in all the daemonization routines ... btw, contrary to standard unix programming practices. Are there any other file descriptors we should know about? ---------------------------------------------------------------------- Comment By: Sean Reifschneider (jafo) Date: 2005-04-19 22:27 Message: Logged In: YES user_id=81797 Perhaps the best way to resolve this would be for the standard library to provide code that either does the daemonize process, or at least does the closing of the sockets that may be done as part of the daemonize, that way it's clear what the "right" way is to do it. Thoughts? ---------------------------------------------------------------------- Comment By: Guido van Rossum (gvanrossum) Date: 2005-04-19 20:05 Message: Logged In: YES user_id=6380 I recommend to close this as invalid. The daemonization code is clearly broken. ---------------------------------------------------------------------- Comment By: Fazal Majid (majid) Date: 2005-04-19 17:49 Message: Logged In: YES user_id=110477 Modifying the system os.py is not a good idea. A better work-around is to skip the /dev/urandom fd when you are closing all fds. This is the code we use: def close_fd(): # close all inherited file descriptors start_fd = 3 # Python 2.4.1 and later use /dev/urandom to seed the random module's RNG # a file descriptor is kept internally as os._urandomfd (created on demand # the first time os.urandom() is called), and should not be closed try: os.urandom(4) urandom_fd = getattr(os, '_urandomfd', None) except AttributeError: urandom_fd = None if '-close_fd' in sys.argv: start_fd = int(sys.argv[sys.argv.index('-close_fd') + 1]) for fd in range(start_fd, 256): if fd == urandom_fd: continue try: os.close(fd) except OSError: pass ---------------------------------------------------------------------- Comment By: Luis P Caamano (lcaamano) Date: 2005-04-19 16:53 Message: Logged In: YES user_id=279987 We're facing this problem. We're thinking of patching our os.py module to always open /dev/urandom on every call. Does anybody know if this would have any bad consequences other than the obvious system call overhead? BTW, here's the traceback we get. As you probably can guess, something called os.urandom before we closed all file descriptors in the daemonizing code and it then failed when os.urandom tried to use the cached fd. Traceback (most recent call last): File "/opt/race/share/sw/common/bin/dpmd", line 27, in ? dpmd().run() File "Linux/CommandLineApp.py", line 336, in run File "Linux/daemonbase.py", line 324, in main File "Linux/server.py", line 61, in addServices File "Linux/dpmd.py", line 293, in __init__ File "Linux/threadutils.py", line 44, in start File "Linux/xmlrpcd.py", line 165, in createThread File "Linux/threadutils.py", line 126, in __init__ File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/t empfile.py", line 423, in NamedTemporaryFile dir = gettempdir() File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/t empfile.py", line 262, in gettempdir tempdir = _get_default_tempdir() File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/t empfile.py", line 185, in _get_default_tempdir namer = _RandomNameSequence() File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/t empfile.py", line 121, in __init__ self.rng = _Random() File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/r andom.py", line 96, in __init__ self.seed(x) File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/r andom.py", line 110, in seed a = long(_hexlify(_urandom(16)), 16) File "/opt/race/share/sw/os/Linux_2.4_i686/python/lib/python2.4/ os.py", line 728, in urandom bytes += read(_urandomfd, n - len(bytes)) OSError : [Errno 9] Bad file descriptor ---------------------------------------------------------------------- Comment By: Sean Reifschneider (jafo) Date: 2005-04-06 09:04 Message: Logged In: YES user_id=81797 The child is a copy of the parent. Therefore, if in the parent you open a few file descriptors, those are the ones you should close in the child. That is exactly what I've done in the past when I forked a child, and it has worked very well. I suspect Stevens would make an exception to his guideline in the event that closing a file descriptor results in library routine failures. ---------------------------------------------------------------------- Comment By: Fazal Majid (majid) Date: 2005-04-06 07:27 Message: Logged In: YES user_id=110477 Unfortunately, catching exceptions is not sufficient - the file descriptor may have been reassigned. Fortunately in my case, to a socket which raised ENOSYS, but if it had been a normal file, this would have been much harder to trace because reading from it would cause weird errors for readers of the reassigned fd without triggering an exception in os.urandom() itself. As for not closing file descriptors you haven't opened yourself, if the process is the result of a vfork/exec (in my case Python processes started by a cluster manager, sort of like init), the child process has no clue what file descriptors, sockets or the like it has inherited from its parent process, and the safest course is to close them all. Indeed, that's what W. Richard Stevens recommends in "Advanced Programming for the UNIX environment". As far as I can tell, os.urandom() is used mostly to seed the RNG in random.py, and thus is not a high-drequency operation. It is going to be very hard to document this adequately for coders to defend against - in my case, the problem was being triggered by os.tempnam() from within Webware's PSP compiler. There are so many functions that depend on random (sometimes in non-obvious ways), you can't flag them all so their users know they should use urandomcleanup. One possible solution would be for os.py to offer a go_daemon() function that implements the fd closing, signal masking, process group and terminal disassociation required by true daemons. This function could take care of internal book-keeping like calling urandomcleanup. ---------------------------------------------------------------------- Comment By: Sean Reifschneider (jafo) Date: 2005-04-06 07:20 Message: Logged In: YES user_id=81797 Yeah, I was thinking the same thing. It doesn't address the consumed file handle, but it does address the "robustness" issue. It complicates the code, but should work. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2005-04-06 07:11 Message: Logged In: YES user_id=21627 To add robustness, it would be possible to catch read errors from _urandomfd, and try reopening it if it got somehow closed. ---------------------------------------------------------------------- Comment By: Sean Reifschneider (jafo) Date: 2005-04-06 03:11 Message: Logged In: YES user_id=81797 Just providing some feedback: I'm able to reproduce this. Importing random will cause this file descriptor to be called. Opening urandom on every call could lead to unacceptable syscall overhead for some. Perhaps there should be a "urandomcleanup" method that closes the file descriptor, and then random could get the bytes from urandom(), and clean up after itself? Personally, I only clean up the file descriptors I have allocated when I fork a new process. On the one hand I agree with you about sucking up a fd in the standard library, but on the other hand I'm thinking that you just shouldn't be closing file descriptors for stuff you'll be needing. That's my two cents on this bug. ---------------------------------------------------------------------- Comment By: Fazal Majid (majid) Date: 2005-04-06 01:06 Message: Logged In: YES user_id=110477 There are many modules that have a dependency on random, for instance os.tempnam(), and a program could well inadvertently use it before closing file descriptors. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1177468&group_id=5470 _______________________________________________ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com