[issue8296] multiprocessing.Pool hangs when issuing KeyboardInterrupt

2010-08-27 Thread Greg Brockman
Changes by Greg Brockman g...@mit.edu: -- nosy: +gdb ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue8296 ___ ___ Python-bugs-list mailing list

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-08-27 Thread Greg Brockman
Greg Brockman g...@mit.edu added the comment: Hmm, a few notes. I have a bunch of nitpicks, but those can wait for a later iteration. (Just one style nit: I noticed a few unneeded whitespace changes... please try not to do that, as it makes the patch harder to read.) - Am I correct that you

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-08-27 Thread Greg Brockman
Greg Brockman g...@mit.edu added the comment: Ah, you're right--sorry, I had misread your code. I hadn't noticed the usage of the worker_pids. This explains what you're doing with the ACKs. Now, the problem is, I think doing it this way introduces some races (which is why I introduced the ACK

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-08-20 Thread Greg Brockman
Greg Brockman g...@mit.edu added the comment: Thanks for looking at it! Basically this patch requires the parent process to be able to send a message to a particular worker. As far as I can tell, the existing queues allow the children to send a message to the parent, or the parent to send

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-08-13 Thread Greg Brockman
Greg Brockman g...@mit.edu added the comment: I'll take another stab at this. In the attachment (assign-tasks.patch), I've combined a lot of the ideas presented on this issue, so thank you both for your input. Anyway: - The basic idea of the patch is to record the mapping of tasks

[issue9535] Pending signals are inherited by child processes

2010-08-06 Thread Greg Brockman
New submission from Greg Brockman g...@mit.edu: Upon os.fork(), pending signals are inherited by the child process. This can be demonstrated by pressing C-c in the middle of the following program: import os, sys, time, threading def do_fork(): while True: if not os.fork

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-27 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: "You can't have a sensible default timeout, because the worker may be processing something important..." In my case, the jobs are either functional or idempotent anyway, so aborting halfway through isn't a problem. In general though, I'm

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-27 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: Thanks for the comment. It's good to know what constraints we have to deal with. "we can not, however, change the API." Does this include adding optional arguments? -- ___ Python tracker rep

[issue9334] argparse does not accept options taking arguments beginning with dash (regression from optparse)

2010-07-22 Thread Greg Brockman
Changes by Greg Brockman g...@ksplice.com: -- nosy: +gdb ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue9334 ___ ___ Python-bugs-list mailing list

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-21 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: "I thought the EOF errors would take care of that, at least this has been running in production on many platforms without that happening." There are a lot of corner cases here, some more pedantic than others. For example, suppose a child dies

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-20 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: At first glance, looks like there are a number of sites where you don't change the blocking calls to non-blocking calls (e.g. get()). Almost all of the get()s have the potential to be called when there is no possibility for them to terminate
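A caller-side way to keep a blocking get() from waiting forever is to pass a timeout. The sketch below is illustrative only and is not the patch under discussion: `get_with_deadline` and `demo` are hypothetical names, and `ThreadPool` is used purely as a lightweight stand-in because it returns the same `AsyncResult` objects as `multiprocessing.Pool`. It bounds the wait; it does not detect or recover from a dead worker.

```python
import multiprocessing
import time
from multiprocessing.pool import ThreadPool


def get_with_deadline(async_result, seconds):
    """Return (True, value) if the result arrives in time, else (False, None).

    Hypothetical helper: exceptions raised by the job itself still propagate.
    """
    try:
        return True, async_result.get(timeout=seconds)
    except multiprocessing.TimeoutError:
        return False, None


def demo():
    # ThreadPool is an assumption made to keep the sketch small; the same
    # calls exist on multiprocessing.Pool.
    pool = ThreadPool(1)
    try:
        slow = pool.apply_async(time.sleep, (0.5,))  # not ready within 0.05 s
        fast = pool.apply_async(pow, (2, 3))         # runs after the sleep
        timed_out = get_with_deadline(slow, 0.05)
        finished = get_with_deadline(fast, 5.0)
        return timed_out, finished
    finally:
        pool.terminate()
```

Calling `demo()` gives one timed-out wait and one completed result, so the caller regains control in both cases and can decide what to do with the unfinished job.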

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-15 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: Before I forget, looks like we also need to deal with the result from a worker being un-unpickleable: This is what my patch in bug 9244 does... Really? I could be misremembering, but I believe you deal with the case of the result being

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-15 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: Actually, the program you demonstrate is nonequivalent to the one I posted. The one I posted pickles just fine because 'bar' is a global name, but doesn't unpickle because it doesn't exist in the parent's namespace. (See http
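The pickles-but-does-not-unpickle behaviour can be reproduced without a pool, because pickle stores a function as a reference (module name plus qualified name) rather than by value. The following is a minimal sketch under stated assumptions: `child_ns` is a hypothetical module standing in for the worker's namespace, and deleting the attribute simulates a parent that never defined `bar`.

```python
import pickle
import sys
import types


def demonstrate_missing_name():
    # "child_ns" is a hypothetical module playing the role of the worker.
    child_ns = types.ModuleType("child_ns")
    sys.modules["child_ns"] = child_ns

    def bar(x):
        return x

    # Make the function look like a module-level global named child_ns.bar.
    bar.__module__ = "child_ns"
    bar.__qualname__ = "bar"
    child_ns.bar = bar

    # Pickling succeeds: only the reference "child_ns.bar" is written.
    payload = pickle.dumps(bar)

    # Simulate the receiving side, where the name does not exist.
    del child_ns.bar
    try:
        pickle.loads(payload)
    except AttributeError as exc:
        return type(exc).__name__
    finally:
        sys.modules.pop("child_ns", None)
    return None
```

`demonstrate_missing_name()` returns the name of the exception raised while loading, which is the same kind of failure described in this thread for results sent back from a worker.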

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-15 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: Started looking at your patch. It seems to behave reasonably, although it still doesn't catch all of the failure cases. In particular, as you note, crashed jobs won't be noticed until the pool shuts down... but if you make a blocking call

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-14 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: Before I forget, looks like we also need to deal with the result from a worker being un-unpickleable: #!/usr/bin/env python import multiprocessing def foo(x): global bar def bar(x): pass return bar p = multiprocessing.Pool(1) p.apply

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-13 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: "What kind of errors are you having that makes the get() call fail?" Try running the script I posted. It will fail with an AttributeError (raised during unpickling) and hang. I'll note that the particular issues that I've run into in practice

[issue9244] multiprocessing.pool: Worker crashes if result can't be encoded

2010-07-13 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: This looks pretty reasonable to my untrained eye. I successfully applied and ran the test suite. To be clear, the errback change and the unpickleable result change are actually orthogonal, right? Anyway, I'm not really familiar

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-13 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: While looking at your patch in issue 9244, I realized that my code fails to handle an unpickleable task, as in: #!/usr/bin/env python import multiprocessing foo = lambda x: x p = multiprocessing.Pool(1) p.apply(foo, [1]) This should be fixed
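One way to see why the lambda task is a problem is to try pickling it directly. The helper below is a hypothetical pre-check, not part of multiprocessing and not the fix proposed on the tracker; the exact exception type for an unpicklable function varies between Python versions, so several are caught.

```python
import pickle


def is_picklable(obj):
    """Hypothetical pre-check: True if pickle.dumps(obj) succeeds."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        # Lambdas and nested functions cannot be pickled by reference.
        return False
    return True
```

A caller could run `is_picklable(task)` before submitting work to a pool and fail early instead of waiting on a job that never reaches a worker.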

[issue9207] multiprocessing occasionally spits out exception during shutdown (_handle_workers)

2010-07-12 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: With pool.py:272 commented out, running about 50k iterations, I saw 4 tracebacks giving an exception on pool.py:152. So this seems to imply the race does exist (i.e. that the thread is in _maintain_pool rather than time.sleep when shutdown

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-12 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: Thanks much for taking a look at this! "why are you terminating the second pass after finding a failed process?" Unfortunately, if you've lost a worker, you are no longer guaranteed that cache will eventually be empty. In particular, you may

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-12 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: "For processes disappearing (if that can at all happen), we could solve that by storing the jobs a process has accepted (started working on), so if a worker process is lost, we can mark them as failed too." Sure, this would be reasonable
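The bookkeeping idea quoted above (remember which jobs each worker has accepted, and fail them if that worker disappears) can be sketched as a small data structure. This is an illustration only, assuming jobs are identified by hashable ids and workers by pid; `JobTracker` and its method names are hypothetical and do not come from any patch on the tracker.

```python
class JobTracker:
    """Hypothetical record of which jobs each worker has accepted."""

    def __init__(self):
        self._accepted = {}   # worker pid -> set of job ids in progress
        self.failed = set()   # job ids whose worker was lost

    def accept(self, pid, job_id):
        # Called when a worker reports that it has started a job.
        self._accepted.setdefault(pid, set()).add(job_id)

    def complete(self, pid, job_id):
        # Called when a worker returns a result for a job.
        self._accepted.get(pid, set()).discard(job_id)

    def worker_lost(self, pid):
        # Mark every job the lost worker had accepted as failed.
        lost = self._accepted.pop(pid, set())
        self.failed |= lost
        return lost
```

With this shape, losing a worker costs one dictionary lookup, and jobs that were still queued and never accepted are left untouched.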

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-10 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: Cool, thanks. I'll note that with this patch applied, using the test program from 9207 I consistently get the following exception: Exception in thread Thread-1 (most likely raised during interpreter shutdown): Traceback (most recent call last

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-10 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: What about just catching the exception? See e.g. the attached patch. (Disclaimer: not heavily tested). -- Added file: http://bugs.python.org/file17934/shutdown.patch ___ Python tracker rep

[issue4106] multiprocessing occasionally spits out exception during shutdown

2010-07-08 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: For what it's worth, I think I have a simpler reproducer of this issue. Using freshly-compiled python-from-trunk (as well as multiprocessing-from-trunk), I get tracebacks from the following about 30% of the time: import multiprocessing, time

[issue4106] multiprocessing occasionally spits out exception during shutdown

2010-07-08 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: I'm on Ubuntu 10.04, 64 bit. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4106

[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

2010-07-08 Thread Greg Brockman
New submission from Greg Brockman g...@ksplice.com: I have recently begun using multiprocessing for a variety of batch jobs. It's a great library, and it's been quite useful. However, I have been bitten several times by situations where a worker process in a Pool will unexpectedly die

[issue9207] multiprocessing occasionally spits out exception during shutdown

2010-07-08 Thread Greg Brockman
New submission from Greg Brockman g...@ksplice.com: On Ubuntu 10.04, using freshly-compiled python-from-trunk (as well as multiprocessing-from-trunk), I get tracebacks from the following about 30% of the time: import multiprocessing, time def foo(x

[issue4106] multiprocessing occasionally spits out exception during shutdown

2010-07-08 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: Sure thing. See http://bugs.python.org/issue9207. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4106

[issue9207] multiprocessing occasionally spits out exception during shutdown (_handle_workers)

2010-07-08 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: That's likely a mistake on my part. I'm not observing this using the stock version of multiprocessing on my Ubuntu machine (after running O(100) times). I do, however, observe it when using either python2.7 or python2.6 with multiprocessing

[issue9207] multiprocessing occasionally spits out exception during shutdown (_handle_workers)

2010-07-08 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: No, I'm not using the Google code backport. To be clear, I've tried testing this with two versions of multiprocessing: - multiprocessing-from-trunk (r82645): I get these exceptions with ~40% frequency - multiprocessing from Ubuntu 10.04

[issue9207] multiprocessing occasionally spits out exception during shutdown (_handle_workers)

2010-07-08 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: "Wait - so, you are pulling svn trunk, compiling and running your test with the built python executable?" Yes. I initially observed this issue while using 10.04's Python (2.6.5), but wanted to make sure it wasn't fixed by using a newer

[issue9207] multiprocessing occasionally spits out exception during shutdown (_handle_workers)

2010-07-08 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: Yeah, I've just taken a checkout from trunk, ran './configure && make && make install', and reproduced on: - Ubuntu 10.04 32-bit - Ubuntu 9.04 32-bit -- ___ Python tracker rep...@bugs.python.org http

[issue9207] multiprocessing occasionally spits out exception during shutdown (_handle_workers)

2010-07-08 Thread Greg Brockman
Greg Brockman g...@ksplice.com added the comment: With the line commented out, I no longer see any exceptions. Although, if I understand what's going on, there's still a (much rarer) possibility of an exception, right? I guess in the common case, the worker_handler is in the sleep when