[issue22393] multiprocessing.Pool shouldn't hang forever if a worker process dies unexpectedly

2021-10-17 Thread Myles Steinhauser


Change by Myles Steinhauser :


--
nosy: +myles.steinhauser

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22393] multiprocessing.Pool shouldn't hang forever if a worker process dies unexpectedly

2021-08-30 Thread Ruairidh MacLeod


Change by Ruairidh MacLeod :


--
nosy: +rkm




[issue22393] multiprocessing.Pool shouldn't hang forever if a worker process dies unexpectedly

2021-08-18 Thread Jon Clucas


Change by Jon Clucas :


--
nosy: +shnizzedy




[issue22393] multiprocessing.Pool shouldn't hang forever if a worker process dies unexpectedly

2021-04-12 Thread STINNER Victor


Change by STINNER Victor :


--
nosy:  -vstinner




[issue22393] multiprocessing.Pool shouldn't hang forever if a worker process dies unexpectedly

2021-04-11 Thread Marko


Marko  added the comment:

Somewhat related: issue43806, which concerns asyncio.StreamReader.

--




[issue22393] multiprocessing.Pool shouldn't hang forever if a worker process dies unexpectedly

2021-04-11 Thread Marko


Marko  added the comment:

I've created issue43805. I think it would be better to have a universal 
solution rather than specific ones, like the one in issue9205.

I haven't checked the PRs, though.

--
nosy: +kormang




[issue22393] multiprocessing.Pool shouldn't hang forever if a worker process dies unexpectedly

2019-09-13 Thread Davin Potts


Change by Davin Potts :


--
pull_requests: +15722
pull_request: https://github.com/python/cpython/pull/16103




[issue22393] multiprocessing.Pool shouldn't hang forever if a worker process dies unexpectedly

2019-09-10 Thread STINNER Victor


STINNER Victor  added the comment:

I just marked bpo-38084 as a duplicate of this issue. I manually merged the 
nosy lists.

--
nosy: +vstinner




[issue22393] multiprocessing.Pool shouldn't hang forever if a worker process dies unexpectedly

2019-01-17 Thread Chris Markiewicz


Chris Markiewicz  added the comment:

Just a bump to note that the PR (10441) is ready for another round of review.

--
nosy: +cjmarkie




[issue22393] multiprocessing.Pool shouldn't hang forever if a worker process dies unexpectedly

2018-11-12 Thread Oscar Esteban


Oscar Esteban  added the comment:

I tried to reuse as much of the patch as I could, but at first it didn't solve 
the issue.

I have changed which thread is responsible for detecting a killed worker and 
deciding how to react. In the proposed patch, that was the thread handling 
results (i.e. tasks that a worker has queued as done). In the PR, the 
responsibility moves to the thread handling workers (since, basically, the 
event is that one or more workers suddenly die).

The patch defined a new BROKEN state that was assigned to the results handler 
thread. I transferred this behavior to the worker handler thread. I'm guessing, 
though, that to be semantically precise the BROKEN state should be assigned to 
the Pool object instead. That would require passing a reference to the object 
around and would complicate the implementation unnecessarily. Happy to 
reconsider, though.
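
The idea of watching the workers themselves can be illustrated with a small 
sketch. This is not the code from the PR or the patch; it only assumes that a 
worker which dies unexpectedly ends with a non-zero exit code, which would miss 
a worker that calls sys.exit(0). The names unexpected_exits and run_and_check 
are hypothetical, and os._exit(3) stands in for a crash.

```python
import multiprocessing
import os


def unexpected_exits(processes):
    """Return (name, exitcode) for finished processes that did not exit cleanly.

    Assumption for this sketch: exit code 0 means a normal exit, None means
    the process is still running, anything else is treated as unexpected.
    """
    return [(p.name, p.exitcode) for p in processes if p.exitcode not in (None, 0)]


def run_and_check():
    workers = [
        # Finishes normally, so its exit code is 0.
        multiprocessing.Process(target=abs, args=(-1,), name="clean"),
        # Terminates abruptly with exit code 3, standing in for a crash.
        multiprocessing.Process(target=os._exit, args=(3,), name="crashed"),
    ]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
    return unexpected_exits(workers)


if __name__ == "__main__":
    print(run_and_check())  # [('crashed', 3)]
```

A process killed by a signal reports a negative exit code, so it would be 
caught by the same check.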

I added three tests: one that was already in the patch, a variation of it that 
waits before killing the worker, and the one that Francis Bolduc posted here 
(https://bugs.python.org/issue22393#msg294968).

Please let me know whether any conversation about this bug should take place on 
GitHub, alongside the PR, instead of here.

Thanks a lot for the guidance, Antoine.

--




[issue22393] multiprocessing.Pool shouldn't hang forever if a worker process dies unexpectedly

2018-11-09 Thread Oscar Esteban


Change by Oscar Esteban :


--
pull_requests: +9713
stage: needs patch -> patch review




[issue22393] multiprocessing.Pool shouldn't hang forever if a worker process dies unexpectedly

2018-11-06 Thread Antoine Pitrou


Antoine Pitrou  added the comment:

You should start from master.  Bugfixes can be backported afterwards if 
appropriate.  Thanks!

--




[issue22393] multiprocessing.Pool shouldn't hang forever if a worker process dies unexpectedly

2018-11-06 Thread Oscar Esteban


Oscar Esteban  added the comment:

Hi Antoine,

I may take a stab at it. Before I start, should I branch from master or from 
3.7.1 (as 3.7 is still accepting bugfixes)?

Best,
Oscar

--




[issue22393] multiprocessing.Pool shouldn't hang forever if a worker process dies unexpectedly

2018-04-23 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

Oscar, the patch posted here needs updating for the latest git master.

If you want to avoid this issue, you can also use concurrent.futures, where the 
issue is fixed.
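
A minimal sketch of that behaviour in concurrent.futures: when a worker dies, 
the pending future raises BrokenProcessPool instead of blocking forever. The 
helper name worker_death_is_reported is made up for illustration, and 
os._exit(1) merely stands in for a worker being killed.

```python
import os
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


def worker_death_is_reported():
    """Return True if the executor reports the dead worker as an exception."""
    with ProcessPoolExecutor(max_workers=2) as executor:
        # os._exit(1) ends the worker process abruptly, without cleanup.
        future = executor.submit(os._exit, 1)
        try:
            # The timeout is only a safety net for this sketch.
            future.result(timeout=60)
        except BrokenProcessPool:
            return True
    return False


if __name__ == "__main__":
    print(worker_death_is_reported())
```

Once the pool is broken, the executor cannot be reused; a new one has to be 
created for further work.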

--
stage:  -> needs patch
versions: +Python 3.8 -Python 3.5




[issue22393] multiprocessing.Pool shouldn't hang forever if a worker process dies unexpectedly

2018-04-23 Thread Oscar Esteban

Oscar Esteban  added the comment:

We use multiprocessing to parallelize many memory-hungry tasks that either run 
Python code or call subprocess.run.

At times the OOM killer kicks in. When one of the workers is killed, the queue 
never returns an error for the task that worker was running.

Are there any plans to merge the patch proposed in this issue?

--
nosy: +Oscar Esteban




[issue22393] multiprocessing.Pool shouldn't hang forever if a worker process dies unexpectedly

2017-06-01 Thread Francis Bolduc

Francis Bolduc added the comment:

This problem also happens simply by calling sys.exit from one of the child 
processes.

The following script exhibits the problem:

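import multiprocessing
import sys

def test(value):
    # A truthy value makes the worker process exit instead of returning.
    if value:
        sys.exit(123)

if __name__ == '__main__':
    pool = multiprocessing.Pool(4)
    cases = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
    # Hangs: no result ever arrives for the task whose worker exited.
    pool.map(test, cases)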
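
As long as Pool does not report a dead worker, a caller can at least bound the 
wait. The sketch below is a workaround under stated assumptions, not a fix: it 
uses Pool.map_async and AsyncResult.get(timeout=...), so a lost task surfaces 
as multiprocessing.TimeoutError. The helper name map_with_deadline and the 
timeout values are hypothetical, and os._exit(1) stands in for a worker that 
dies mid-task.

```python
import multiprocessing
import os


def map_with_deadline(func, items, timeout, processes=2):
    """Like Pool.map, but give up after `timeout` seconds.

    Raises multiprocessing.TimeoutError if the results are not complete in
    time, for example because a worker died while running a task.
    """
    with multiprocessing.Pool(processes) as pool:
        # Leaving the with block terminates the pool, including on timeout.
        return pool.map_async(func, items).get(timeout=timeout)


if __name__ == "__main__":
    print(map_with_deadline(abs, [-1, -2, 3], timeout=30))
    try:
        map_with_deadline(os._exit, [1], timeout=2)
    except multiprocessing.TimeoutError:
        print("gave up waiting for a task whose worker died")
```

A timeout cannot tell a dead worker from a slow task, so the limit has to be 
chosen well above the longest expected run time.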

--
nosy: +Francis Bolduc




[issue22393] multiprocessing.Pool shouldn't hang forever if a worker process dies unexpectedly

2015-10-11 Thread Davin Potts

Changes by Davin Potts :


--
nosy: +davin




[issue22393] multiprocessing.Pool shouldn't hang forever if a worker process dies unexpectedly

2015-09-16 Thread Brian Boonstra

Changes by Brian Boonstra :


--
nosy: +brianboonstra




[issue22393] multiprocessing.Pool shouldn't hang forever if a worker process dies unexpectedly

2014-09-12 Thread Chris Rebert

Changes by Chris Rebert pyb...@rebertia.com:


--
nosy: +cvrebert

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22393
___



[issue22393] multiprocessing.Pool shouldn't hang forever if a worker process dies unexpectedly

2014-09-11 Thread Dan O'Reilly

New submission from Dan O'Reilly:

This is essentially a dupe of issue9205, but it was suggested I open a new 
issue, since that one ended up being used to fix this same problem in 
concurrent.futures, and was subsequently closed.

Right now, should a worker process in a Pool unexpectedly get terminated while 
a blocking Pool method is running (e.g. apply, map), the method will hang 
forever. This isn't a normal occurrence, but it does occasionally happen 
(either because someone sends a SIGTERM, or because of a bug in the 
interpreter or a C extension). It would be preferable for multiprocessing to 
follow the lead of concurrent.futures.ProcessPoolExecutor when this happens, 
and abort all running tasks and close down the Pool.

Attached is a patch that implements this behavior. Should a process in a Pool 
unexpectedly exit (meaning, *not* because of hitting the maxtasksperchild 
limit), the Pool will be closed/terminated and all cached/running tasks will 
raise a BrokenProcessPool exception. These changes also prevent the Pool from 
going into a bad state if the initializer function raises an exception 
(previously, the pool would end up infinitely starting new processes, which 
would immediately die because of the exception).

One concern with the patch: The way timings are altered with these changes, the 
Pool seems to be particularly susceptible to issue6721 in certain cases. If 
processes in the Pool are being restarted due to maxtasksperchild just as the 
worker is being closed or joined, there is a chance the worker will be forked 
while some of the debug logging inside of Pool is running (and holding locks on 
either sys.stdout or sys.stderr). When this happens, the worker deadlocks on 
startup, which will hang the whole program. I believe the current 
implementation is susceptible to this as well, but I could reproduce it much 
more consistently with this patch. I think it's rare enough in practice that it 
shouldn't prevent the patch from being accepted, but thought I should point it 
out.

(I do think issue6721 should be addressed, or at the very least internal I/O 
locks should always be reset after forking.)

--
components: Library (Lib)
files: multiproc_broken_pool.diff
keywords: patch
messages: 226805
nosy: dan.oreilly, jnoller, pitrou, sbt
priority: normal
severity: normal
status: open
title: multiprocessing.Pool shouldn't hang forever if a worker process dies 
unexpectedly
type: enhancement
versions: Python 3.5
Added file: http://bugs.python.org/file36603/multiproc_broken_pool.diff
