[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor
Thomas Moreau added the comment: I did GH 19788 with a few modifications. There is only one lock that seems to matter for the performance, and I actually added one other (the one in _python_exit, which necessitates another bug fix for the fork context). I did not benchmark to see if it was worth it in terms of performance. -- ___ Python tracker <https://bugs.python.org/issue39995> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor
Change by Thomas Moreau : -- pull_requests: +19111 pull_request: https://github.com/python/cpython/pull/19788 ___ Python tracker <https://bugs.python.org/issue39995> ___
[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor
Thomas Moreau added the comment: I think this is a reasonable way to move on. Some of the locks can probably be removed, but this needs careful investigation, and in the meantime the failure hinders everyone. Thanks Victor for the fast fix-up! To me, an interesting observation is that the failure seems to only happen when the executor is in a broken state. If we can find a way to make the behavior more conservative in these states (which do not impact performance), that would be nice. I will try to look at it today and see if I can remove some of the locks while still not failing with Victor's patch. We can probably remove the lock on `clear` safely. For the others, it might be more complex. -- ___ Python tracker <https://bugs.python.org/issue39995> ___
[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor
Thomas Moreau added the comment: Sorry, I just saw this. It seems that I introduced this regression. One of the goals of having a `ThreadWakeup` and not a `SimpleQueue` is to avoid using locks that can hinder the performance of the executor. I don't remember the exact details, but I think I ran some benchmarks and it gave lower performance for large sets of small tasks (not so sure anymore). If a fully synchronous method is chosen, maybe it is safer to rely on a `SimpleQueue`, as it will be lower maintenance. If the solution proposed by aeros is chosen, the event can probably be replaced by a lock, no? It would be easier to understand, I guess. From the failures, it seems to be a race condition between `shutdown` and `terminate_broken`. The race condition should not be possible in `submit`, as in principle `join_executor_internals` is only called when no new task can be submitted. One solution would be to use the executor's `self._shutdown_lock` to protect the call to `close` in `terminate_broken` and the call to `self._thread_wakeup.wakeup` in `shutdown`. That way, the lock is only acquired at critical points instead of being used all the time. This could also be done by adding a `lock=True/False` argument to only lock the potentially dangerous calls. -- nosy: +tomMoral ___ Python tracker <https://bugs.python.org/issue39995> ___
[issue39678] RFC improve readability of _queue_management_worker for ProcessPoolExecutor
Change by Thomas Moreau : -- keywords: +patch pull_requests: +17931 stage: -> patch review pull_request: https://github.com/python/cpython/pull/18551 ___ Python tracker <https://bugs.python.org/issue39678> ___
[issue39678] RFC improve readability of _queue_management_worker for ProcessPoolExecutor
New submission from Thomas Moreau : As discussed in GH#17670, the `_queue_management_worker` function has grown quite long and complicated. It could be turned into an object with a bunch of short and readable helper methods. -- components: Library (Lib) messages: 362218 nosy: pitrou, tomMoral priority: normal severity: normal status: open title: RFC improve readability of _queue_management_worker for ProcessPoolExecutor versions: Python 3.9 ___ Python tracker <https://bugs.python.org/issue39678> ___
[issue39104] ProcessPoolExecutor hangs on shutdown nowait with pickling failure
Change by Thomas Moreau : -- keywords: +patch pull_requests: +17134 stage: -> patch review pull_request: https://github.com/python/cpython/pull/17670 ___ Python tracker <https://bugs.python.org/issue39104> ___
[issue39104] ProcessPoolExecutor hangs on shutdown nowait with pickling failure
New submission from Thomas Moreau : The attached script hangs on Python 3.7+. This is due to the fact that the main process closes the communication channels directly while the queue_management_thread might still be using them. To prevent that, all the closing should be handled by the queue_management_thread. -- components: Library (Lib) files: main.py messages: 358697 nosy: tomMoral priority: normal severity: normal status: open title: ProcessPoolExecutor hangs on shutdown nowait with pickling failure versions: Python 3.7, Python 3.8, Python 3.9 Added file: https://bugs.python.org/file48794/main.py ___ Python tracker <https://bugs.python.org/issue39104> ___
[issue30006] Deadlocks in `concurrent.futures.ProcessPoolExecutor`
Thomas Moreau added the comment: The deadlocks I refer to in this issue are fixed by PR #3895. Subsequent failures (like the fact that the Executor is set in a broken state when there is an unpickling error) are tracked in other issues, so I think it is safe to close this one. -- ___ Python tracker <https://bugs.python.org/issue30006> ___
[issue36888] Create a way to check that the parent process is alive for daemonized processes
Change by Thomas Moreau : -- keywords: +patch pull_requests: +13158 stage: -> patch review ___ Python tracker <https://bugs.python.org/issue36888> ___
[issue36888] Create a way to check that the parent process is alive for daemonized processes
New submission from Thomas Moreau : In the std lib, the semaphore_tracker and the Manager rely on daemonized processes that are launched with server-like loops. The cleaning of such processes is made complicated by the fact that there is no canonical way to check that the parent process is alive. I propose to add in context a parent_process function that would give access to a Process object representing the parent process. This way, we could benefit from its sentinel to improve the clean-up of this process, which can be left dangling in case of a hard stop of the main interpreter. -- components: Library (Lib) messages: 342199 nosy: tomMoral priority: normal severity: normal status: open title: Create a way to check that the parent process is alive for daemonized processes type: enhancement versions: Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue36888> ___
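This proposal landed as `multiprocessing.parent_process()` in Python 3.8. A sketch (the helper name and cleanup pattern are illustrative, not from the patch) of how a daemonized server loop could use the parent's sentinel to avoid being left dangling:

```python
import multiprocessing as mp
from multiprocessing.connection import wait

def watch_parent(cleanup):
    """Block until the parent process terminates, then run cleanup().
    Returns immediately when called from the main process (no parent)."""
    parent = mp.parent_process()  # None in the main process (3.8+)
    if parent is None:
        return
    # The sentinel becomes ready when the parent exits, even on a hard
    # stop of the main interpreter, so the helper is not left dangling.
    wait([parent.sentinel])
    cleanup()

# In the main process there is no parent Process object:
assert mp.parent_process() is None
```

A real tracker would run `watch_parent` in its serving loop (or poll the sentinel alongside its command pipe) and release its resources in `cleanup`.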
[issue36668] semaphore_tracker is not reused by child processes
Change by Thomas Moreau : -- keywords: +patch pull_requests: +12805 stage: -> patch review ___ Python tracker <https://bugs.python.org/issue36668> ___
[issue36668] semaphore_tracker is not reused by child processes
New submission from Thomas Moreau : The current implementation of the semaphore_tracker creates a new process for each child. The easy fix would be to pass the _pid to the children, but the current mechanism to check that the semaphore_tracker is alive relies on waitpid, which cannot be used in child processes (the semaphore_tracker is only a sibling of these processes). The main issue is to have a reliable check that either (1) the pipe is open: this is what is done here by sending a message, and I don't know if there is a more efficient way to check it; or (2) a given pid is alive: as we cannot rely on waitpid, I don't see an efficient mechanism. I propose to add a PROBE command to the semaphore tracker. When the pipe is closed, the send command will fail, meaning that the semaphore tracker is down. -- components: Library (Lib) messages: 340543 nosy: tomMoral priority: normal severity: normal status: open title: semaphore_tracker is not reused by child processes type: behavior versions: Python 3.8 ___ Python tracker <https://bugs.python.org/issue36668> ___
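The PROBE idea can be sketched with a plain pipe standing in for the tracker's command pipe (the `PROBE` wire format below is illustrative). Since CPython ignores SIGPIPE, writing to a pipe whose read end is closed raises `BrokenPipeError`, which is exactly the "tracker is down" signal:

```python
import os

def tracker_alive(tracker_fd):
    """Probe the tracker by writing a PROBE command on its pipe; the
    write fails once the read end (the tracker process) is gone."""
    try:
        os.write(tracker_fd, b"PROBE:0\n")
        return True
    except OSError:  # BrokenPipeError on POSIX once the reader is dead
        return False

# Demo: a bare pipe stands in for the tracker's command channel.
r, w = os.pipe()
alive_before = tracker_alive(w)   # read end open: probe succeeds
os.close(r)                       # simulate the tracker dying
alive_after = tracker_alive(w)    # write now fails with EPIPE
os.close(w)
```

The cost is one small write per check, but it works from any process holding the pipe fd, sibling or child, without needing waitpid.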
[issue33945] concurrent.futures ProcessPoolExecutor submit() blocks on results being written
Thomas Moreau added the comment: This behavior results from the fact that in 3.6, the result_queue is used to pass messages to the queue_manager_thread. This changed in 3.7, where we rely on a _ThreadWakeup object instead. In 3.6, when the result_queue is filled with many large objects, the call to result_queue.put(None) will hang while the previous objects are being handled by the queue_manager_thread, causing latency in submit. -- ___ Python tracker <https://bugs.python.org/issue33945> ___
[issue33078] Queue with maxsize can lead to deadlocks
Change by Thomas Moreau : -- pull_requests: +5929 stage: needs patch -> patch review ___ Python tracker <https://bugs.python.org/issue33078> ___
[issue33078] Queue with maxsize can lead to deadlocks
Change by Thomas Moreau : -- keywords: +patch pull_requests: +5883 stage: -> patch review ___ Python tracker <https://bugs.python.org/issue33078> ___
[issue33078] Queue with maxsize can lead to deadlocks
New submission from Thomas Moreau : The fix for the Queue._feeder does not properly handle the size of the Queue. This can lead to a situation where the Queue is considered as full when it is empty. Here is a reproducing script:

```
import multiprocessing as mp

q = mp.Queue(1)

class FailPickle():
    def __reduce__(self):
        raise ValueError()

q.put(FailPickle())
print("Queue is full:", q.full())
q.put(0)
print(f"Got result: {q.get()}")
```

-- components: Library (Lib) messages: 313855 nosy: davin, pitrou, tomMoral priority: normal severity: normal status: open title: Queue with maxsize can lead to deadlocks type: behavior versions: Python 2.7, Python 3.5, Python 3.6, Python 3.7, Python 3.8 ___ Python tracker <https://bugs.python.org/issue33078> ___
[issue33056] Leaking files in concurrent.futures.process
New submission from Thomas Moreau : The recent changes introduced by https://github.com/python/cpython/pull/3895 leak some file descriptors (the Pipe opened in _ThreadWakeup). They should be properly closed at shutdown. -- components: Library (Lib) messages: 313656 nosy: tomMoral priority: normal pull_requests: 5845 severity: normal status: open title: Leaking files in concurrent.futures.process type: behavior versions: Python 3.7 ___ Python tracker <https://bugs.python.org/issue33056> ___
[issue31310] semaphore tracker isn't protected against crashes
Thomas Moreau added the comment: > Is it an optimization problem, or does it risk leaking semaphores? I do not think it risks leaking semaphores, as the clean-up is performed by the process which created the Semaphore. So I would say it is more of an optimization issue. It is true that I do not see many use cases where the Semaphore might be created by the child process, but it was optimized in previous versions. If you don't think it is useful, we could avoid sharing the semaphore_tracker pipe with the child process to reduce complexity in the spawning process. Do you think it is worth fixing, or should it be simplified? -- ___ Python tracker <https://bugs.python.org/issue31310> ___
[issue31310] semaphore tracker isn't protected against crashes
Thomas Moreau added the comment: For new processes created with spawn or forkserver, only the semaphore_tracker._fd is sent and shared with the child. Thus, as the _pid argument is None in the new process, it launches a new tracker if it needs to create a new Semaphore, regardless of crashes in the previous semaphore tracker. -- ___ Python tracker <https://bugs.python.org/issue31310> ___
[issue31310] semaphore tracker isn't protected against crashes
Thomas Moreau added the comment: With this fix, the semaphore_tracker is not shared with the children anymore and each process launches its own tracker. I opened a PR to try to fix it. Let me know if I should open a new ticket. -- nosy: +tomMoral ___ Python tracker <https://bugs.python.org/issue31310> ___
[issue31310] semaphore tracker isn't protected against crashes
Change by Thomas Moreau : -- pull_requests: +5025 ___ Python tracker <https://bugs.python.org/issue31310> ___
[issue31991] Race condition in wait with timeout for multiprocessing.Event
New submission from Thomas Moreau : If the methods `set` and `clear` of `multiprocessing.Event` are called one after another while a `multiprocessing.Process` calls `wait`, the `Event` does not match the documented behavior (https://docs.python.org/3.7/library/threading.html#threading.Event.wait) and the call to `wait` can return `False` even though it did not time out (this happens with both `timeout=30` and `timeout=None`). Attached is a script reproducing this issue. The documentation should either be changed or the behavior should be updated. -- components: Library (Lib) files: concurrent_event.py messages: 305960 nosy: tomMoral priority: normal severity: normal status: open title: Race condition in wait with timeout for multiprocessing.Event type: behavior versions: Python 3.6 Added file: https://bugs.python.org/file47256/concurrent_event.py ___ Python tracker <https://bugs.python.org/issue31991> ___
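The race can be shown deterministically with a simplified, thread-based model of `multiprocessing.Event.wait` (a sketch, not the real implementation or the attached script): after being woken, `wait` re-checks the flag, so a `set()` immediately followed by `clear()` makes it return `False` without any timeout. The `pulse` method below compresses the set/clear pair into one lock hold to make the outcome reproducible.

```python
import threading
import time

class MiniEvent:
    """Simplified model of multiprocessing.Event: wait() re-checks the
    flag after being woken, so it reports False if clear() won the race."""
    def __init__(self):
        self._cond = threading.Condition()
        self._flag = False

    def pulse(self):
        # Models set() immediately followed by clear(): waiters are
        # notified, but the flag is already False again when they look.
        with self._cond:
            self._flag = True
            self._cond.notify_all()
            self._flag = False

    def wait(self, timeout=None):
        with self._cond:
            if not self._flag:
                self._cond.wait(timeout)
            return self._flag  # the re-check that can return False

ev = MiniEvent()
results = []
t = threading.Thread(target=lambda: results.append(ev.wait(timeout=1)))
t.start()
time.sleep(0.2)  # let the waiter block inside cond.wait
ev.pulse()
t.join()
```

The waiter is notified well before its 1-second timeout, yet observes `False`: exactly the mismatch with the documented `threading.Event.wait` contract described above.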
[issue31699] Deadlocks in `concurrent.futures.ProcessPoolExecutor` with pickling error
Change by Thomas Moreau : -- keywords: +patch pull_requests: +4218 stage: -> patch review ___ Python tracker <https://bugs.python.org/issue31699> ___
[issue22281] ProcessPoolExecutor/ThreadPoolExecutor should provide introspection APIs
Change by Thomas Moreau : -- pull_requests: +4207 stage: needs patch -> patch review ___ Python tracker <https://bugs.python.org/issue22281> ___
[issue31699] Deadlocks in `concurrent.futures.ProcessPoolExecutor` with pickling error
New submission from Thomas Moreau : When using `concurrent.futures.ProcessPoolExecutor` with objects that are not picklable or not unpicklable, several situations result in a deadlock, with the interpreter frozen. This is the case for different scenarios, for instance these three: https://gist.github.com/tomMoral/cc27a938d669edcf0286c57516942369 The different pickling/unpickling errors and their effects should be tested in `test_concurrent_futures.py` -- components: Library (Lib) messages: 303745 nosy: tomMoral priority: normal pull_requests: 3866 severity: normal status: open title: Deadlocks in `concurrent.futures.ProcessPoolExecutor` with pickling error versions: Python 3.7 ___ Python tracker <https://bugs.python.org/issue31699> ___
[issue31540] Adding context in concurrent.futures.ProcessPoolExecutor
New submission from Thomas Moreau: The `ProcessPoolExecutor` start method can only be changed by changing the global default context with `set_start_method` at the beginning of a script. We propose to allow passing a context argument to the constructor, to allow more flexible control of the executor. Adding this would also allow testing `ProcessPoolExecutor` with all the available contexts. -- components: Library (Lib) messages: 302674 nosy: tomMoral priority: normal pull_requests: 3671 severity: normal status: open title: Adding context in concurrent.futures.ProcessPoolExecutor type: enhancement versions: Python 3.7 ___ Python tracker <https://bugs.python.org/issue31540> ___
[issue30414] multiprocesing.Queue silently ignore messages after exc in _feeder
Thomas Moreau added the comment: I think this is a good solution, as it lets the user easily define the behavior they need in other situations too. I would recommend passing the object responsible for the failure to the _on_queue_thread_error callback. This would simplify the error handling.

```
@@ -260,8 +260,16 @@ class Queue(object):
                     info('error in queue thread: %s', e)
                     return
                 else:
-                    import traceback
-                    traceback.print_exc()
+                    self._on_queue_thread_error(e, obj)
+
+    def _on_queue_thread_error(self, e, obj):
+        """
+        Private API called when feeding data in the background thread
+        raises an exception. For overriding by concurrent.futures.
+        """
+        import traceback
+        traceback.print_exc()
```

-- ___ Python tracker <http://bugs.python.org/issue30414> ___
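A sketch of how a consumer such as concurrent.futures could use this hook (the class and attribute names below are hypothetical stand-ins, not the real multiprocessing.Queue): overriding `_on_queue_thread_error` turns the silent traceback print into a programmatic signal, and the extra `obj` argument identifies the item that failed.

```python
import traceback

class QueueWithHook:
    """Stand-in for multiprocessing.Queue with the proposed hook."""
    def _on_queue_thread_error(self, e, obj):
        # Default behavior: just print the traceback, as before.
        traceback.print_exception(type(e), e, e.__traceback__)

class ExecutorQueue(QueueWithHook):
    """concurrent.futures-style override: record which item failed so
    the executor can mark the corresponding future as broken."""
    def __init__(self):
        self.failed = []

    def _on_queue_thread_error(self, e, obj):
        self.failed.append((type(e).__name__, obj))

q = ExecutorQueue()
# Simulate the feeder thread catching a pickling error for a work item.
q._on_queue_thread_error(ValueError("unpicklable"), "work-item-1")
```

Passing `obj` alongside the exception is what lets the override map the failure back to a specific pending future instead of merely noting that something went wrong.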
[issue30414] multiprocesing.Queue silently ignore messages after exc in _feeder
Changes by Thomas Moreau : -- pull_requests: +2001 ___ Python tracker <http://bugs.python.org/issue30414> ___
[issue30414] multiprocesing.Queue silently ignore messages after exc in _feeder
Thomas Moreau added the comment: This fix, while preventing the Queue from crashing, does not give any way to programmatically detect that the message was dropped. This is a problem, as we can no longer assume that the Queue will not drop messages. For instance, we can no longer detect deadlocks in concurrent.futures.ProcessPoolExecutor as done in https://github.com/python/cpython/pull/1013, where the crashed QueueFeederThread was used to monitor the working state of the executor. We could either: - put a flag highlighting the fact that some messages were dropped; - add an argument to the Queue to close on pickling errors. I'd be happy to work on a PR to implement any solution that you think is reasonable. -- nosy: +tomMoral ___ Python tracker <http://bugs.python.org/issue30414> ___
[issue30006] Deadlocks in `concurrent.futures.ProcessPoolExecutor`
New submission from Thomas Moreau: The design of ProcessPoolExecutor contains some possible race conditions that may freeze the interpreter due to deadlocks. This is notably the case with pickling and unpickling errors for a submitted job and returned results. This makes it hard to reuse a launched executor. We propose in the joint PR to fix some of those situations to make the ProcessPoolExecutor more robust to failures in the different threads and workers. -- components: Library (Lib) messages: 291224 nosy: tomMoral priority: normal pull_requests: 1180 severity: normal status: open title: Deadlocks in `concurrent.futures.ProcessPoolExecutor` type: behavior ___ Python tracker <http://bugs.python.org/issue30006> ___