[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-29 Thread Thomas Moreau


Thomas Moreau  added the comment:

I did GH-19788 with a few modifications. There is only one lock that seems to
matter for performance, and I actually added another one (the one in
_python_exit, which necessitates another bug fix for the fork context).

I did not benchmark it to see whether it was worth it in terms of performance.

--

___
Python tracker <https://bugs.python.org/issue39995>



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-29 Thread Thomas Moreau


Change by Thomas Moreau :


--
pull_requests: +19111
pull_request: https://github.com/python/cpython/pull/19788

___
Python tracker <https://bugs.python.org/issue39995>



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-29 Thread Thomas Moreau


Thomas Moreau  added the comment:

I think this is a reasonable way to move on. Some of the locks can probably be
removed, but this needs careful investigation and, in the meantime, the failure
hinders everyone. Thanks, Victor, for the fast fixup!

To me, an interesting observation is that the failure seems to happen only when
the executor is in a broken state. If we can find a way to make the behavior
more conservative in these states (where performance does not matter), that
would be nice.

I will try to look at it today and see if I can remove some of the locks while
still not failing with Victor's patch. We can probably remove the lock on
`clear` safely. For the others, it might be more complex.

--

___
Python tracker <https://bugs.python.org/issue39995>



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-28 Thread Thomas Moreau


Thomas Moreau  added the comment:

Sorry I just saw this. It seems that I introduced this regression.

One of the goals of having a `ThreadWakeup` and not a `SimpleQueue` is to avoid
using locks that can hinder the performance of the executor. I don't remember
the exact details, but I think I ran some benchmarks and it gave lower
performance for large sets of small tasks (not so sure anymore). If a fully
synchronous method is chosen, it is maybe safer to rely on a `SimpleQueue`, as
it will be lower maintenance. If the solution proposed by aeros is chosen, the
event could probably be replaced by a lock, no? It would be easier to
understand, I guess.

From the failures, it seems to be a race condition between `shutdown` and
`terminate_broken`. The race condition should not be possible in `submit`, as
in principle `join_executor_internals` is only called when no new task can be
submitted.

One solution would be to use the executor's `self._shutdown_lock` to protect
the call to `close` in `terminate_broken` and the call to
`self._thread_wakeup.wakeup` in `shutdown`. That way, the lock is only acquired
at critical points instead of being used all the time. This could also be done
by adding a `lock=True/False` argument to lock only the potentially dangerous
calls.
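
A minimal sketch of this idea (the names below are illustrative, not the
actual CPython implementation): one lock guards the two calls that can race,
and nothing else.

```
import os
import threading

class _ThreadWakeup:
    """Wakeup pipe whose close() and wakeup() calls must not race."""

    def __init__(self):
        self._read_fd, self._write_fd = os.pipe()
        self._closed = False

    def close(self):
        # Called from terminate_broken(): racing with wakeup() is what
        # produces the "Bad file descriptor" error.
        if not self._closed:
            self._closed = True
            os.close(self._write_fd)
            os.close(self._read_fd)

    def wakeup(self):
        # Called from shutdown(): a no-op once the pipe is closed.
        if not self._closed:
            os.write(self._write_fd, b"x")

_shutdown_lock = threading.Lock()
_thread_wakeup = _ThreadWakeup()

def shutdown():
    # Guard only the dangerous call; the lock is never held on the
    # hot submit() path.
    with _shutdown_lock:
        _thread_wakeup.wakeup()

def terminate_broken():
    with _shutdown_lock:
        _thread_wakeup.close()
```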

--
nosy: +tomMoral

___
Python tracker <https://bugs.python.org/issue39995>



[issue39678] RFC improve readability of _queue_management_worker for ProcessPoolExecutor

2020-02-18 Thread Thomas Moreau


Change by Thomas Moreau :


--
keywords: +patch
pull_requests: +17931
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/18551

___
Python tracker <https://bugs.python.org/issue39678>



[issue39678] RFC improve readability of _queue_management_worker for ProcessPoolExecutor

2020-02-18 Thread Thomas Moreau


New submission from Thomas Moreau :

As discussed in GH-17670, the `_queue_management_worker` function has grown
quite long and complicated.
It could be turned into an object with a bunch of short and readable helper
methods.
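
A rough sketch of the shape such a refactoring could take (class and method
names here are hypothetical; the helper bodies are elided):

```
import threading

class _ExecutorManagerThread(threading.Thread):
    """Owns the state the long function used to juggle."""

    def run(self):
        # The long loop becomes a sequence of named, testable steps.
        while True:
            self.add_call_item_to_queue()
            result_item, is_broken = self.wait_result_broken_or_wakeup()
            if is_broken:
                self.terminate_broken()
                return
            if result_item is not None:
                self.process_result_item(result_item)
            if self.is_shutting_down():
                self.join_executor_internals()
                return

    def add_call_item_to_queue(self): ...
    def wait_result_broken_or_wakeup(self): ...
    def terminate_broken(self): ...
    def process_result_item(self, result_item): ...
    def is_shutting_down(self): ...
    def join_executor_internals(self): ...
```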

--
components: Library (Lib)
messages: 362218
nosy: pitrou, tomMoral
priority: normal
severity: normal
status: open
title: RFC improve readability of _queue_management_worker for 
ProcessPoolExecutor
versions: Python 3.9

___
Python tracker <https://bugs.python.org/issue39678>



[issue39104] ProcessPoolExecutor hangs on shutdown nowait with pickling failure

2019-12-19 Thread Thomas Moreau


Change by Thomas Moreau :


--
keywords: +patch
pull_requests: +17134
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/17670

___
Python tracker <https://bugs.python.org/issue39104>



[issue39104] ProcessPoolExecutor hangs on shutdown nowait with pickling failure

2019-12-19 Thread Thomas Moreau


New submission from Thomas Moreau :

The attached script hangs on Python 3.7+.
This is due to the fact that the main process closes the communication channels
directly while the queue_management_thread might still be using them.

To prevent that, all the closing should be handled by the
queue_management_thread.
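
The attached main.py is not reproduced here; the sketch below is an
illustrative reconstruction of the failure mode based on the issue title, not
the original script:

```
from concurrent.futures import ProcessPoolExecutor

class FailsToPickle:
    def __reduce__(self):
        raise RuntimeError("cannot pickle this object")

if __name__ == "__main__":
    executor = ProcessPoolExecutor(max_workers=2)
    # The pickling failure breaks the executor while the queue
    # management thread is still running.
    executor.submit(print, FailsToPickle())
    # Shutting down without waiting closes the communication channels
    # under the thread's feet, which can hang.
    executor.shutdown(wait=False)
```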

--
components: Library (Lib)
files: main.py
messages: 358697
nosy: tomMoral
priority: normal
severity: normal
status: open
title: ProcessPoolExecutor hangs on shutdown nowait with pickling failure
versions: Python 3.7, Python 3.8, Python 3.9
Added file: https://bugs.python.org/file48794/main.py

___
Python tracker <https://bugs.python.org/issue39104>



[issue30006] Deadlocks in `concurrent.futures.ProcessPoolExecutor`

2019-05-30 Thread Thomas Moreau


Thomas Moreau  added the comment:

The deadlocks I refer to in this issue are fixed by PR #3895.
Subsequent failures (like the fact that the Executor is set to a broken state
when there is an unpickling error) are tracked in other issues, so I think it
is safe to close this one.

--

___
Python tracker <https://bugs.python.org/issue30006>



[issue36888] Create a way to check that the parent process is alive for daemonized processes

2019-05-11 Thread Thomas Moreau


Change by Thomas Moreau :


--
keywords: +patch
pull_requests: +13158
stage:  -> patch review

___
Python tracker <https://bugs.python.org/issue36888>



[issue36888] Create a way to check that the parent process is alive for daemonized processes

2019-05-11 Thread Thomas Moreau


New submission from Thomas Moreau :

In the stdlib, the semaphore_tracker and the Manager rely on daemonized
processes that are launched with server-like loops. The cleanup of such
processes is made complicated by the fact that there is no canonical way to
check that the parent process is alive.

I propose to add a parent_process function to the context that would give
access to a Process object representing the parent process. This way, we could
benefit from its sentinel to improve the cleanup of these processes, which can
be left dangling in case of a hard stop of the main interpreter.
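
A sketch of how the proposed function might be used in such a server loop
(this proposal landed in Python 3.8 as multiprocessing.parent_process(); the
loop below is illustrative):

```
import multiprocessing as mp
from multiprocessing.connection import wait

def server_loop(work_conn):
    parent = mp.parent_process()
    while True:
        # Wake up either on incoming work or when the parent dies: the
        # parent's sentinel becomes ready when it exits, even on a hard
        # stop of the main interpreter.
        ready = wait([work_conn, parent.sentinel])
        if parent.sentinel in ready:
            break  # parent is gone: clean up and exit
        request = work_conn.recv()
        work_conn.send(request)  # echo back, for illustration
```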

--
components: Library (Lib)
messages: 342199
nosy: tomMoral
priority: normal
severity: normal
status: open
title: Create a way to check that the parent process is alive for daemonized
processes
type: enhancement
versions: Python 3.8, Python 3.9

___
Python tracker <https://bugs.python.org/issue36888>



[issue36668] semaphore_tracker is not reused by child processes

2019-04-19 Thread Thomas Moreau


Change by Thomas Moreau :


--
keywords: +patch
pull_requests: +12805
stage:  -> patch review

___
Python tracker <https://bugs.python.org/issue36668>



[issue36668] semaphore_tracker is not reused by child processes

2019-04-19 Thread Thomas Moreau


New submission from Thomas Moreau :

The current implementation of the semaphore_tracker creates a new process for
each child.

The easy fix would be to pass the _pid to the children, but the current
mechanism to check whether the semaphore_tracker is alive relies on waitpid,
which cannot be used in child processes (the semaphore_tracker is only a
sibling of these processes). The main issue is to have a reliable check that
either:

- the pipe is open: this is what is done here by sending a message; I don't
  know if there is a more efficient way to check it; or
- a given pid is alive: as we cannot rely on waitpid, I don't see an
  efficient mechanism.

I propose to add a PROBE command to the semaphore tracker. When the pipe is
closed, the send will fail, meaning that the semaphore tracker is down.
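
An illustrative sketch of the PROBE idea (the command name and message format
are assumptions, not the final implementation):

```
import os

def tracker_is_alive(tracker_fd):
    """Return True if the semaphore tracker pipe is still open."""
    try:
        # If the tracker process died, the read end of the pipe is
        # closed and the write raises BrokenPipeError (an OSError).
        os.write(tracker_fd, b"PROBE:dummy\n")
        return True
    except OSError:
        return False
```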

--
components: Library (Lib)
messages: 340543
nosy: tomMoral
priority: normal
severity: normal
status: open
title: semaphore_tracker is not reused by child processes
type: behavior
versions: Python 3.8

___
Python tracker <https://bugs.python.org/issue36668>



[issue33945] concurrent.futures ProcessPoolExecutor submit() blocks on results being written

2018-06-24 Thread Thomas Moreau


Thomas Moreau  added the comment:

This behavior results from the fact that in 3.6, the result_queue is used to
pass messages to the queue_manager_thread. This changed in 3.7, where we rely
on a _ThreadWakeup object instead.

In 3.6, when the result_queue is filled with many large objects, the call to
result_queue.put(None) will hang while the previous objects are being handled
by the queue_manager_thread, causing latency in submit.

--

___
Python tracker <https://bugs.python.org/issue33945>



[issue33078] Queue with maxsize can lead to deadlocks

2018-03-21 Thread Thomas Moreau

Change by Thomas Moreau :


--
pull_requests: +5929
stage: needs patch -> patch review

___
Python tracker <https://bugs.python.org/issue33078>



[issue33078] Queue with maxsize can lead to deadlocks

2018-03-14 Thread Thomas Moreau

Change by Thomas Moreau :


--
keywords: +patch
pull_requests: +5883
stage:  -> patch review

___
Python tracker <https://bugs.python.org/issue33078>



[issue33078] Queue with maxsize can lead to deadlocks

2018-03-14 Thread Thomas Moreau

New submission from Thomas Moreau :

The fix for the Queue._feeder does not properly handle the size of the Queue.
This can lead to a situation where the Queue is considered full when it is
empty. Here is a reproducing script:

```
import multiprocessing as mp

q = mp.Queue(1)


class FailPickle:
    def __reduce__(self):
        # Pickling this object always fails in the feeder thread.
        raise ValueError()


q.put(FailPickle())
print("Queue is full:", q.full())  # wrongly reports True
q.put(0)  # blocks: the queue believes it is full although it is empty
print(f"Got result: {q.get()}")
```

--
components: Library (Lib)
messages: 313855
nosy: davin, pitrou, tomMoral
priority: normal
severity: normal
status: open
title: Queue with maxsize can lead to deadlocks
type: behavior
versions: Python 2.7, Python 3.5, Python 3.6, Python 3.7, Python 3.8

___
Python tracker <https://bugs.python.org/issue33078>



[issue33056] Leaking files in concurrent.futures.process

2018-03-12 Thread Thomas Moreau

New submission from Thomas Moreau :

The recent changes introduced by https://github.com/python/cpython/pull/3895
leak some file descriptors (the pipe opened in _ThreadWakeup).
They should be properly closed at shutdown.

--
components: Library (Lib)
messages: 313656
nosy: tomMoral
priority: normal
pull_requests: 5845
severity: normal
status: open
title: Leaking files in concurrent.futures.process
type: behavior
versions: Python 3.7

___
Python tracker <https://bugs.python.org/issue33056>



[issue31310] semaphore tracker isn't protected against crashes

2018-01-13 Thread Thomas Moreau

Thomas Moreau  added the comment:

> Is it an optimization problem, or does it risk leaking semaphores?

I do not think it risks leaking semaphores, as the cleanup is performed by the
process which created the Semaphore. So I would say it is more an optimization
issue.

It is true that I do not see many use cases where the Semaphore might be
created by the child process, but it was optimized in previous versions. If you
don't think it is useful, we could avoid sharing the semaphore_tracker pipe
with the child process, to reduce complexity in the spawning process. Do you
think it is worth fixing, or should it be simplified?

--

___
Python tracker <https://bugs.python.org/issue31310>



[issue31310] semaphore tracker isn't protected against crashes

2018-01-13 Thread Thomas Moreau

Thomas Moreau  added the comment:

For new processes created with spawn or forkserver, only the
semaphore_tracker._fd is sent and shared with the child. Thus, as the _pid
argument is None in the new process, it launches a new tracker if it needs to
create a new Semaphore, regardless of crashes in the previous semaphore
tracker.
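
A paraphrase of that check (simplified, with a hypothetical _launch_tracker
helper; not the exact CPython code):

```
import os

def ensure_running(self):
    if self._fd is not None and self._pid is not None:
        # Parent process: waitpid() tells us whether the tracker died.
        pid, _ = os.waitpid(self._pid, os.WNOHANG)
        if pid == 0:
            return  # tracker still alive: reuse it
    # Child process (_pid is None) or dead tracker: start a new one.
    self._launch_tracker()  # hypothetical helper
```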

--

___
Python tracker <https://bugs.python.org/issue31310>



[issue31310] semaphore tracker isn't protected against crashes

2018-01-13 Thread Thomas Moreau

Thomas Moreau  added the comment:

With this fix, the semaphore_tracker is not shared with the children anymore,
and each process launches its own tracker.

I opened a PR to try to fix this. Let me know if I should open a new ticket.

--
nosy: +tomMoral

___
Python tracker <https://bugs.python.org/issue31310>



[issue31310] semaphore tracker isn't protected against crashes

2018-01-13 Thread Thomas Moreau

Change by Thomas Moreau :


--
pull_requests: +5025

___
Python tracker <https://bugs.python.org/issue31310>



[issue31991] Race condition in wait with timeout for multiprocessing.Event

2017-11-09 Thread Thomas Moreau

New submission from Thomas Moreau :

If the methods `set` and `clear` of `multiprocessing.Event` are called one
after the other while a `multiprocessing.Process` calls `wait`, the `Event`
does not match the documented behavior
(https://docs.python.org/3.7/library/threading.html#threading.Event.wait): the
call to wait can return `False` even though it did not time out (this happens
both with `timeout=30` and with `timeout=None`).

Attached is a script reproducing this issue.
The documentation should either be changed or the behavior should be updated.
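
The attached concurrent_event.py is not reproduced here; the sketch below is
an illustrative reconstruction of the race (it may need several runs to
trigger):

```
import multiprocessing as mp
import time

def waiter(event, results):
    start = time.monotonic()
    # Documented behavior: this returns False only on timeout.
    flag = event.wait(timeout=30)
    results.put((flag, time.monotonic() - start))

if __name__ == "__main__":
    event = mp.Event()
    results = mp.Queue()
    p = mp.Process(target=waiter, args=(event, results))
    p.start()
    time.sleep(0.1)  # let the child enter wait()
    event.set()      # wake the waiter...
    event.clear()    # ...and clear the flag before it checks it
    p.join()
    flag, elapsed = results.get()
    # A False flag after far less than 30 s demonstrates the bug.
    print(f"wait() returned {flag} after {elapsed:.3f}s")
```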

--
components: Library (Lib)
files: concurrent_event.py
messages: 305960
nosy: tomMoral
priority: normal
severity: normal
status: open
title: Race condition in wait with timeout for multiprocessing.Event
type: behavior
versions: Python 3.6
Added file: https://bugs.python.org/file47256/concurrent_event.py

___
Python tracker <https://bugs.python.org/issue31991>



[issue31699] Deadlocks in `concurrent.futures.ProcessPoolExecutor` with pickling error

2017-11-03 Thread Thomas Moreau

Change by Thomas Moreau :


--
keywords: +patch
pull_requests: +4218
stage:  -> patch review

___
Python tracker <https://bugs.python.org/issue31699>



[issue22281] ProcessPoolExecutor/ThreadPoolExecutor should provide introspection APIs

2017-11-02 Thread Thomas Moreau

Change by Thomas Moreau :


--
pull_requests: +4207
stage: needs patch -> patch review

___
Python tracker <https://bugs.python.org/issue22281>



[issue31699] Deadlocks in `concurrent.futures.ProcessPoolExecutor` with pickling error

2017-10-05 Thread Thomas Moreau

New submission from Thomas Moreau :

When using `concurrent.futures.ProcessPoolExecutor` with objects that cannot
be pickled or unpickled, several situations result in a deadlock that freezes
the interpreter.

This is the case in several scenarios, for instance these three:
https://gist.github.com/tomMoral/cc27a938d669edcf0286c57516942369

The different pickling/unpickling errors and their effects should be tested in
`test_concurrent_futures.py`
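
One such scenario, sketched (an illustrative reconstruction, not one of the
gist scripts verbatim; run with the fork start method so the class is visible
to the workers):

```
from concurrent.futures import ProcessPoolExecutor

class UnpicklableResult:
    def __reduce__(self):
        raise ValueError("refusing to pickle")

def work():
    # The result cannot be sent back: pickling it fails in the worker.
    return UnpicklableResult()

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=2) as executor:
        future = executor.submit(work)
        # Without the fix this can freeze instead of raising an error.
        print(future.result())
```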

--
components: Library (Lib)
messages: 303745
nosy: tomMoral
priority: normal
pull_requests: 3866
severity: normal
status: open
title: Deadlocks in `concurrent.futures.ProcessPoolExecutor` with pickling error
versions: Python 3.7

___
Python tracker <https://bugs.python.org/issue31699>



[issue31540] Adding context in concurrent.futures.ProcessPoolExecutor

2017-09-21 Thread Thomas Moreau

New submission from Thomas Moreau:

The `ProcessPoolExecutor` processes' start method can currently only be changed
by changing the global default context with `set_start_method` at the
beginning of a script. We propose to allow passing a context argument to the
constructor, to give more flexible control of the executor.

Adding this would also allow testing `ProcessPoolExecutor` with all the
available contexts.
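
A sketch of the proposed API (this landed in Python 3.7 as the `mp_context`
constructor argument):

```
import multiprocessing as mp
from concurrent.futures import ProcessPoolExecutor

if __name__ == "__main__":
    ctx = mp.get_context("spawn")
    with ProcessPoolExecutor(max_workers=2, mp_context=ctx) as executor:
        print(executor.submit(sum, [1, 2, 3]).result())  # -> 6
```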

--
components: Library (Lib)
messages: 302674
nosy: tomMoral
priority: normal
pull_requests: 3671
severity: normal
status: open
title: Adding context in concurrent.futures.ProcessPoolExecutor
type: enhancement
versions: Python 3.7

___
Python tracker <https://bugs.python.org/issue31540>



[issue30414] multiprocessing.Queue silently ignores messages after exc in _feeder

2017-06-04 Thread Thomas Moreau

Thomas Moreau added the comment:

I think this is a good solution as it let the user define easily the behavior 
it needs in other situation too. I would recommend adding the object 
responsible for the failure to the _on_queue_thread_error callback. This would 
simplify the error handling.


@@ -260,8 +260,16 @@ class Queue(object):
                     info('error in queue thread: %s', e)
                     return
                 else:
-                    import traceback
-                    traceback.print_exc()
+                    self._on_queue_thread_error(e, obj)
+
+    def _on_queue_thread_error(self, e, obj):
+        """
+        Private API called when feeding data in the background thread
+        raises an exception.  For overriding by concurrent.futures.
+        """
+        import traceback
+        traceback.print_exc()
+

--

___
Python tracker <http://bugs.python.org/issue30414>



[issue30414] multiprocessing.Queue silently ignores messages after exc in _feeder

2017-06-02 Thread Thomas Moreau

Changes by Thomas Moreau :


--
pull_requests: +2001

___
Python tracker <http://bugs.python.org/issue30414>



[issue30414] multiprocessing.Queue silently ignores messages after exc in _feeder

2017-06-02 Thread Thomas Moreau

Thomas Moreau added the comment:

This fix, while preventing the Queue from crashing, does not give any way to
programmatically detect that the message was dropped. This is a problem, as we
can no longer assume that the Queue will not drop messages. For instance, we
can no longer detect deadlocks in concurrent.futures.ProcessPoolExecutor as
done in https://github.com/python/cpython/pull/1013, where the crashed
QueueFeederThread was used to monitor the working state of the executor.

We could either:
- set a flag highlighting the fact that some messages were dropped, or
- add an argument to the Queue to close it on pickling errors.

I'd be happy to work on a PR to implement any solution that you think is
reasonable.

--
nosy: +tomMoral

___
Python tracker <http://bugs.python.org/issue30414>



[issue30006] Deadlocks in `concurrent.futures.ProcessPoolExecutor`

2017-04-06 Thread Thomas Moreau

New submission from Thomas Moreau:

The design of ProcessPoolExecutor contains some possible race conditions that
may freeze the interpreter due to deadlocks. This is notably the case with
pickling and unpickling errors for a submitted job and its returned results.
This makes it hard to reuse a launched executor.

We propose in the attached PR to fix some of those situations, to make the
ProcessPoolExecutor more robust to failures in the different threads and
workers.

--
components: Library (Lib)
messages: 291224
nosy: tomMoral
priority: normal
pull_requests: 1180
severity: normal
status: open
title: Deadlocks in `concurrent.futures.ProcessPoolExecutor`
type: behavior

___
Python tracker <http://bugs.python.org/issue30006>