Thanks for your work, Victor! I occasionally get questions about asyncio's
stability on Windows. I am now much more confident.

On Mon, Jan 26, 2015 at 3:38 PM, Victor Stinner <victor.stin...@gmail.com>
wrote:

> Hi,
>
> I spent the last few weeks fixing issues specific to the Windows
> ProactorEventLoop. Even if the code "was working" in most cases,
> I sometimes noticed strange warnings, bugs or crashes. Good news: all
> known issues are now fixed, and the test suite now passes and is stable!
>
> Please test ProactorEventLoop as much as possible! My changes are
> merged in the development versions of Tulip, Trollius, Python 3.4 and
> Python 3.5.
>
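For anyone who wants to follow up on the request to test, here is a
minimal sketch of running a coroutine on ProactorEventLoop. The helper
names (new_test_loop, smoke_test, run_smoke_test) are hypothetical, and
the fallback to the default loop on non-Windows platforms is an
assumption added so that the snippet runs everywhere. It uses
async/await syntax, which requires a newer Python than the versions
named above.

```python
import asyncio
import sys


def new_test_loop():
    """Return a ProactorEventLoop on Windows, a default loop elsewhere."""
    if sys.platform == "win32":
        # asyncio.ProactorEventLoop only exists on Windows.
        return asyncio.ProactorEventLoop()
    return asyncio.new_event_loop()


async def smoke_test():
    # Hypothetical placeholder: replace with pipe, subprocess or socket
    # code that you want to exercise on the proactor loop.
    await asyncio.sleep(0)
    return "ok"


def run_smoke_test():
    loop = new_test_loop()
    try:
        return loop.run_until_complete(smoke_test())
    finally:
        loop.close()
```

Calling run_smoke_test() drives the coroutine to completion and closes
the loop afterwards.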
> I added new tests. It should reduce the risk of regression.
>
> By the way, ProactorEventLoop now supports SSL on Python 3.5 and newer!
>
> --
>
> I'm writing this email to keep a record of the changes that I
> made to fix all these issues.
>
> Major changes:
>
> (1) IocpProactor.connect_pipe() was implemented using a thread which
> could not be interrupted. There were hacks in IocpProactor to
> work around issues related to this. I rewrote the code using
> explicit polling with an increasing delay between 1 ms and 100 ms.
>
> (2) I fixed IocpProactor.accept_pipe(). The function now uses the
> result of ConnectNamedPipe() to decide whether we should register the
> overlapped operation to wait for its completion, or whether it is
> already done. I made a similar change for IocpProactor.recv()
> (ReadFile() now raises an exception on broken pipe error).
>
> (3) I fixed the cancellation of the IocpProactor.wait_for_handle() future.
>
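The polling approach described in (1) can be sketched roughly as
follows. This is an illustration, not the actual IocpProactor code: the
names backoff_delays, poll_until_ready and try_once are hypothetical,
and doubling the delay at each step is an assumption (the message only
says the delay increases between 1 ms and 100 ms).

```python
import asyncio


def backoff_delays(init_delay=0.001, max_delay=0.100):
    """Yield sleep delays that grow from init_delay up to max_delay."""
    delay = init_delay
    while True:
        yield delay
        # Assumption: the delay doubles each round, capped at max_delay.
        delay = min(delay * 2, max_delay)


async def poll_until_ready(try_once, init_delay=0.001, max_delay=0.100):
    """Call try_once() until it returns something other than None.

    try_once is a hypothetical non-blocking attempt (for example, an
    attempt to open a pipe) that returns None while the resource is busy.
    """
    for delay in backoff_delays(init_delay, max_delay):
        result = try_once()
        if result is not None:
            return result
        # asyncio.sleep() is a cancellation point, so unlike a blocking
        # call in a worker thread this loop can be interrupted.
        await asyncio.sleep(delay)
```

Because each wait goes through asyncio.sleep(), cancelling the task
that runs poll_until_ready() stops the polling at the next sleep.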
> --
>
> I spent most of my time trying to fix the last issue, the
> cancellation of wait_for_handle(). This issue was annoying because it
> emitted unexpected completions. For example, a process was seen as
> terminated while it was still running. It also sometimes emitted
> "unexpected event" warnings. Sometimes, it simply crashed because
> Windows tried to write to a memory block which had been released. I
> told you, a lot of fun.
>
> The internal machinery of the Windows RegisterWaitForSingleObject()
> function is very complex.
>
> Basically, RegisterWaitForSingleObject() is implemented with a
> blocking call which is called in a thread. The annoying point is that
> UnregisterWait() doesn't cancel the wait immediately: it only
> "schedules" the cancellation. This point is not clear in the
> documentation; it took me hours to understand that. Ok, now it gets
> funnier.
>
> UnregisterWaitEx() exists to be notified when the wait is cancelled:
> an event will be set. Ok, but how can we wait for this notification
> using an IOCP? Using RegisterWaitForSingleObject() again!
>
> What? To cancel a first RegisterWaitForSingleObject(), we have to call
> RegisterWaitForSingleObject() again on a new event? How can we cancel
> the second wait? ... To protect my head against an obvious explosion,
> I decided to deny the cancellation of the second kind of wait :-)
>
> Someone may find a more efficient way to wait for the cancellation of
> the first wait. I don't know Windows internals well enough.
>
> Maybe we should reimplement RegisterWaitForSingleObject() in Python to
> have better control over threads and objects? I don't know yet if it
> would make sense to reimplement it.
>
> --
>
> More details! RegisterWaitForSingleObject() is implemented as a pool
> of threads (500 max. by default). Each thread calls the blocking
> WaitForMultipleObjects() function, which can only wait for 64 objects.
> To be able to interact with these threads, each thread uses a timer
> (so each thread can only wait for 63 objects). It computes the next
> timeout of all registered wait operations. To modify the list of wait
> operations (RegisterWait..., UnregisterWait...), the timer is reset to
> wake up WaitForMultipleObjects(), and so wake up the thread.
>
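As a back-of-the-envelope check of the numbers above, the following
sketch computes how many pool threads a given number of registered
waits would need. The constant and function names are hypothetical, and
the model (63 usable slots per thread, 500 threads at most) is taken
only from the description in this message; the real Windows thread pool
may behave differently.

```python
import math

MAX_WAIT_OBJECTS = 64                    # WaitForMultipleObjects() limit
WAITS_PER_THREAD = MAX_WAIT_OBJECTS - 1  # one slot is used by the timer
MAX_POOL_THREADS = 500                   # default maximum quoted above


def threads_needed(n_waits):
    """Number of wait threads needed for n_waits registered waits."""
    return math.ceil(n_waits / WAITS_PER_THREAD)


def pool_capacity():
    """Upper bound on simultaneous waits under this simplified model."""
    return MAX_POOL_THREADS * WAITS_PER_THREAD
```

Under these assumptions, the 64th registered wait is the one that
forces a second thread to be started.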
> Since we are talking about threads, and even a pool of threads, all
> operations are asynchronous. RegisterWaitForSingleObject() may spawn a
> new thread, and UnregisterWait[Ex]() may stop a thread (one which has
> nothing left to do).
>
> FYI it's also possible to use UnregisterWaitEx() in blocking mode.
> It's not interesting in the context of asyncio.
>
> --
>
> Full list of recent IOCP issues in the Tulip and Python bug trackers.
> They are now all closed.
>
> "_WaitHandleFuture.cancel() crash if the wait event was already
> unregistered"
> https://code.google.com/p/tulip/issues/detail?id=195
>
> "_OverlappedFuture.set_result() should clear the its reference to the
> overlapped object"
> https://code.google.com/p/tulip/issues/detail?id=196
>
> "Rewrite IocpProactor.connect_pipe() with non-blocking calls to avoid
> non interruptible QueueUserWorkItem()"
> https://code.google.com/p/tulip/issues/detail?id=197
>
> "Investigate IocpProactor.accept_pipe() special case (don't register
> overlapped)"
> https://code.google.com/p/tulip/issues/detail?id=204
>
> "race condition when cancelling a _WaitHandleFuture"
> http://bugs.python.org/issue23095
>
> "race condition related to IocpProactor.connect_pipe()"
> http://bugs.python.org/issue23293
>
> Victor
>



-- 
--Guido van Rossum (python.org/~guido)
