Thanks for your work, Victor! I occasionally get questions about asyncio's stability on Windows. I am now much more confident.
On Mon, Jan 26, 2015 at 3:38 PM, Victor Stinner <victor.stin...@gmail.com> wrote:
> Hi,
>
> I spent the last weeks fixing issues specific to the Windows
> ProactorEventLoop. Even if the code "was working" in most cases,
> sometimes I noticed strange warnings, bugs or crashes. Good news: all
> known issues are now fixed, and the test suite now passes and is stable!
>
> Please test ProactorEventLoop as much as possible! My changes are
> merged in the development versions of Tulip, Trollius, Python 3.4 and
> Python 3.5.
>
> I added new tests. They should reduce the risk of regressions.
>
> By the way, ProactorEventLoop now supports SSL on Python 3.5 and newer!
>
> --
>
> I'm writing this email to try to keep a trace of the changes that I
> made to fix all these issues.
>
> Major changes:
>
> (1) IocpProactor.connect_pipe() was implemented using a thread which
> could not be interrupted. There were hacks in IocpProactor to
> work around issues related to this. I rewrote the code using
> explicit polling with an increasing delay between 1 ms and 100 ms.
>
> (2) I fixed IocpProactor.accept_pipe(). The function now uses the
> result of ConnectNamedPipe() to decide whether we should register the
> overlapped operation to wait for its completion, or whether it is
> already done. I made a similar change for IocpProactor.recv()
> (ReadFile() now raises an exception on broken pipe error).
>
> (3) I fixed the cancellation of the IocpProactor.wait_for_handle() future.
>
> --
>
> I spent most of my time trying to fix the last issue, the
> cancellation of wait_for_handle(). This issue was annoying because it
> emitted unexpected completions. For example, a process was seen as
> terminated while it was still running. It also sometimes emitted
> "unexpected event" warnings. Sometimes, it simply crashed because
> Windows tried to write to a memory block which had been released. I
> told you, a lot of fun.
>
> The internal machinery of the Windows RegisterWaitForSingleObject()
> function is very complex.
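As an aside on change (1) in the quoted message: polling with a growing delay might look roughly like the sketch below. It assumes the delay doubles after each failed attempt and is capped at 100 ms; the email only gives the 1 ms and 100 ms bounds, so the doubling is a guess. The names `backoff_delays`, `poll_until_ready` and `try_once` are made up for illustration and are not asyncio APIs.

```python
import asyncio

INIT_DELAY = 0.001  # 1 ms, lower bound mentioned in the email
MAX_DELAY = 0.100   # 100 ms, upper bound mentioned in the email


def backoff_delays(initial=INIT_DELAY, maximum=MAX_DELAY):
    """Yield delays that double each time, capped at `maximum`.

    The doubling policy is an assumption; only the bounds come from the email.
    """
    delay = initial
    while True:
        yield delay
        delay = min(delay * 2, maximum)


async def poll_until_ready(try_once, sleep=asyncio.sleep):
    """Call try_once() until it returns something other than None.

    try_once is a hypothetical non-blocking attempt, for example one that
    returns None while the pipe is still busy and a handle once it connects.
    """
    for delay in backoff_delays():
        result = try_once()
        if result is not None:
            return result
        # Cancelling the surrounding task interrupts this sleep, which is
        # the property a blocking worker thread could not offer.
        await sleep(delay)
```

The point of the design is that every wait happens inside the event loop, so cancellation works the usual way; the cost is up to 100 ms of extra latency once the delay has reached its cap.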
>
> Basically, RegisterWaitForSingleObject() is implemented with a
> blocking call which is called in a thread. The annoying point is that
> UnregisterWait() doesn't cancel the wait immediately: it only
> "schedules" the cancellation. This point is not clear in the
> documentation; it took me hours to understand that. Ok, now it becomes
> funnier.
>
> UnregisterWaitEx() exists to be notified when the wait is cancelled:
> an event will be set. Ok, but how can we wait for this notification
> using an IOCP? Using RegisterWaitForSingleObject() again!
>
> What? To cancel a first RegisterWaitForSingleObject(), we have to call
> RegisterWaitForSingleObject() again on a new event? How can we cancel
> the second wait? ... To protect my head against an obvious explosion,
> I decided to deny the cancellation of the second kind of wait :-)
>
> Someone may find a more efficient way to wait for the cancellation of
> the first wait. I don't know Windows internals well enough.
>
> Maybe we should reimplement RegisterWaitForSingleObject() in Python to
> have better control over threads and objects? I don't know yet if it
> would make sense to reimplement it.
>
> --
>
> More details! RegisterWaitForSingleObject() is implemented as a pool
> of threads (500 max. by default). Each thread calls the blocking
> WaitForMultipleObjects() function, which can only wait for 64 objects.
> To be able to interact with these threads, each thread uses a timer
> (so each thread can only wait for 63 objects). It computes the next
> timeout of all registered wait operations. To modify the list of wait
> operations (RegisterWait..., UnregisterWait...), the timer is reset to
> wake up WaitForMultipleObjects(), and so wake up the thread.
>
> Since we are talking about threads, and even a pool of threads, all
> operations are asynchronous. RegisterWaitForSingleObject() may spawn a
> new thread, and UnregisterWait[Ex]() may stop a thread (which has
> nothing to do).
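To put the figures from the paragraphs above together, here is a small back-of-the-envelope calculation. It takes the numbers exactly as quoted (64 objects per WaitForMultipleObjects() call, one slot used by the timer, 500 threads by default) and assumes waits are packed densely onto threads, which the email does not state. The helper names are made up for this note.

```python
import math

MAX_WAIT_OBJECTS = 64    # limit of WaitForMultipleObjects(), per the email
TIMER_SLOTS = 1          # one slot per thread is taken by its wake-up timer
MAX_POOL_THREADS = 500   # default pool limit quoted in the email

WAITS_PER_THREAD = MAX_WAIT_OBJECTS - TIMER_SLOTS


def threads_needed(registered_waits):
    """Minimum number of pool threads, assuming waits are packed densely."""
    return math.ceil(registered_waits / WAITS_PER_THREAD)


def pool_capacity():
    """Upper bound on simultaneous waits under the quoted figures."""
    return WAITS_PER_THREAD * MAX_POOL_THREADS
```

Under those assumptions the 64th registered wait is what pushes the pool to a second thread, and the default pool tops out at 63 * 500 waits.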
>
> FYI it's also possible to use UnregisterWaitEx() in blocking mode.
> It's not interesting in the context of asyncio.
>
> --
>
> Full list of recent IOCP issues in the Tulip and Python bug trackers.
> They are now all closed.
>
> "_WaitHandleFuture.cancel() crash if the wait event was already
> unregistered"
> https://code.google.com/p/tulip/issues/detail?id=195
>
> "_OverlappedFuture.set_result() should clear the its reference to the
> overlapped object"
> https://code.google.com/p/tulip/issues/detail?id=196
>
> "Rewrite IocpProactor.connect_pipe() with non-blocking calls to avoid
> non interruptible QueueUserWorkItem()"
> https://code.google.com/p/tulip/issues/detail?id=197
>
> "Investigate IocpProactor.accept_pipe() special case (don't register
> overlapped)"
> https://code.google.com/p/tulip/issues/detail?id=204
>
> "race condition when cancelling a _WaitHandleFuture"
> http://bugs.python.org/issue23095
>
> "race condition related to IocpProactor.connect_pipe()"
> http://bugs.python.org/issue23293
>
> Victor
>

--
--Guido van Rossum (python.org/~guido)