Thanks for writing this up and exploring the different options, Akash!

I left some comments in the doc. It seems to me the Windows thread pool API
is a mix of "event" processing (timers, I/O) and a work queue. Since
libprocess already provides a work queue via `Process`es, there's some
overlap there. I assume that using the event processing subset of the
Windows thread pool API along with just one fixed thread is essentially
equivalent to having a "Windows event loop"? We won't be using the work
queue aspect of the Windows thread pool, right?

On Thu, Apr 12, 2018 at 11:58 AM, Akash Gupta (EOSG) <
aka...@microsoft.com.invalid> wrote:

> Hi all,
>
> A few weeks ago, we found serious issues with the current asynchronous IO
> implementation on Windows. The two eventing libraries in Mesos (libevent
> and libev) use `select` on Windows, which is socket-only there. In
> fact, both of these libraries typedef their socket type as SOCKET, so
> passing in an arbitrary file handle should not even compile. Essentially,
> they aren't suitable for general purpose asynchronous IO on Windows.
>
> This bug wasn't found earlier for a number of reasons. Mesos has a
> `WindowsFD` class that multiplexes and demultiplexes the different Windows
> file types (HANDLE & SOCKET) into a single type in order to work similarly
> to UNIX platforms that use `int` for any type of file descriptor. Since
> WindowsFD is castable to a SOCKET, there were no compile errors for using
> HANDLEs in libevent. Furthermore, none of the Windows HANDLEs were opened
> in asynchronous mode, so they were always blocking. This means that
> currently, any non-socket IO in Mesos blocks on Windows, so we never got
> runtime errors for sending arbitrary handles to libevent's event loop.
> Also, some of the unit tests that would catch this blocking behavior, such
> as those in io_tests.cpp, were disabled, so the unit tests never caught it.
>
> We wrote up a proposal on implementing asynchronous IO on Windows. The
> proposal is split into two parts that focus on stout and libprocess
> changes. The stout changes focus on opening and using asynchronous handles
> in the stout IO implementations. The libprocess changes focus on replacing
> libevent with another eventing library. We propose using the Windows
> Threadpool library, which is a native Win32 API that works like an event
> loop by allowing the user to schedule asynchronous events. Both Mesos and
> Windows use the proactor IO pattern, so they map very cleanly. We prefer
> it over other asynchronous libraries like libuv and ASIO, since they have
> some issues mentioned in the design proposal, such as missing features due
> to supporting older Windows versions. However, we understand the
> maintenance burden of adding another library, so we're looking for feedback
> on the design proposal.
>
> Link to JIRA issue: https://issues.apache.org/jira/browse/MESOS-8668
>
> Link to design doc: https://docs.google.com/document/d/1VG_8FTpWHiC7pKPoH4e-Yp7IFvAm-2wcFuk63lYByqo/edit?usp=sharing
>
> Thanks,
> Akash
>
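
To make the `WindowsFD` point in the quoted message concrete, here is a minimal, portable sketch of how an implicit conversion to the socket type can let a file handle reach a socket-only API without a compile error. Everything in it is hypothetical and for illustration only: `FakeHandle`, `FakeSocket`, `FdSketch`, `register_socket_only` and `register_checked` are stand-ins, not the real Win32 types, the real Mesos `WindowsFD`, or any libevent call.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the Win32 types so the sketch builds anywhere.
using FakeHandle = void*;           // plays the role of HANDLE
using FakeSocket = std::uintptr_t;  // plays the role of SOCKET

// Simplified model of a tagged descriptor wrapper (assumed shape, not the
// actual Mesos class).
class FdSketch {
public:
  enum class Kind { File, Socket };

  explicit FdSketch(FakeHandle handle)
    : kind_(Kind::File), handle_(handle), socket_(0) {}

  explicit FdSketch(FakeSocket socket)
    : kind_(Kind::Socket), handle_(nullptr), socket_(socket) {}

  Kind kind() const { return kind_; }

  // The implicit conversion is the important part: any FdSketch, including
  // one that wraps a file handle, can be passed where a socket is expected.
  operator FakeSocket() const {
    return kind_ == Kind::Socket
      ? socket_
      : reinterpret_cast<FakeSocket>(handle_);
  }

private:
  Kind kind_;
  FakeHandle handle_;
  FakeSocket socket_;
};

// Stand-in for a socket-only registration function in an eventing library.
// It cannot tell that the value it received was never a socket.
inline bool register_socket_only(FakeSocket) { return true; }

// One possible guard: check the tag before converting.
inline bool register_checked(const FdSketch& fd) {
  if (fd.kind() != FdSketch::Kind::Socket) {
    return false;  // refuse non-socket descriptors
  }
  return register_socket_only(fd);
}
```

In this model, `register_socket_only(file_fd)` compiles and appears to succeed even when `file_fd` wraps a file handle, which mirrors the missing compile errors described above. `register_checked` rejects it at runtime instead.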
