RE: Proposal: Asynchronous IO on Windows

2018-04-12 Thread Akash Gupta (EOSG)
Hi Ben,

Thanks for your comments!

To answer your question: we can call SetThreadpoolThreadMinimum(pool, 1) and
SetThreadpoolThreadMaximum(pool, 1) to get a persistent, single-threaded
"windows event loop" that handles the IO and timer callbacks. This way, we
avoid the thread pool autoscaling beyond the number of CPUs, so that the
libprocess worker threads don't get starved. We won't be using the work queue
aspect of the thread pool.
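
A minimal sketch of what this could look like, assuming a private pool
created with CreateThreadpool and a single one-shot timer. The names
fireTimerOnSingleThreadPool, TimerState and onTimer are made up for
illustration and are not taken from the design doc or the Mesos tree. IO
callbacks would presumably be registered against the same callback
environment via CreateThreadpoolIo.

```cpp
#ifdef _WIN32  // Win32-only sketch; compiles to nothing elsewhere.
#include <windows.h>

// Hypothetical state shared between the caller and the timer callback.
struct TimerState
{
  HANDLE done;     // Signaled once the callback has run.
  DWORD threadId;  // Id of the pool thread that ran the callback.
};

// Timer callback; it runs on the pool's only thread.
static VOID CALLBACK onTimer(PTP_CALLBACK_INSTANCE, PVOID context, PTP_TIMER)
{
  TimerState* state = static_cast<TimerState*>(context);
  state->threadId = GetCurrentThreadId();
  SetEvent(state->done);
}

// Creates a private pool pinned to exactly one thread, fires a one-shot
// timer on it after `delayMs`, and returns true if the callback ran.
bool fireTimerOnSingleThreadPool(DWORD delayMs)
{
  TimerState state = {CreateEvent(nullptr, TRUE, FALSE, nullptr), 0};
  if (state.done == nullptr) return false;

  PTP_POOL pool = CreateThreadpool(nullptr);
  if (pool == nullptr) {
    CloseHandle(state.done);
    return false;
  }

  // Min == max == 1: the pool never grows or shrinks.
  SetThreadpoolThreadMaximum(pool, 1);
  bool ok = SetThreadpoolThreadMinimum(pool, 1) != FALSE;

  // Callbacks created with this environment run on `pool` rather than
  // on the process-wide default pool.
  TP_CALLBACK_ENVIRON env;
  InitializeThreadpoolEnvironment(&env);
  SetThreadpoolCallbackPool(&env, pool);

  PTP_TIMER timer = ok ? CreateThreadpoolTimer(onTimer, &state, &env) : nullptr;
  if (timer != nullptr) {
    // A negative due time is relative, in 100 ns units.
    ULARGE_INTEGER due;
    due.QuadPart =
      static_cast<ULONGLONG>(-(static_cast<LONGLONG>(delayMs) * 10000));
    FILETIME ft;
    ft.dwLowDateTime = due.LowPart;
    ft.dwHighDateTime = due.HighPart;
    SetThreadpoolTimer(timer, &ft, 0, 0);  // One-shot, no coalescing window.

    ok = WaitForSingleObject(state.done, 5000) == WAIT_OBJECT_0;

    // Disarm, drain any in-flight callback, then release the timer.
    SetThreadpoolTimer(timer, nullptr, 0, 0);
    WaitForThreadpoolTimerCallbacks(timer, TRUE);
    CloseThreadpoolTimer(timer);
  } else {
    ok = false;
  }

  DestroyThreadpoolEnvironment(&env);
  CloseThreadpool(pool);
  CloseHandle(state.done);
  return ok;
}
#endif  // _WIN32
```

Nothing here calls SubmitThreadpoolWork, so the work queue side of the API
stays unused; only timer (and, by extension, IO) callbacks land on the
single thread.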

Thanks,
Akash

-----Original Message-----
From: Benjamin Mahler <bmah...@apache.org> 
Sent: Thursday, April 12, 2018 6:30 PM
To: dev <dev@mesos.apache.org>
Subject: Re: Proposal: Asynchronous IO on Windows

Thanks for writing this up and exploring the different options, Akash!

I left some comments in the doc. It seems to me the windows thread pool API is 
a mix of "event" processing (timers, i/o), as well as a work queue. Since 
libprocess already provides a work queue via `Process`es, there's some overlap 
there. I assume that using the event processing subset of the windows thread 
pool API along with just 1 fixed thread is essentially equivalent to having a 
"windows event loop"? We won't be using the work queue aspect of the windows 
thread pool, right?

On Thu, Apr 12, 2018 at 11:58 AM, Akash Gupta (EOSG) < 
aka...@microsoft.com.invalid> wrote:

> Hi all,
>
> A few weeks ago, we found serious issues with the current asynchronous 
> IO implementation on Windows. The two eventing libraries in Mesos 
> (libevent and libev) use `select` on Windows, which only supports
> sockets there. In fact, both of these libraries typedef their socket type
> as SOCKET, so passing in an arbitrary file handle should not even
> compile. Essentially, they aren't suitable for general-purpose
> asynchronous IO on Windows.
>
> This bug wasn't found earlier for a number of reasons. Mesos has a
> `WindowsFD` class that multiplexes and demultiplexes the different
> Windows file types (HANDLE & SOCKET) into a single type in order to
> work similarly to UNIX platforms that use `int` for any type of file
> descriptor. Since WindowsFD is castable to a SOCKET, there were no
> compile errors for using HANDLEs in libevent. Furthermore, none of the
> Windows HANDLEs were opened in asynchronous mode, so they were always
> blocking. This means that currently, any non-socket IO in Mesos blocks
> on Windows, so we never got runtime errors for sending arbitrary handles
> to libevent's event loop.
> Also, some of the unit tests that would catch this blocking behavior,
> such as those in io_tests.cpp, were disabled, so it was never caught in
> the unit tests.
>
> We wrote up a proposal on implementing asynchronous IO on Windows. The 
> proposal is split into two parts that focus on stout and libprocess 
> changes. The stout changes focus on opening and using asynchronous 
> handles in the stout IO implementations. The libprocess changes focus 
> on replacing libevent with another eventing library. We propose using 
> the Windows Threadpool library, which is a native Win32 API that works 
> like an event loop by allowing the user to schedule asynchronous 
> events. Both Mesos and Windows use the proactor IO pattern, so they
> map very cleanly. We prefer it over other asynchronous libraries like
> libuv and ASIO, since they have some issues mentioned in the design
> proposal, such as missing features due to supporting older Windows
> versions. However, we understand the maintenance burden of adding
> another library, so we're looking for feedback on the design proposal.
>
> Link to JIRA issue: https://issues.apache.org/jira/browse/MESOS-8668
>
> Link to design doc: https://docs.google.com/document/d/1VG_8FTpWHiC7pKPoH4e-Yp7IFvAm-2wcFuk63lYByqo/edit?usp=sharing
>
> Thanks,
> Akash
>


Re: Proposal: Asynchronous IO on Windows

2018-04-12 Thread Benjamin Mahler
Thanks for writing this up and exploring the different options, Akash!

I left some comments in the doc. It seems to me the windows thread pool API
is a mix of "event" processing (timers, i/o), as well as a work queue. Since
libprocess already provides a work queue via `Process`es, there's some
overlap there. I assume that using the event processing subset of the
windows thread pool API along with just 1 fixed thread is essentially
equivalent to having a "windows event loop"? We won't be using the work
queue aspect of the windows thread pool, right?

On Thu, Apr 12, 2018 at 11:58 AM, Akash Gupta (EOSG) <
aka...@microsoft.com.invalid> wrote:

> Hi all,
>
> A few weeks ago, we found serious issues with the current asynchronous IO
> implementation on Windows. The two eventing libraries in Mesos (libevent
> and libev) use `select` on Windows, which only supports sockets there. In
> fact, both of these libraries typedef their socket type as SOCKET, so
> passing in an arbitrary file handle should not even compile. Essentially,
> they aren't suitable for general-purpose asynchronous IO on Windows.
>
> This bug wasn't found earlier for a number of reasons. Mesos has a
> `WindowsFD` class that multiplexes and demultiplexes the different Windows
> file types (HANDLE & SOCKET) into a single type in order to work similarly
> to UNIX platforms that use `int` for any type of file descriptor. Since
> WindowsFD is castable to a SOCKET, there were no compile errors for using
> HANDLEs in libevent. Furthermore, none of the Windows HANDLEs were opened
> in asynchronous mode, so they were always blocking. This means that
> currently, any non-socket IO in Mesos blocks on Windows, so we never got
> runtime errors for sending arbitrary handles to libevent's event loop.
> Also, some of the unit tests that would catch this blocking behavior, such
> as those in io_tests.cpp, were disabled, so it was never caught in the unit
> tests.
>
> We wrote up a proposal on implementing asynchronous IO on Windows. The
> proposal is split into two parts that focus on stout and libprocess
> changes. The stout changes focus on opening and using asynchronous handles
> in the stout IO implementations. The libprocess changes focus on replacing
> libevent with another eventing library. We propose using the Windows
> Threadpool library, which is a native Win32 API that works like an event
> loop by allowing the user to schedule asynchronous events. Both Mesos and
> Windows use the proactor IO pattern, so they map very cleanly. We prefer
> it over other asynchronous libraries like libuv and ASIO, since they have
> some issues mentioned in the design proposal, such as missing features due
> to supporting older Windows versions. However, we understand the
> maintenance burden of adding another library, so we're looking for feedback
> on the design proposal.
>
> Link to JIRA issue: https://issues.apache.org/jira/browse/MESOS-8668
>
> Link to design doc: https://docs.google.com/document/d/1VG_8FTpWHiC7pKPoH4e-Yp7IFvAm-2wcFuk63lYByqo/edit?usp=sharing
>
> Thanks,
> Akash
>
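
The implicit-conversion problem described in the quoted proposal can be
illustrated with a small portable sketch. Everything below is hypothetical:
FdSketch, SocketSketch, HandleSketch, registerWithSelect and registerChecked
are stand-ins invented for this example and do not mirror the real
`WindowsFD` class, the Win32 type definitions, or libevent's API.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the Win32 SOCKET and HANDLE types.
using SocketSketch = std::uintptr_t;
using HandleSketch = void*;

// One descriptor type that can hold either kind, in the spirit of the
// `WindowsFD` description in the proposal.
class FdSketch
{
public:
  enum class Kind { Handle, Socket };

  explicit FdSketch(HandleSketch h)
    : kind_(Kind::Handle), handle_(h), socket_(0) {}
  explicit FdSketch(SocketSketch s)
    : kind_(Kind::Socket), handle_(nullptr), socket_(s) {}

  Kind kind() const { return kind_; }

  // The hazard: this conversion compiles for either kind, so a file
  // handle can be passed wherever a socket is expected.
  operator SocketSketch() const
  {
    return kind_ == Kind::Socket
      ? socket_
      : reinterpret_cast<SocketSketch>(handle_);
  }

private:
  Kind kind_;
  HandleSketch handle_;
  SocketSketch socket_;
};

// Stands in for a socket-only API such as a `select`-based event loop.
// It cannot tell that it was handed a file handle.
bool registerWithSelect(SocketSketch) { return true; }

// A checked wrapper that rejects non-socket descriptors at run time.
bool registerChecked(const FdSketch& fd)
{
  if (fd.kind() != FdSketch::Kind::Socket) {
    return false;
  }
  return registerWithSelect(fd);
}
```

With these definitions, passing an FdSketch that wraps a file handle to
registerWithSelect compiles without any diagnostic, whereas registerChecked
refuses it. Making the conversion operator `explicit` would be another way
to turn the silent mismatch into a compile error.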