Benjamin Mahler created MESOS-10123:
---------------------------------------

Summary: Windows overlapped IO discard handling can drop data.
Key: MESOS-10123
URL: https://issues.apache.org/jira/browse/MESOS-10123
Project: Mesos
Issue Type: Bug
Components: libprocess
Reporter: Benjamin Mahler
Assignee: Benjamin Mahler

When a discard is requested for an IO operation on Windows, a cancellation is requested [1]. When the IO operation completes, we check whether the future had a discard request to decide whether to discard it [2]:

{code}
template <typename T>
static void set_io_promise(Promise<T>* promise, const T& data, DWORD error)
{
  if (promise->future().hasDiscard()) {
    promise->discard();
  } else if (error == ERROR_SUCCESS) {
    promise->set(data);
  } else {
    promise->fail("IO failed with error code: " + WindowsError(error).message);
  }
}
{code}

However, the operation may have completed successfully, in which case the cancellation did not succeed. Discarding the promise at that point drops the data from an IO operation that actually completed. We need to check for {{ERROR_OPERATION_ABORTED}} [3]:

{code}
template <typename T>
static void set_io_promise(Promise<T>* promise, const T& data, DWORD error)
{
  if (promise->future().hasDiscard() && error == ERROR_OPERATION_ABORTED) {
    promise->discard();
  } else if (error == ERROR_SUCCESS) {
    promise->set(data);
  } else {
    promise->fail("IO failed with error code: " + WindowsError(error).message);
  }
}
{code}

I don't think this issue currently has any major consequences, since most callers discard only when they are essentially abandoning the entire process of reading or writing.

[1] https://github.com/apache/mesos/blob/1.9.0/3rdparty/libprocess/src/windows/libwinio.cpp#L448
[2] https://github.com/apache/mesos/blob/1.9.0/3rdparty/libprocess/src/windows/libwinio.cpp#L141-L151
[3] https://docs.microsoft.com/en-us/windows/win32/fileio/cancelioex-func
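
To make the difference between the two versions of set_io_promise concrete, here is a minimal, portable sketch of just the decision logic. It is not libprocess code: the Outcome enum, resolveBefore, resolveAfter and the k-prefixed constants are hypothetical names introduced for illustration. The constants stand in for ERROR_SUCCESS and ERROR_OPERATION_ABORTED so the sketch compiles without the Windows headers.

```cpp
#include <cstdint>

// Portable stand-ins for the Win32 error codes (assumption: used here only
// so the sketch builds on any platform; values match winerror.h).
constexpr std::uint32_t kErrorSuccess = 0;             // ERROR_SUCCESS
constexpr std::uint32_t kErrorOperationAborted = 995;  // ERROR_OPERATION_ABORTED

// Hypothetical summary of what happens to the promise.
enum class Outcome { Discarded, Set, Failed };

// Logic as described for the current code: a pending discard request always
// wins, even if the IO completed before the cancellation took effect.
constexpr Outcome resolveBefore(bool hasDiscard, std::uint32_t error) {
  return hasDiscard ? Outcome::Discarded
       : error == kErrorSuccess ? Outcome::Set
       : Outcome::Failed;
}

// Proposed logic: discard only when the operation was actually aborted.
constexpr Outcome resolveAfter(bool hasDiscard, std::uint32_t error) {
  return (hasDiscard && error == kErrorOperationAborted) ? Outcome::Discarded
       : error == kErrorSuccess ? Outcome::Set
       : Outcome::Failed;
}

// The problematic case: discard requested, but the IO succeeded anyway.
static_assert(resolveBefore(true, kErrorSuccess) == Outcome::Discarded,
              "current logic drops the completed data");
static_assert(resolveAfter(true, kErrorSuccess) == Outcome::Set,
              "proposed logic delivers the completed data");

// Cancellation took effect: both versions discard.
static_assert(resolveBefore(true, kErrorOperationAborted) == Outcome::Discarded,
              "aborted after discard request is discarded");
static_assert(resolveAfter(true, kErrorOperationAborted) == Outcome::Discarded,
              "aborted after discard request is discarded");
```

One side effect visible in this sketch: with the proposed check, a discard request combined with any error other than ERROR_OPERATION_ABORTED now fails the promise instead of discarding it.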