Alexey Serbin created KUDU-3645:
-----------------------------------
Summary: Do not retry close() syscall on EINTR
Key: KUDU-3645
URL: https://issues.apache.org/jira/browse/KUDU-3645
Project: Kudu
Issue Type: Improvement
Components: client, master, subprocess, test, tserver
Reporter: Alexey Serbin
It's not a good idea to retry close() on EINTR, at least on Linux and
other Unices except for HP-UX. On those systems the descriptor is already
released by the time close() returns EINTR. With multiple threads
concurrently opening and closing files, sockets, etc., a retry might close
a descriptor that another thread has just been given under the same number
and that isn't supposed to be closed yet.
A similar issue has been addressed in the Boost.Beast library
[1|https://github.com/boostorg/beast/issues/1445][2|https://github.com/boostorg/beast/commit/0ce8ebbef].
It's a well-known issue that was discussed a long time ago
[3|https://lwn.net/Articles/576478/]. It's also documented in the close(2)
manual page [4|https://man7.org/linux/man-pages/man2/close.2.html]:
{noformat}
Dealing with error returns from close()
...
The EINTR error is a somewhat special case. Regarding the EINTR
error, POSIX.1-2008 says:
If close() is interrupted by a signal that is to be caught,
it shall return -1 with errno set to EINTR and the state of
fildes is unspecified.
This permits the behavior that occurs on Linux and many other
implementations, where, as with other errors that may be reported
by close(), the file descriptor is guaranteed to be closed.
However, it also permits another possibility: that the
implementation returns an EINTR error and keeps the file
descriptor open. (According to its documentation, HP-UX's close()
does this.) The caller must then once more use close() to close
the file descriptor, to avoid file descriptor leaks. This
divergence in implementation behaviors provides a difficult hurdle
for portable applications, since on many implementations, close()
must not be called again after an EINTR error, and on at least
one, close() must be called again. There are plans to address
this conundrum for the next major release of the POSIX.1 standard.
{noformat}
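To illustrate the intended behavior, here is a minimal sketch of a close wrapper that calls close() exactly once and never loops on EINTR. The helper name CloseOnce is hypothetical and is not taken from the Kudu code base; the sketch assumes Linux-like semantics, where the descriptor is released even when close() fails.

```cpp
#include <cerrno>
#include <unistd.h>

// Hypothetical helper, not Kudu's actual API: close 'fd' exactly once.
// Returns 0 on success, or the errno value reported by close().
//
// Assumption: Linux-like semantics, where the descriptor is released even
// if close() fails with EINTR. Calling close() again could then hit a
// descriptor that another thread has just opened under the same number,
// so there is deliberately no retry loop here. (HP-UX is documented to
// keep the descriptor open on EINTR and would need different handling.)
int CloseOnce(int fd) {
  if (close(fd) == 0) {
    return 0;
  }
  // Save errno right away: the caller decides how to report the error,
  // but must treat 'fd' as no longer valid in either case.
  return errno;
}
```

A caller would log or propagate a non-zero result (including EINTR) instead of calling CloseOnce() again on the same descriptor.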
# https://github.com/boostorg/beast/issues/1445
# https://github.com/boostorg/beast/commit/0ce8ebbef
# https://lwn.net/Articles/576478/
# https://man7.org/linux/man-pages/man2/close.2.html