[
https://issues.apache.org/jira/browse/THRIFT-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Christopher Friedt updated THRIFT-5677:
---------------------------------------
Description:
It would appear that there is a possible file descriptor leak in
TServerSocket.cpp
This was discovered in the final stages of porting Thrift to the Zephyr RTOS.
In the context of integration testing with the ThriftTest.thrift sample, and
specifically for the TSSLServerSocket, I was able to isolate the file
descriptor leak to TServerSocket::close() with some basic printk debugging.
Specifically, pChildInterruptSockReader_.reset() is called without first
closing a possible underlying file descriptor.
As an IoT / microcontroller RTOS, Zephyr typically runs on devices with very
little in the way of resources (RAM, ROM, CPU frequency, etc). Like everything
else, Zephyr's POSIX subsystem has a limit on the maximum number of open file
descriptors.
On large-scale operating systems, leaked file descriptors may be garbage
collected after some time (certainly when a process dies). However, on RTOSes,
that may not be the case.
It's possible that there was some assumption made about socketpair() and
whether closing one side of the socketpair closes the other. It does not, and
there is no stipulation anywhere in the POSIX standard that says that is the
case. To validate that, I wrote a small test for POSIX systems (attached) and
verified that it supports the aforementioned on both Linux and macOS. As the
POSIX subsystem maintainer for Zephyr and author of Zephyr's socketpair()
implementation, I am intimately familiar with the nuances of socketpair() and
related POSIX API.
It would appear that this regression was introduced in commit
1684c429501e9df9387cb518e660691f032d7926 in 2015.
*THIS SHOULD PROBABLY BE PATCHED ASAP AS THRIFT IS TO BE INCLUDED IN ZEPHYRS
V3.3 RELEASE*
I'll make a PR on GitHub and link back to this issue.
CC [~jensg]
was:
It would appear that there is a possible file descriptor leak in
TServerSocket.cpp
This was discovered in the final stages of porting Thrift to the Zephyr RTOS.
In the context of integration testing with the ThriftTest.thrift sample, and
specifically for the TSSLServerSocket, I was able to isolate the file
descriptor leak to TServerSocket::close() with some basic printk debugging.
Specifically, pChildInterruptSockReader_.reset() is called without first
closing a possible underlying file descriptor.
As an IoT / microcontroller RTOS, Zephyr typically runs on devices with very
little in the way of resources (RAM, ROM, CPU frequency, etc). Like everything
else, Zephyr's POSIX subsystem has a limit on the maximum number of open file
descriptors.
On large-scale operating systems, leaked file descriptors may be garbage
collected after some time (certainly when a process dies). However, on RTOSes,
that may not be the case.
It's possible that there was some assumption made about socketpair() and
whether closing one side of the socketpair closes the other. It does not, and
there is no stipulation anywhere in the POSIX standard that says that is the
case. To validate that, I wrote a small test for POSIX systems (attached) and
verified that it supports the aforementioned on both Linux and macOS. As the
POSIX subsystem maintainer for Zephyr and author of Zephyr's socketpair()
implementation, I am intimately familiar with the nuances of socketpair() and
related POSIX API.
It would appear that this regression was introduced in commit
1684c429501e9df9387cb518e660691f032d7926 in 2015.
*THIS SHOULD PROBABLY BE PATCHED ASAP AS THRIFT IS TO BE INCLUDED IN ZEPHYRS
V3.3 RELEASE*
CC [~jensg]
> lib: cpp: transport: file descriptor leak in TServerSocket::close()
> -------------------------------------------------------------------
>
> Key: THRIFT-5677
> URL: https://issues.apache.org/jira/browse/THRIFT-5677
> Project: Thrift
> Issue Type: Bug
> Components: C++ - Library
> Affects Versions: 0.12.0, 0.13.0, 0.14.0, 0.15.0, 0.14.1, 0.14.2, 0.16.0,
> 0.17.0
> Environment: See description above.
> Reporter: Christopher Friedt
> Priority: Blocker
> Labels: file-descriptor-leak, regression
> Attachments: socketpair_close_fcntl_test.c,
> thrift-zephyr-tsocketserver-close-fd-fixed.txt,
> thrift-zephyr-tsocketserver-close-fd-leak.patch,
> thrift-zephyr-tsocketserver-close-fd-leak.txt
>
>
> It would appear that there is a possible file descriptor leak in
> TServerSocket.cpp
> This was discovered in the final stages of porting Thrift to the Zephyr RTOS.
> In the context of integration testing with the ThriftTest.thrift sample, and
> specifically for the TSSLServerSocket, I was able to isolate the file
> descriptor leak to TServerSocket::close() with some basic printk debugging.
> Specifically, pChildInterruptSockReader_.reset() is called without first
> closing a possible underlying file descriptor.
> As an IoT / microcontroller RTOS, Zephyr typically runs on devices with very
> little in the way of resources (RAM, ROM, CPU frequency, etc). Like
> everything else, Zephyr's POSIX subsystem has a limit on the maximum number
> of open file descriptors.
> On large-scale operating systems, leaked file descriptors may be garbage
> collected after some time (certainly when a process dies). However, on
> RTOSes, that may not be the case.
> It's possible that there was some assumption made about socketpair() and
> whether closing one side of the socketpair closes the other. It does not, and
> there is no stipulation anywhere in the POSIX standard that says that is the
> case. To validate that, I wrote a small test for POSIX systems (attached) and
> verified that it supports the aforementioned on both Linux and macOS. As the
> POSIX subsystem maintainer for Zephyr and author of Zephyr's socketpair()
> implementation, I am intimately familiar with the nuances of socketpair() and
> related POSIX API.
> It would appear that this regression was introduced in commit
> 1684c429501e9df9387cb518e660691f032d7926 in 2015.
> *THIS SHOULD PROBABLY BE PATCHED ASAP AS THRIFT IS TO BE INCLUDED IN ZEPHYRS
> V3.3 RELEASE*
> I'll make a PR on GitHub and link back to this issue.
> CC [~jensg]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)