[ 
https://issues.apache.org/jira/browse/THRIFT-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christopher Friedt updated THRIFT-5677:
---------------------------------------
    Description: 
It would appear that there is a possible file descriptor leak in 
TServerSocket.cpp

This was discovered in the final stages of porting Thrift to the Zephyr RTOS.

In the context of integration testing with the ThriftTest.thrift sample, and 
specifically for the TSSLServerSocket, I was able to isolate the file 
descriptor leak to TServerSocket::close() with some basic printk debugging. 
Specifically, pChildInterruptSockReader_.reset() is called without first 
closing a possible underlying file descriptor.

As an IoT / microcontroller RTOS, Zephyr typically runs on devices with very 
little in the way of resources (RAM, ROM, CPU frequency, etc). Like everything 
else, Zephyr's POSIX subsystem has a limit on the maximum number of open file 
descriptors.

On large-scale operating systems, leaked file descriptors may be garbage 
collected after some time (certainly when a process dies). However, on RTOSes, 
that may not be the case.

It's possible that there was some assumption made about socketpair() and 
whether closing one side of the socketpair closes the other. It does not, and 
there is no stipulation anywhere in the POSIX standard that says that is the 
case. To validate that, I wrote a small test for POSIX systems (attached) and 
verified that it supports the aforementioned on both Linux and macOS. As the 
POSIX subsystem maintainer for Zephyr and author of Zephyr's socketpair() 
implementation, I am intimately familiar with the nuances of socketpair() and 
related POSIX API.

It would appear that this regression was introduced in commit 
1684c429501e9df9387cb518e660691f032d7926 in 2015.

*THIS SHOULD PROBABLY BE PATCHED ASAP AS THRIFT IS TO BE INCLUDED IN ZEPHYRS 
V3.3 RELEASE*

I'll make a PR on GitHub and link back to this issue.

CC [~jensg] 

  was:
It would appear that there is a possible file descriptor leak in 
TServerSocket.cpp

This was discovered in the final stages of porting Thrift to the Zephyr RTOS.

In the context of integration testing with the ThriftTest.thrift sample, and 
specifically for the TSSLServerSocket, I was able to isolate the file 
descriptor leak to TServerSocket::close() with some basic printk debugging. 
Specifically, pChildInterruptSockReader_.reset() is called without first 
closing a possible underlying file descriptor.

As an IoT / microcontroller RTOS, Zephyr typically runs on devices with very 
little in the way of resources (RAM, ROM, CPU frequency, etc). Like everything 
else, Zephyr's POSIX subsystem has a limit on the maximum number of open file 
descriptors.

On large-scale operating systems, leaked file descriptors may be garbage 
collected after some time (certainly when a process dies). However, on RTOSes, 
that may not be the case.

It's possible that there was some assumption made about socketpair() and 
whether closing one side of the socketpair closes the other. It does not, and 
there is no stipulation anywhere in the POSIX standard that says that is the 
case. To validate that, I wrote a small test for POSIX systems (attached) and 
verified that it supports the aforementioned on both Linux and macOS. As the 
POSIX subsystem maintainer for Zephyr and author of Zephyr's socketpair() 
implementation, I am intimately familiar with the nuances of socketpair() and 
related POSIX API.

It would appear that this regression was introduced in commit 
1684c429501e9df9387cb518e660691f032d7926 in 2015.

*THIS SHOULD PROBABLY BE PATCHED ASAP AS THRIFT IS TO BE INCLUDED IN ZEPHYRS 
V3.3 RELEASE*

CC [~jensg] 


> lib: cpp: transport: file descriptor leak in TServerSocket::close()
> -------------------------------------------------------------------
>
>                 Key: THRIFT-5677
>                 URL: https://issues.apache.org/jira/browse/THRIFT-5677
>             Project: Thrift
>          Issue Type: Bug
>          Components: C++ - Library
>    Affects Versions: 0.12.0, 0.13.0, 0.14.0, 0.15.0, 0.14.1, 0.14.2, 0.16.0, 
> 0.17.0
>         Environment: See description above.
>            Reporter: Christopher Friedt
>            Priority: Blocker
>              Labels: file-descriptor-leak, regression
>         Attachments: socketpair_close_fcntl_test.c, 
> thrift-zephyr-tsocketserver-close-fd-fixed.txt, 
> thrift-zephyr-tsocketserver-close-fd-leak.patch, 
> thrift-zephyr-tsocketserver-close-fd-leak.txt
>
>
> It would appear that there is a possible file descriptor leak in 
> TServerSocket.cpp
> This was discovered in the final stages of porting Thrift to the Zephyr RTOS.
> In the context of integration testing with the ThriftTest.thrift sample, and 
> specifically for the TSSLServerSocket, I was able to isolate the file 
> descriptor leak to TServerSocket::close() with some basic printk debugging. 
> Specifically, pChildInterruptSockReader_.reset() is called without first 
> closing a possible underlying file descriptor.
> As an IoT / microcontroller RTOS, Zephyr typically runs on devices with very 
> little in the way of resources (RAM, ROM, CPU frequency, etc). Like 
> everything else, Zephyr's POSIX subsystem has a limit on the maximum number 
> of open file descriptors.
> On large-scale operating systems, leaked file descriptors may be garbage 
> collected after some time (certainly when a process dies). However, on 
> RTOSes, that may not be the case.
> It's possible that there was some assumption made about socketpair() and 
> whether closing one side of the socketpair closes the other. It does not, and 
> there is no stipulation anywhere in the POSIX standard that says that is the 
> case. To validate that, I wrote a small test for POSIX systems (attached) and 
> verified that it supports the aforementioned on both Linux and macOS. As the 
> POSIX subsystem maintainer for Zephyr and author of Zephyr's socketpair() 
> implementation, I am intimately familiar with the nuances of socketpair() and 
> related POSIX API.
> It would appear that this regression was introduced in commit 
> 1684c429501e9df9387cb518e660691f032d7926 in 2015.
> *THIS SHOULD PROBABLY BE PATCHED ASAP AS THRIFT IS TO BE INCLUDED IN ZEPHYRS 
> V3.3 RELEASE*
> I'll make a PR on GitHub and link back to this issue.
> CC [~jensg] 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to