[jira] [Commented] (KAFKA-16765) NioEchoServer leaks accepted SocketChannel instances due to race condition
[ https://issues.apache.org/jira/browse/KAFKA-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17860996#comment-17860996 ]

zhengke zhou commented on KAFKA-16765:
--

[Greg Harris|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=gharris1727] In this ticket, we want to avoid the situation where the producer (acceptorThread) creates a channel and adds it to a list of SocketChannels that the main thread will never consume. To resolve this, we can run the closeSocketChannels cleanup after acceptorThread has exited.

> NioEchoServer leaks accepted SocketChannel instances due to race condition
> --
>
> Key: KAFKA-16765
> URL: https://issues.apache.org/jira/browse/KAFKA-16765
> Project: Kafka
> Issue Type: Bug
> Components: core, unit tests
> Affects Versions: 3.8.0
> Reporter: Greg Harris
> Assignee: zhengke zhou
> Priority: Minor
> Labels: newbie
>
> The NioEchoServer has an AcceptorThread that calls accept() to open new
> SocketChannel instances and insert them into the `newChannels` List, and a
> main thread that drains the `newChannels` List and moves them to the
> `socketChannels` List.
> During shutdown, the serverSocketChannel is closed, which causes both threads
> to exit their while loops. It is possible for the NioEchoServer main thread
> to sense the serverSocketChannel close and terminate before the Acceptor
> thread does, and for the Acceptor thread to put a SocketChannel in
> `newChannels` before terminating. This instance is never closed by either
> thread, because it is never moved to `socketChannels`.
> A precise execution order that produces this leak is:
> 1. NioEchoServer thread locks `newChannels`.
> 2. Acceptor thread accept() completes, and the SocketChannel is created.
> 3. Acceptor thread blocks waiting for the `newChannels` lock.
> 4. NioEchoServer thread releases the `newChannels` lock and does some
> processing.
> 5. NioEchoServer#close() is called, which closes the serverSocketChannel.
> 6. NioEchoServer thread checks serverSocketChannel.isOpen() and then
> terminates.
> 7. Acceptor thread acquires the `newChannels` lock and adds the SocketChannel
> to `newChannels`.
> 8. Acceptor thread checks serverSocketChannel.isOpen() and then terminates.
> 9. NioEchoServer#close() stops blocking now that both other threads have
> terminated.
> The end result is that the leaked socket is left open in the `newChannels`
> list at the end of close(), which is incorrect.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
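
The cleanup discussed in this thread (closing whatever is still sitting in `newChannels` once the acceptor thread has exited) can be sketched roughly as follows. This is a minimal, hypothetical illustration and not the actual NioEchoServer code or the committed fix: `AcceptorShutdownSketch`, `FakeChannel`, and `pending()` are invented names, and `FakeChannel` stands in for a real SocketChannel so the example runs without a network.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names do not come from the Kafka code base.
final class AcceptorShutdownSketch {

    // Stand-in for an accepted SocketChannel; records whether close() was called.
    static final class FakeChannel implements Closeable {
        volatile boolean closed = false;

        @Override
        public void close() {
            closed = true;
        }
    }

    private final List<FakeChannel> newChannels = new ArrayList<>();
    private final Thread acceptorThread;

    AcceptorShutdownSketch(FakeChannel lateChannel) {
        // Simulates steps 7 and 8 of the description: the acceptor publishes
        // one last channel to newChannels and then terminates.
        acceptorThread = new Thread(() -> {
            synchronized (newChannels) {
                newChannels.add(lateChannel);
            }
        });
    }

    void start() {
        acceptorThread.start();
    }

    // Shutdown path: wait for the acceptor first, then close leftovers.
    void close() {
        try {
            // Once join() returns, the acceptor can no longer add to newChannels.
            acceptorThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for acceptor", e);
        }
        synchronized (newChannels) {
            // Close every channel that was accepted but never handed over.
            for (FakeChannel channel : newChannels) {
                channel.close();
            }
            newChannels.clear();
        }
    }

    // Number of accepted channels not yet handed over or closed.
    int pending() {
        synchronized (newChannels) {
            return newChannels.size();
        }
    }
}
```

The ordering is the point of the sketch: draining `newChannels` before `join()` returns would leave the same window open that the description walks through, because the acceptor could still add a channel afterwards.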
[jira] [Commented] (KAFKA-16765) NioEchoServer leaks accepted SocketChannel instances due to race condition
[ https://issues.apache.org/jira/browse/KAFKA-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17860952#comment-17860952 ]

zhengke zhou commented on KAFKA-16765:
--

If we can make the *main* thread always close after the *AcceptorThread*, it seems this can be fixed. I will try to fix this.
[jira] [Commented] (KAFKA-16765) NioEchoServer leaks accepted SocketChannel instances due to race condition
[ https://issues.apache.org/jira/browse/KAFKA-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17846475#comment-17846475 ]

Greg Harris commented on KAFKA-16765:
-

This is also a bug in EchoServer: [https://github.com/apache/kafka/blob/cb968845ecb3cb0982182d9dd437ecf652fe38d3/clients/src/test/java/org/apache/kafka/common/network/EchoServer.java#L76-L81]
and ServerShutdownTest: [https://github.com/apache/kafka/blob/cb968845ecb3cb0982182d9dd437ecf652fe38d3/core/src/test/scala/unit/kafka/server/ServerShutdownTest.scala#L274-L275],
except that those don't require a race condition to happen.