[jira] [Commented] (KAFKA-16765) NioEchoServer leaks accepted SocketChannel instances due to race condition

2024-06-30 Thread zhengke zhou (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17860996#comment-17860996
 ] 

zhengke zhou commented on KAFKA-16765:
--

[Greg Harris|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=gharris1727] 
In this ticket, we want to avoid the situation where the producer (the 
acceptorThread) creates a channel and adds it to the SocketChannels, but the 
main thread never consumes it. To resolve this, we can perform the 
closeSocketChannels cleanup after the acceptorThread has exited.
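
A minimal sketch of that ordering, not the actual NioEchoServer code: close() 
first waits for the acceptor thread to exit, then drains `newChannels` one last 
time and closes everything. StubChannel, DrainAfterJoinSketch, the `serverOpen` 
flag (standing in for serverSocketChannel.isOpen()) and the `allAccepted` 
bookkeeping list are all hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an accepted SocketChannel; it only tracks open/closed state.
final class StubChannel {
    private volatile boolean open = true;
    boolean isOpen() { return open; }
    void close() { open = false; }
}

final class DrainAfterJoinSketch {
    private final List<StubChannel> newChannels = new ArrayList<>();
    private final List<StubChannel> socketChannels = new ArrayList<>();
    // Bookkeeping for the check at the end only; not part of the proposed fix.
    private final List<StubChannel> allAccepted = new ArrayList<>();
    // Assumption: stands in for serverSocketChannel.isOpen().
    private volatile boolean serverOpen = true;
    private final Thread acceptorThread;

    DrainAfterJoinSketch() {
        acceptorThread = new Thread(() -> {
            while (serverOpen) {
                StubChannel channel = new StubChannel(); // stands in for accept()
                synchronized (newChannels) {
                    newChannels.add(channel);
                    allAccepted.add(channel);
                }
                try {
                    Thread.sleep(1);
                } catch (InterruptedException e) {
                    return;
                }
            }
        });
    }

    void close() throws InterruptedException {
        serverOpen = false;     // stands in for serverSocketChannel.close()
        acceptorThread.join();  // after this, nothing more can be added to newChannels
        synchronized (newChannels) {
            // Final drain: pick up anything the acceptor added late.
            socketChannels.addAll(newChannels);
            newChannels.clear();
        }
        for (StubChannel channel : socketChannels) {
            channel.close();
        }
        socketChannels.clear();
    }

    // Runs the acceptor briefly, closes the server, and counts channels still open.
    static int leakedAfterClose() {
        DrainAfterJoinSketch server = new DrainAfterJoinSketch();
        server.acceptorThread.start();
        try {
            Thread.sleep(20);
            server.close();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int stillOpen = 0;
        for (StubChannel channel : server.allAccepted) {
            if (channel.isOpen()) {
                stillOpen++;
            }
        }
        return stillOpen;
    }

    public static void main(String[] args) {
        System.out.println("channels left open: " + leakedAfterClose());
    }
}
```

Because the final drain happens only after join() returns, every channel the 
acceptor created is visible to close(), whichever thread noticed the shutdown 
first.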

> NioEchoServer leaks accepted SocketChannel instances due to race condition
> --
>
> Key: KAFKA-16765
> URL: https://issues.apache.org/jira/browse/KAFKA-16765
> Project: Kafka
> Issue Type: Bug
> Components: core, unit tests
> Affects Versions: 3.8.0
> Reporter: Greg Harris
> Assignee: zhengke zhou
> Priority: Minor
> Labels: newbie
>
> The NioEchoServer has an AcceptorThread that calls accept() to open new 
> SocketChannel instances and insert them into the `newChannels` List, and a 
> main thread that drains the `newChannels` List and moves them to the 
> `socketChannels` List.
> During shutdown, the serverSocketChannel is closed, which causes both threads 
> to exit their while loops. It is possible for the NioEchoServer main thread 
> to sense the serverSocketChannel close and terminate before the Acceptor 
> thread does, and for the Acceptor thread to put a SocketChannel in 
> `newChannels` before terminating. This instance is never closed by either 
> thread, because it is never moved to `socketChannels`.
> A precise execution order that has this leak is:
> 1. NioEchoServer thread locks `newChannels`.
> 2. Acceptor thread accept() completes, and the SocketChannel is created
> 3. Acceptor thread blocks waiting for the `newChannels` lock
> 4. NioEchoServer thread releases the `newChannels` lock and does some 
> processing
> 5. NioEchoServer#close() is called, which closes the serverSocketChannel
> 6. NioEchoServer thread checks serverSocketChannel.isOpen() and then 
> terminates
> 7. Acceptor thread acquires the `newChannels` lock and adds the SocketChannel 
> to `newChannels`.
> 8. Acceptor thread checks serverSocketChannel.isOpen() and then terminates.
> 9. NioEchoServer#close() stops blocking now that both other threads have 
> terminated.
> The end result is that the leaked socket is left open in the `newChannels` 
> list at the end of close(), which is incorrect.
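
The execution order above can be replayed in a single thread to see where the 
channel is lost. This is only a model under stated assumptions, not the real 
NioEchoServer: FakeChannel and LeakReplay are hypothetical names, and the 
shutdown cleanup is assumed to close only what has reached `socketChannels`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an accepted SocketChannel; it only tracks open/closed state.
final class FakeChannel {
    private boolean open = true;
    boolean isOpen() { return open; }
    void close() { open = false; }
}

// Single-threaded model of the two lists described in the issue.
final class LeakReplay {
    final List<FakeChannel> newChannels = new ArrayList<>();
    final List<FakeChannel> socketChannels = new ArrayList<>();

    // Main-thread work: move everything from newChannels to socketChannels.
    void drainNewChannels() {
        synchronized (newChannels) {
            socketChannels.addAll(newChannels);
            newChannels.clear();
        }
    }

    // Assumed shutdown cleanup: closes only what reached socketChannels.
    void closeSocketChannels() {
        for (FakeChannel channel : socketChannels) {
            channel.close();
        }
        socketChannels.clear();
    }

    // Replays steps 1-9 in order and returns how many channels are still open.
    static int replay() {
        LeakReplay server = new LeakReplay();

        // Steps 1-4: accept() has produced a channel, but the acceptor has not
        // added it yet while the main thread holds the lock and drains.
        FakeChannel accepted = new FakeChannel();
        server.drainNewChannels();

        // Steps 5-6: the server socket is closed, the main thread terminates,
        // and cleanup covers only socketChannels.
        server.closeSocketChannels();

        // Step 7: the acceptor finally gets the lock and adds its channel.
        synchronized (server.newChannels) {
            server.newChannels.add(accepted);
        }

        // Steps 8-9: the acceptor terminates and close() returns; nobody drains again.
        int stillOpen = 0;
        for (FakeChannel channel : server.newChannels) {
            if (channel.isOpen()) {
                stillOpen++;
            }
        }
        return stillOpen;
    }

    public static void main(String[] args) {
        System.out.println("leaked channels: " + replay()); // prints "leaked channels: 1"
    }
}
```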



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-16765) NioEchoServer leaks accepted SocketChannel instances due to race condition

2024-06-29 Thread zhengke zhou (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17860952#comment-17860952
 ] 

zhengke zhou commented on KAFKA-16765:
--

If we can make the *main* thread always close after the *AcceptorThread*, it 
seems this can be fixed.

I will try to fix this.






[jira] [Commented] (KAFKA-16765) NioEchoServer leaks accepted SocketChannel instances due to race condition

2024-05-14 Thread Greg Harris (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17846475#comment-17846475
 ] 

Greg Harris commented on KAFKA-16765:
-

The same bug also exists in EchoServer: 
[https://github.com/apache/kafka/blob/cb968845ecb3cb0982182d9dd437ecf652fe38d3/clients/src/test/java/org/apache/kafka/common/network/EchoServer.java#L76-L81]
and in ServerShutdownTest: 
[https://github.com/apache/kafka/blob/cb968845ecb3cb0982182d9dd437ecf652fe38d3/core/src/test/scala/unit/kafka/server/ServerShutdownTest.scala#L274-L275]
except that those do not require a race condition to occur.



