Greg Harris created KAFKA-16765:
-----------------------------------

             Summary: NioEchoServer leaks accepted SocketChannel instances due 
to race condition
                 Key: KAFKA-16765
                 URL: https://issues.apache.org/jira/browse/KAFKA-16765
             Project: Kafka
          Issue Type: Bug
          Components: core, unit tests
    Affects Versions: 3.8.0
            Reporter: Greg Harris


The NioEchoServer has an AcceptorThread that calls accept() to open new 
SocketChannel instances and insert them into the `newChannels` List, and a main 
thread that drains the `newChannels` List and moves its contents to the 
`socketChannels` List.

During shutdown, the serverSocketChannel is closed, which causes both threads 
to exit their while loops. It is possible for the NioEchoServer main thread to 
observe that the serverSocketChannel is closed and terminate before the 
Acceptor thread does, and for the Acceptor thread to put a SocketChannel in 
`newChannels` before terminating. Neither thread ever closes this instance, 
because it is never moved to `socketChannels`.

One execution order that produces this leak is:
1. NioEchoServer thread locks `newChannels`.
2. Acceptor thread accept() completes, and the SocketChannel is created.
3. Acceptor thread blocks waiting for the `newChannels` lock
4. NioEchoServer thread releases the `newChannels` lock and does some processing
5. NioEchoServer#close() is called, which closes the serverSocketChannel
6. NioEchoServer thread checks serverSocketChannel.isOpen() and then terminates
7. Acceptor thread acquires the `newChannels` lock and adds the SocketChannel 
to `newChannels`.
8. Acceptor thread checks serverSocketChannel.isOpen() and then terminates.
9. NioEchoServer#close() stops blocking now that both other threads have 
terminated.
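
The ordering above can be replayed deterministically on a single thread, as a 
rough sketch. This is not the NioEchoServer code: the class `LeakReplay` and 
its `replay()` method are hypothetical, the two lists are plain ArrayLists, 
and the lock hand-offs of steps 1, 3 and 4 appear only as comments because one 
thread cannot contend with itself.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the two-thread structure described in the report.
class LeakReplay {

    // Replays steps 1-9 in order and returns whatever is left in newChannels.
    static List<SocketChannel> replay() throws IOException {
        List<SocketChannel> newChannels = new ArrayList<>();
        List<SocketChannel> socketChannels = new ArrayList<>();

        ServerSocketChannel serverSocketChannel = ServerSocketChannel.open();
        serverSocketChannel.bind(
                new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
        InetSocketAddress address =
                (InetSocketAddress) serverSocketChannel.getLocalAddress();

        try (SocketChannel client = SocketChannel.open(address)) {
            // Steps 1-4: the main thread holds and then releases the lock while
            // the acceptor's accept() completes; the acceptor has a channel in
            // hand but has not yet added it to newChannels.
            SocketChannel accepted = serverSocketChannel.accept();

            // Step 5: close() closes the server socket channel.
            serverSocketChannel.close();

            // Step 6: the main loop sees the closed channel and exits, so this
            // drain never runs again.
            if (serverSocketChannel.isOpen()) {
                synchronized (newChannels) {
                    socketChannels.addAll(newChannels);
                    newChannels.clear();
                }
            }

            // Step 7: the acceptor finally gets the lock and adds its channel.
            synchronized (newChannels) {
                newChannels.add(accepted);
            }

            // Steps 8-9: both loops have ended; only socketChannels is closed.
            for (SocketChannel channel : socketChannels) {
                channel.close();
            }
        }
        return newChannels;
    }

    public static void main(String[] args) throws IOException {
        List<SocketChannel> leaked = replay();
        System.out.println("left in newChannels: " + leaked.size()
                + ", still open: " + leaked.get(0).isOpen());
        leaked.get(0).close(); // clean up the channel the replay leaked
    }
}
```

In this replay the accepted channel ends up in `newChannels` and is still open 
after every other resource has been closed, which matches the leak described.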

The end result is that the leaked socket is still open in the `newChannels` 
list after close() returns, even though close() should leave no accepted 
SocketChannel open.
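
One possible remedy, sketched under assumptions and not necessarily the fix 
adopted in Kafka, is for close() to drain `newChannels` one last time after 
both threads have terminated, so that a late insert by the Acceptor thread is 
still closed. The class `DrainOnClose` and its methods `accepted()` and 
`closeAll()` are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.nio.channels.SocketChannel;
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for the two lists named in the report.
class DrainOnClose {
    private final List<SocketChannel> newChannels = new ArrayList<>();
    private final List<SocketChannel> socketChannels = new ArrayList<>();

    // What the acceptor does after accept() returns a channel.
    void accepted(SocketChannel channel) {
        synchronized (newChannels) {
            newChannels.add(channel);
        }
    }

    // Assumed to run at the end of close(), after both threads were joined,
    // so no further inserts into newChannels can happen.
    void closeAll() throws IOException {
        synchronized (newChannels) {
            // Final drain: pick up anything the acceptor added late.
            socketChannels.addAll(newChannels);
            newChannels.clear();
        }
        for (SocketChannel channel : socketChannels) {
            channel.close();
        }
        socketChannels.clear();
    }

    int pending() {
        synchronized (newChannels) {
            return newChannels.size();
        }
    }
}
```

The final drain is only safe once the Acceptor thread has terminated; 
draining earlier would leave the same window open.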



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
