Gaurav Narula created KAFKA-16305:
-------------------------------------

             Summary: # Optimisation in SslTransportLayer:handshakeUnwrap 
stalls TLS handshake
                 Key: KAFKA-16305
                 URL: https://issues.apache.org/jira/browse/KAFKA-16305
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 3.6.1, 3.7.0
            Reporter: Gaurav Narula


Kafka allows users to configure custom {{SSLEngine}} via the 
{{SslEngineFactory}} interface. There have been attempts to use an OpenSSL 
based {{SSLEngine}} using {{netty-handler}} over the JDK based implementation 
for performance reasons.

While trying to use a Netty/Openssl based SSLEngine, we observe that the server 
hangs while performing the TLS handshake.  We observe the following logs


{code}
2024-02-26 01:40:00,117 
data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
SslTransportLayer    - [SslTransportLayer 
channelId=127.0.0.1:60045-127.0.0.1:60046-0 
key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 
127.0.0.1:60045-127.0.0.1:60046-0 state NOT_INITIALIZED
2024-02-26 01:40:00,117 
data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [INFO ] 
Selector             - [SocketServer listenerType=ZK_BROKER, nodeId=101] 
XX-Gaurav: calling prepare channelId 127.0.0.1:60045-127.0.0.1:60046-0
2024-02-26 01:40:00,117 
data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
SslTransportLayer    - [SslTransportLayer 
channelId=127.0.0.1:60045-127.0.0.1:60046-0 
key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 
127.0.0.1:60045-127.0.0.1:60046-0 state NOT_INITIALIZED
2024-02-26 01:40:00,117 
data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
SslTransportLayer    - [SslTransportLayer 
channelId=127.0.0.1:60045-127.0.0.1:60046-0 
key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 
127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
2024-02-26 01:40:00,117 
data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
SslTransportLayer    - [SslTransportLayer 
channelId=127.0.0.1:60045-127.0.0.1:60046-0 
key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
interestOps=1, readyOps=0] SSLHandshake NEED_UNWRAP channelId 
127.0.0.1:60045-127.0.0.1:60046-0, appReadBuffer pos 0, netReadBuffer pos 0, 
netWriteBuffer pos 0
2024-02-26 01:40:00,117 
data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
SslTransportLayer    - [SslTransportLayer 
channelId=127.0.0.1:60045-127.0.0.1:60046-0 
key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
interestOps=1, readyOps=0] SSLHandshake handshakeUnwrap channelId 
127.0.0.1:60045-127.0.0.1:60046-0 doRead true
2024-02-26 01:40:00,118 
data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
SslTransportLayer    - [SslTransportLayer 
channelId=127.0.0.1:60045-127.0.0.1:60046-0 
key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
interestOps=1, readyOps=0] SSLHandshake post handshakeUnwrap: channelId 
127.0.0.1:60045-127.0.0.1:60046-0 handshakeStatus NEED_UNWRAP status 
BUFFER_UNDERFLOW read 0
2024-02-26 01:40:00,118 
data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
SslTransportLayer    - [SslTransportLayer 
channelId=127.0.0.1:60045-127.0.0.1:60046-0 
key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
interestOps=1, readyOps=0] SSLHandshake post unwrap NEED_UNWRAP channelId 
127.0.0.1:60045-127.0.0.1:60046-0, handshakeResult Status = BUFFER_UNDERFLOW 
HandshakeStatus = NEED_UNWRAP bytesConsumed = 0 bytesProduced = 0, 
appReadBuffer pos 0, netReadBuffer pos 0, netWriteBuffer pos 0 handshakeStatus 
NEED_UNWRAP
2024-02-26 01:40:00,118 
data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
SslTransportLayer    - [SslTransportLayer 
channelId=127.0.0.1:60045-127.0.0.1:60046-0 
key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 
127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
2024-02-26 01:40:00,118 
data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
SslTransportLayer    - [SslTransportLayer 
channelId=127.0.0.1:60045-127.0.0.1:60046-0 
key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 
127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
2024-02-26 01:40:00,118 
data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
SslTransportLayer    - [SslTransportLayer 
channelId=127.0.0.1:60045-127.0.0.1:60046-0 
key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 
127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
2024-02-26 01:40:00,118 
data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
SslTransportLayer    - [SslTransportLayer 
channelId=127.0.0.1:60045-127.0.0.1:60046-0 
key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 
127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
2024-02-26 01:40:00,118 
data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
SslTransportLayer    - [SslTransportLayer 
channelId=127.0.0.1:60045-127.0.0.1:60046-0 
key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 
127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
2024-02-26 01:40:00,119 
data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
SslTransportLayer    - [SslTransportLayer 
channelId=127.0.0.1:60045-127.0.0.1:60046-0 
key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 
127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
2024-02-26 01:40:00,119 
data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
SslTransportLayer    - [SslTransportLayer 
channelId=127.0.0.1:60045-127.0.0.1:60046-0 
key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 
127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
2024-02-26 01:40:00,120 
data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [INFO ] 
Selector             - [SocketServer listenerType=ZK_BROKER, nodeId=101] 
XX-Gaurav: calling prepare channelId 127.0.0.1:60045-127.0.0.1:60046-0
2024-02-26 01:40:00,120 
data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
SslTransportLayer    - [SslTransportLayer 
channelId=127.0.0.1:60045-127.0.0.1:60046-0 
key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 
127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
2024-02-26 01:40:00,120 
data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
SslTransportLayer    - [SslTransportLayer 
channelId=127.0.0.1:60045-127.0.0.1:60046-0 
key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 
127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
2024-02-26 01:40:00,120 
data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
SslTransportLayer    - [SslTransportLayer 
channelId=127.0.0.1:60045-127.0.0.1:60046-0 
key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
interestOps=1, readyOps=0] SSLHandshake NEED_UNWRAP channelId 
127.0.0.1:60045-127.0.0.1:60046-0, appReadBuffer pos 0, netReadBuffer pos 383, 
netWriteBuffer pos 0
2024-02-26 01:40:00,120 
data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
SslTransportLayer    - [SslTransportLayer 
channelId=127.0.0.1:60045-127.0.0.1:60046-0 
key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
interestOps=1, readyOps=0] SSLHandshake handshakeUnwrap channelId 
127.0.0.1:60045-127.0.0.1:60046-0 doRead true
2024-02-26 01:40:00,122 
data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
SslTransportLayer    - [SslTransportLayer 
channelId=127.0.0.1:60045-127.0.0.1:60046-0 
key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
interestOps=1, readyOps=0] SSLHandshake post handshakeUnwrap: channelId 
127.0.0.1:60045-127.0.0.1:60046-0 handshakeStatus NEED_UNWRAP status OK read 0
2024-02-26 01:40:00,122 
data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] 
SslTransportLayer    - [SslTransportLayer 
channelId=127.0.0.1:60045-127.0.0.1:60046-0 
key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 
remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, 
interestOps=1, readyOps=0] SSLHandshake post unwrap NEED_UNWRAP channelId 
127.0.0.1:60045-127.0.0.1:60046-0, handshakeResult Status = OK HandshakeStatus 
= NEED_TASK bytesConsumed = 383 bytesProduced = 0, appReadBuffer pos 0, 
netReadBuffer pos 0, netWriteBuffer pos 0 handshakeStatus NEED_UNWRAP
{code}

Lines 21 and 22 are of interest here as we observe that:
1. {{handshakeStatus}} is {{NEED_UNWRAP}} and
2. {{netReadBuffer.position() == 0}}

The do-while loop in {{handshakeUnwrap()}} is 
[exited|https://github.com/apache/kafka/blob/9bc9fae9425e4dac64ef078cd3a4e7e6e09cc45a/clients/src/main/java/org/apache/kafka/common/network/SslTransportLayer.java#L529]
 because of (2) and as a result, the {{SSLEngine}}'s state machine isn't 
handled properly.

We should therefore, avoid the optimisation and simplify the condition to be 
just {{while(cont)}}.




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to