Hi all,

It's been a while, but I'm occasionally seeing 'fatal' connection failures for my client when running inside the Azure Kubernetes environment.

I've configured the Transport and ConnectionOptions like this:

    ConnectionOptions options = new ConnectionOptions();
    options.sslOptions().sslEnabled(this.options.isUseSSL());
    options.transportOptions().useWebSockets(false);
    options.transportOptions().webSocketPath("");
    options.reconnectOptions()
           .reconnectEnabled(true)
           .useReconnectBackOff(true)
           .reconnectDelay(5000)
           .maxReconnectDelay(10240001)
           .maxReconnectAttempts(12)
           .warnAfterReconnectAttempts(1);

I am /not/ specifying idleTimeout on the options.

During testing, when I was synthesizing connection failures, I saw quite clear logging that connection failures and retries were occurring. In the specific instance within the cluster, however, we see no such logging, just the exception I've included at the end of this email.

We've run into a whole bunch of issues when using Java within Kubernetes. I've been able to trace a fair chunk of them back to k8s silently dropping idle TCP connections, which we've resolved by configuring the tcpKeepAlive settings in the relevant Java libraries, and I can't help but think this might be a similar issue.

I've had a look at the available options and I see that the transportOptions do support tcpKeepAlive, but I'm unclear on how we could configure the periodicity. I /think/ I may need to enable epoll support, but it's not clear to me (sorry) how to achieve that.

Before I go down this rabbit hole: would you expect the configuration I've specified to fail in the manner of the exception below, with no visible logging of retries? Could this be related to me not specifying the idleTimeout?

I should note that this is extremely rare. We can run continuously for several weeks with no issue, and the load on the connection is very, very low.

This is the exception we see:
Caused by: org.apache.qpid.protonj2.client.exceptions.ClientLinkRemotelyClosedException: Link remotely closed without explanation from the remote
    at org.apache.qpid.protonj2.client.impl.ClientExceptionSupport.convertToLinkClosedException(ClientExceptionSupport.java:217)
    at org.apache.qpid.protonj2.client.impl.ClientLinkType.handleRemoteCloseOrDetach(ClientLinkType.java:364)
    at org.apache.qpid.protonj2.engine.impl.ProtonEndpoint.fireRemoteClose(ProtonEndpoint.java:139)
    at org.apache.qpid.protonj2.engine.impl.ProtonLink.remoteDetach(ProtonLink.java:673)
    at org.apache.qpid.protonj2.engine.impl.ProtonSession.remoteDetach(ProtonSession.java:545)
    at org.apache.qpid.protonj2.engine.impl.ProtonConnection.handleDetach(ProtonConnection.java:547)
    at org.apache.qpid.protonj2.engine.impl.ProtonPerformativeHandler.handleDetach(ProtonPerformativeHandler.java:148)
    at org.apache.qpid.protonj2.engine.impl.ProtonPerformativeHandler.handleDetach(ProtonPerformativeHandler.java:43)
    at org.apache.qpid.protonj2.types.transport.Detach.invoke(Detach.java:132)
    at org.apache.qpid.protonj2.engine.IncomingAMQPEnvelope.invoke(IncomingAMQPEnvelope.java:69)
    at org.apache.qpid.protonj2.engine.impl.ProtonPerformativeHandler.handleRead(ProtonPerformativeHandler.java:68)
    at org.apache.qpid.protonj2.engine.impl.ProtonEngineHandlerContext.invokeHandlerRead(ProtonEngineHandlerContext.java:187)
    at org.apache.qpid.protonj2.engine.impl.ProtonEngineHandlerContext.fireRead(ProtonEngineHandlerContext.java:147)
    at org.apache.qpid.protonj2.engine.impl.ProtonFrameLoggingHandler.handleRead(ProtonFrameLoggingHandler.java:101)
    at org.apache.qpid.protonj2.engine.impl.ProtonEngineHandlerContext.invokeHandlerRead(ProtonEngineHandlerContext.java:187)
    at org.apache.qpid.protonj2.engine.impl.ProtonEngineHandlerContext.fireRead(ProtonEngineHandlerContext.java:147)
    at org.apache.qpid.protonj2.engine.impl.ProtonFrameDecodingHandler$FrameBodyParsingStage.parse(ProtonFrameDecodingHandler.java:387)
    at org.apache.qpid.protonj2.engine.impl.ProtonFrameDecodingHandler$FrameSizeParsingStage.parse(ProtonFrameDecodingHandler.java:265)
    at org.apache.qpid.protonj2.engine.impl.ProtonFrameDecodingHandler.handleRead(ProtonFrameDecodingHandler.java:99)
    at org.apache.qpid.protonj2.engine.impl.ProtonEngineHandlerContext.invokeHandlerRead(ProtonEngineHandlerContext.java:199)
    at org.apache.qpid.protonj2.engine.impl.ProtonEngineHandlerContext.fireRead(ProtonEngineHandlerContext.java:132)
    at org.apache.qpid.protonj2.engine.impl.ProtonEnginePipeline.fireRead(ProtonEnginePipeline.java:301)
    at org.apache.qpid.protonj2.engine.impl.ProtonEngine.ingest(ProtonEngine.java:266)
    at org.apache.qpid.protonj2.engine.impl.ProtonEngine.ingest(ProtonEngine.java:54)
    at org.apache.qpid.protonj2.client.impl.ClientTransportListener.transportRead(ClientTransportListener.java:59)
    at org.apache.qpid.protonj2.client.transport.netty4.TcpTransport$NettyDefaultHandler.dispatchReadBuffer(TcpTransport.java:522)
    at org.apache.qpid.protonj2.client.transport.netty4.TcpTransport$NettyTcpTransportHandler.channelRead0(TcpTransport.java:533)
    at org.apache.qpid.protonj2.client.transport.netty4.TcpTransport$NettyTcpTransportHandler.channelRead0(TcpTransport.java:529)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1475)
    at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1338)
    at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1387)
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:530)
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:469)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    ... 1 more
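
For reference, this is how I understand the retry schedule that my reconnect options should produce. It is only a sketch of the arithmetic, not the client's actual implementation. It assumes the delay starts at reconnectDelay and doubles after each attempt, capped at maxReconnectDelay. The multiplier of 2.0, the class name BackoffSchedule and the delays helper are my own illustration and are not protonj2 API.

```java
import java.util.Arrays;

public class BackoffSchedule {

    // Hypothetical helper: returns the delay (in ms) I would expect before each
    // reconnect attempt, assuming the delay is multiplied by `multiplier` after
    // every attempt and never exceeds `maxDelayMs`.
    static long[] delays(long initialDelayMs, long maxDelayMs, double multiplier, int maxAttempts) {
        long[] result = new long[maxAttempts];
        double next = initialDelayMs;
        for (int i = 0; i < maxAttempts; i++) {
            result[i] = (long) Math.min(next, (double) maxDelayMs);
            next *= multiplier;
        }
        return result;
    }

    public static void main(String[] args) {
        // Values taken from the reconnectOptions in this email; 2.0 is an assumed multiplier.
        long[] schedule = delays(5000L, 10240001L, 2.0, 12);
        System.out.println("delays (ms): " + Arrays.toString(schedule));
        System.out.println("total (ms):  " + Arrays.stream(schedule).sum());
    }
}
```

Under those assumptions the twelve attempts span 5000 * (2^12 - 1) = 20,475,000 ms, roughly 5.7 hours, and the final delay of 10,240,000 ms sits just under my maxReconnectDelay of 10240001. So if the reconnect logic were being triggered at all, I would expect to see hours of retry logging before the client gave up.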