OK, thank you. So this is application state I need to handle, nothing at the TCP level. Appreciate you taking a look, ta!
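
For anyone finding this thread later, here is a minimal sketch of what "handle it in the application" could look like: keep the current link, and when an operation fails because the remote detached it, throw the link away, open a fresh one and retry a bounded number of times. It is plain Java with no protonj2 dependency so it can be run on its own. RecoveringLink, LinkOpener, LinkOperation and the stand-in LinkRemotelyClosedException are all hypothetical names for illustration. With protonj2 the opener would presumably be something like () -> connection.openReceiver(address), and the exception to catch would be the ClientLinkRemotelyClosedException from the stack trace.

```java
/** Hypothetical helper: reopens a link after the remote detaches it. */
final class RecoveringLink<L> {

    /** Stand-in for protonj2's ClientLinkRemotelyClosedException (assumption). */
    static class LinkRemotelyClosedException extends Exception {
        LinkRemotelyClosedException(String message) {
            super(message);
        }
    }

    /** Opens a fresh link, e.g. a new sender or receiver on the live connection. */
    interface LinkOpener<T> {
        T open() throws Exception;
    }

    /** A single operation on a link, e.g. receive or send. */
    interface LinkOperation<T, R> {
        R apply(T link) throws Exception;
    }

    private final LinkOpener<L> opener;
    private final int maxAttempts;
    private L current; // null until first use, and again after a remote close

    RecoveringLink(LinkOpener<L> opener, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.opener = opener;
        this.maxAttempts = maxAttempts;
    }

    /**
     * Runs the operation on the current link. If the remote has closed the
     * link, discards it, opens a new one and tries again, up to maxAttempts
     * in total. Any other exception is propagated unchanged.
     */
    synchronized <R> R call(LinkOperation<L, R> operation) throws Exception {
        LinkRemotelyClosedException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (current == null) {
                current = opener.open();
            }
            try {
                return operation.apply(current);
            } catch (LinkRemotelyClosedException e) {
                last = e;       // remember the failure
                current = null; // force a fresh link on the next attempt
            }
        }
        throw last;
    }
}
```

This only covers the "recreate the link" option. If reopening keeps failing, the exception reaches the caller, which is the point at which closing the client and rebuilding everything from scratch would apply.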

On Thu, Jul 17, 2025 at 6:59 PM Timothy Bish <tabish...@gmail.com> wrote:

> On 7/17/25 09:23, Ciaran wrote:
> > Hi all,
> >
> > It's been a while, but I'm occasionally seeing 'fatal' connection failures
> > for my client when running inside the Azure Kubernetes environment.
> >
> > I've configured the Transport and ConnectionOptions like:
> >
> >     ConnectionOptions options = new ConnectionOptions();
> >     options.sslOptions().sslEnabled(this.options.isUseSSL());
> >     options.transportOptions().useWebSockets(false);
> >     options.transportOptions().webSocketPath("");
> >     options.reconnectOptions()
> >         .reconnectEnabled(true)
> >         .useReconnectBackOff(true)
> >         .reconnectDelay(5000)
> >         .maxReconnectDelay(10240001)
> >         .maxReconnectAttempts(12)
> >         .warnAfterReconnectAttempts(1);
> >
> > I am /not/ specifying idleTimeout on the options.
> >
> > During testing when I was synthesizing connection failures I saw quite
> > clear logging that connection failures and retries were occurring. But in
> > the specific instance within the cluster we see no such logging, just the
> > exception I've included at the end of this email.
> >
> > We've run into a whole bunch of issues when using Java within Kubernetes, a
> > fair chunk of which I've been able to trace back to problems with k8s
> > silently dropping idle TCP connections, which we've resolved by configuring
> > the tcpKeepAlive settings in the relevant Java libraries, and I can't help
> > but think this might be a similar issue.
> >
> > I've had a look at the available options and I see that the
> > TransportOptions do support a tcpKeepAlive, but I'm unclear on how we could
> > configure the periodicity. I /think/ I may need to enable epoll support,
> > but it's not clear to me (sorry) how to achieve that?
> >
> > Before I go down this rabbit hole, would you expect the configuration I've
> > specified to behave in the manner of the exception below, with no visible
> > logging of retries? Could this be related to me not specifying the
> > idleTimeout?
> >
> > I should note that this is extremely rare, we can run continuously for
> > several weeks with no issue, and the load on the connection is very, very
> > low.
> >
> > This is the exception we see.
> >
> > Caused by: org.apache.qpid.protonj2.client.exceptions.ClientLinkRemotelyClosedException: Link remotely closed without explanation from the remote
> >     at org.apache.qpid.protonj2.client.impl.ClientExceptionSupport.convertToLinkClosedException(ClientExceptionSupport.java:217)
> >     at org.apache.qpid.protonj2.client.impl.ClientLinkType.handleRemoteCloseOrDetach(ClientLinkType.java:364)
> >     at org.apache.qpid.protonj2.engine.impl.ProtonEndpoint.fireRemoteClose(ProtonEndpoint.java:139)
> >     at org.apache.qpid.protonj2.engine.impl.ProtonLink.remoteDetach(ProtonLink.java:673)
> >     at org.apache.qpid.protonj2.engine.impl.ProtonSession.remoteDetach(ProtonSession.java:545)
> >     at org.apache.qpid.protonj2.engine.impl.ProtonConnection.handleDetach(ProtonConnection.java:547)
> >     at org.apache.qpid.protonj2.engine.impl.ProtonPerformativeHandler.handleDetach(ProtonPerformativeHandler.java:148)
> >     at org.apache.qpid.protonj2.engine.impl.ProtonPerformativeHandler.handleDetach(ProtonPerformativeHandler.java:43)
> >     at org.apache.qpid.protonj2.types.transport.Detach.invoke(Detach.java:132)
> >     at org.apache.qpid.protonj2.engine.IncomingAMQPEnvelope.invoke(IncomingAMQPEnvelope.java:69)
> >     at org.apache.qpid.protonj2.engine.impl.ProtonPerformativeHandler.handleRead(ProtonPerformativeHandler.java:68)
> >     at org.apache.qpid.protonj2.engine.impl.ProtonEngineHandlerContext.invokeHandlerRead(ProtonEngineHandlerContext.java:187)
> >     at org.apache.qpid.protonj2.engine.impl.ProtonEngineHandlerContext.fireRead(ProtonEngineHandlerContext.java:147)
> >     at org.apache.qpid.protonj2.engine.impl.ProtonFrameLoggingHandler.handleRead(ProtonFrameLoggingHandler.java:101)
> >     at org.apache.qpid.protonj2.engine.impl.ProtonEngineHandlerContext.invokeHandlerRead(ProtonEngineHandlerContext.java:187)
> >     at org.apache.qpid.protonj2.engine.impl.ProtonEngineHandlerContext.fireRead(ProtonEngineHandlerContext.java:147)
> >     at org.apache.qpid.protonj2.engine.impl.ProtonFrameDecodingHandler$FrameBodyParsingStage.parse(ProtonFrameDecodingHandler.java:387)
> >     at org.apache.qpid.protonj2.engine.impl.ProtonFrameDecodingHandler$FrameSizeParsingStage.parse(ProtonFrameDecodingHandler.java:265)
> >     at org.apache.qpid.protonj2.engine.impl.ProtonFrameDecodingHandler.handleRead(ProtonFrameDecodingHandler.java:99)
> >     at org.apache.qpid.protonj2.engine.impl.ProtonEngineHandlerContext.invokeHandlerRead(ProtonEngineHandlerContext.java:199)
> >     at org.apache.qpid.protonj2.engine.impl.ProtonEngineHandlerContext.fireRead(ProtonEngineHandlerContext.java:132)
> >     at org.apache.qpid.protonj2.engine.impl.ProtonEnginePipeline.fireRead(ProtonEnginePipeline.java:301)
> >     at org.apache.qpid.protonj2.engine.impl.ProtonEngine.ingest(ProtonEngine.java:266)
> >     at org.apache.qpid.protonj2.engine.impl.ProtonEngine.ingest(ProtonEngine.java:54)
> >     at org.apache.qpid.protonj2.client.impl.ClientTransportListener.transportRead(ClientTransportListener.java:59)
> >     at org.apache.qpid.protonj2.client.transport.netty4.TcpTransport$NettyDefaultHandler.dispatchReadBuffer(TcpTransport.java:522)
> >     at org.apache.qpid.protonj2.client.transport.netty4.TcpTransport$NettyTcpTransportHandler.channelRead0(TcpTransport.java:533)
> >     at org.apache.qpid.protonj2.client.transport.netty4.TcpTransport$NettyTcpTransportHandler.channelRead0(TcpTransport.java:529)
> >     at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)
> >     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
> >     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
> >     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
> >     at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1475)
> >     at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1338)
> >     at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1387)
> >     at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:530)
> >     at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:469)
> >     at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
> >     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
> >     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
> >     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
> >     at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
> >     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
> >     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
> >     at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
> >     at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
> >     at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
> >     at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
> >     at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
> >     at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
> >     at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
> >     at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
> >     ... 1 more
>
> This error indicates that the remote simply closed a link via a detach
> frame, not the connection itself, so I wouldn't expect the client to
> attempt any reconnect in this case. The remote close of a link requires
> that the application handle it and either recreate the link (sender or
> receiver) or completely close out the client and rebuild state from the
> start. It is possible the Azure end is closing out a link that has been
> idle too long, which isn't something the client can manage as it has no
> insight into the remote and its configuration or requirements.
>
> --
> Tim Bish