Hi there,

Recently, our production Flink jobs have shown a strange performance issue.
When a job tailing a Kafka source fails and then tries to catch up, the async
I/O operator after the event trigger puts a much higher load on the task
thread. Since each TM is allocated two virtual CPUs in Docker, my assumption
was that Akka messages between the JM and TM shouldn't be impacted.

What I observed is that the TM gets closed and keeps restarting with the same
error message below. Any suggestions are appreciated!


> org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: Connection unexpectedly closed by remote task manager 'xxxxxxx/xxxxxxx:5841'. This might indicate that the remote task manager was lost.
> at org.apache.flink.runtime.io.network.netty.PartitionRequestClientHandler.channelInactive(PartitionRequestClientHandler.java:115)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:223)
> at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:223)
> at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:294)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:223)
> at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:829)
> at io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:610)
> at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
> at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
> at java.lang.Thread.run(Thread.java:748)


Chen
