zhu created FLINK-33014:
---------------------------

             Summary: flink jobmanager raise java.io.IOException: Connection reset by peer
                 Key: FLINK-33014
                 URL: https://issues.apache.org/jira/browse/FLINK-33014
             Project: Flink
          Issue Type: Bug
    Affects Versions: 1.17.1
         Environment: 
|*blob.server.port*|6124|
|*classloader.resolve-order*|parent-first|
|*jobmanager.execution.failover-strategy*|region|
|*jobmanager.memory.heap.size*|2228014280b|
|*jobmanager.memory.jvm-metaspace.size*|536870912b|
|*jobmanager.memory.jvm-overhead.max*|322122552b|
|*jobmanager.memory.jvm-overhead.min*|322122552b|
|*jobmanager.memory.off-heap.size*|134217728b|
|*jobmanager.memory.process.size*|3gb|
|*jobmanager.rpc.address*|naf-flink-ms-flink-manager-1-59m7w|
|*jobmanager.rpc.port*|6123|
|*parallelism.default*|1|
|*query.server.port*|6125|
|*rest.address*|0.0.0.0|
|*rest.bind-address*|0.0.0.0|
|*rest.connection-timeout*|60000|
|*rest.server.numThreads*|8|
|*slot.request.timeout*|3000000|
|*state.backend.rocksdb.localdir*|/home/nafplat/data/flinkStateStore|
|*state.backend.type*|rocksdb|
|*taskmanager.bind-host*|0.0.0.0|
|*taskmanager.host*|0.0.0.0|
|*taskmanager.memory.framework.off-heap.batch-shuffle.size*|256mb|
|*taskmanager.memory.framework.off-heap.size*|512mb|
|*taskmanager.memory.managed.fraction*|0.4|
|*taskmanager.memory.network.fraction*|0.2|
|*taskmanager.memory.process.size*|5gb|
|*taskmanager.memory.task.off-heap.size*|268435456bytes|
|*taskmanager.numberOfTaskSlots*|2|
|*taskmanager.runtime.large-record-handler*|true|
|*web.submit.enable*|true|
|*web.tmpdir*|/tmp/flink-web-c1b57e2b-5426-4fb8-a9ce-5acd1cceefc9|
|*web.upload.dir*|/opt/flink/nafJar|
            Reporter: zhu

The Flink cluster was deployed on Kubernetes in standalone mode, using the Flink 1.17.1 Java 8 Docker image. Since deployment, the JobManager has logged the error below at intervals, while the TaskManager has logged no errors. No jobs are currently running.

{code:java}
2023-09-01 11:34:14,293 WARN  org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint   [] - Unhandled exception
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[?:1.8.0_372]
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[?:1.8.0_372]
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[?:1.8.0_372]
    at sun.nio.ch.IOUtil.read(IOUtil.java:192) ~[?:1.8.0_372]
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379) ~[?:1.8.0_372]
    at org.apache.flink.shaded.netty4.io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:258) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1132) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:357) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:151) [flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788) [flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724) [flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650) [flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562) [flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [flink-dist-1.17.1.jar:1.17.1]
    at java.lang.Thread.run(Thread.java:750) [?:1.8.0_372]
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)