Hi,

We are sporadically observing an "Insufficient number of network buffers"
issue after upgrading Flink from 1.4.2 to 1.8.2.
The affected tasks transition from DEPLOYING to FAILED.
Whenever this issue occurs, the job manager restarts. Sometimes, the issue
goes away after the restart.
Since we are not hitting the issue consistently, we are unsure whether we
should change the memory configuration or not.

Min. recommended number of network buffers (slots-per-TM^2 * #TMs * 4):
(8 * 8) * 8 * 4 = 2048
The exception says 13112 network buffers are configured, which is more
than 6x that recommendation.

Is reducing the number of shuffles the only way to reduce the number of
network buffers required?
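If it helps, this is a sketch of the flink-conf.yaml change we are
considering (the values below are illustrative guesses, not what we
currently run; we are on the defaults):

```yaml
# Keys named in the exception message (Flink 1.8).
# Values here are examples only; defaults in 1.8 are
# fraction=0.1, min=64mb, max=1gb.
taskmanager.network.memory.fraction: 0.2
taskmanager.network.memory.min: 128mb
taskmanager.network.memory.max: 1gb
```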

Thanks,
Rahul

configs:
env: Kubernetes
Flink: 1.8.2
using default configs for taskmanager.network.memory.fraction,
taskmanager.network.memory.min, taskmanager.network.memory.max.
using 8 TMs, 8 slots/TM
Each TM is running with 1 core, 4 GB Memory.

Exception:
java.io.IOException: Insufficient number of network buffers: required 2,
but only 0 available. The total number of network buffers is currently set
to 13112 of 32768 bytes each. You can increase this number by setting the
configuration keys 'taskmanager.network.memory.fraction',
'taskmanager.network.memory.min', and 'taskmanager.network.memory.max'.
at
org.apache.flink.runtime.io.network.buffer.NetworkBufferPool.requestMemorySegments(NetworkBufferPool.java:138)
at
org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.assignExclusiveSegments(SingleInputGate.java:311)
at
org.apache.flink.runtime.io.network.NetworkEnvironment.setupInputGate(NetworkEnvironment.java:271)
at
org.apache.flink.runtime.io.network.NetworkEnvironment.registerTask(NetworkEnvironment.java:224)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:614)
at java.lang.Thread.run(Thread.java:748)
