@tillrohrmann it is more accurate to say that the `JobMaster` will be
overwhelmed by too many RPC requests.

This issue was filed during a benchmark of job scheduling performance with a
2000x2000 ALL-to-ALL streaming (EAGER) job. The input data is empty, so the
tasks finish soon after they start.

In this case the JM shows slow RPC responses, and TM/RM heartbeats to the JM
eventually time out. The cause is that ~2,000,000 `requestPartitionState`
messages are triggered by `triggerPartitionProducerStateCheck` within a short
time, which overwhelms the JM RPC main thread. This happens because downstream
tasks can be started earlier than upstream tasks in EAGER scheduling.

For your second question, the task can simply wait for a while and retry if
the partition does not exist. There are two cases in which the partition does
not exist: 1. the partition has not been started yet; 2. the partition has
failed. In case 1, retrying works. In case 2, a task failover will soon happen
and cancel the downstream tasks as well.
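
To make the wait-and-retry idea concrete, here is a minimal sketch. It is not the code in this PR and does not use Flink's actual classes: `PartitionRetrySketch`, `waitForPartition`, `backoffMillis`, and the `partitionAvailable` probe are hypothetical names, and the capped exponential backoff is an assumption about how "waiting for a while" might be done. The point is only that the consumer retries locally instead of asking the JM on every miss.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of local wait-and-retry; not Flink's real API. */
public class PartitionRetrySketch {

    /** Capped exponential backoff for a 0-based attempt index (assumed policy). */
    static long backoffMillis(int attempt, long initialMs, long maxMs) {
        long delay = initialMs;
        for (int i = 0; i < attempt && delay < maxMs; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMs);
    }

    /**
     * Polls the (hypothetical) partition probe until it succeeds or the
     * attempts run out. Returns the number of attempts used, or -1 if the
     * partition never became available or the thread was interrupted.
     */
    static int waitForPartition(
            BooleanSupplier partitionAvailable,
            int maxAttempts,
            long initialMs,
            long maxMs) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            // Case 1 from above: the producer may simply not have started yet.
            if (partitionAvailable.getAsBoolean()) {
                return attempt + 1;
            }
            if (attempt + 1 < maxAttempts) {
                try {
                    Thread.sleep(backoffMillis(attempt, initialMs, maxMs));
                } catch (InterruptedException e) {
                    // Case 2 from above: a failover cancels the task, which
                    // would typically interrupt it; stop retrying.
                    Thread.currentThread().interrupt();
                    return -1;
                }
            }
        }
        return -1;
    }
}
```

With such a loop, a consumer whose producer starts a little late costs a few local retries rather than one `requestPartitionState` RPC per missing partition.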

[ Full content available at: https://github.com/apache/flink/pull/6680 ]