Yun Tang created FLINK-32027:
--------------------------------

             Summary: Batch jobs could hang at shuffle phase when max 
parallelism is really large
                 Key: FLINK-32027
                 URL: https://issues.apache.org/jira/browse/FLINK-32027
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Network
    Affects Versions: 1.17.0
            Reporter: Yun Tang
             Fix For: 1.17.1
         Attachments: image-2023-05-08-11-12-58-361.png

In batch mode with the adaptive batch scheduler, if we set the max 
parallelism as large as 32768 (pipeline.max-parallelism), the job could hang 
at the shuffle phase:

It would hang for a long time and show "No bytes sent":
 !image-2023-05-08-11-12-58-361.png! 

After some debugging, we found that the downstream operator did not receive 
the end-of-partition event.
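
A minimal sketch of a configuration matching the description above, assuming 
the job is configured through flink-conf.yaml. Only pipeline.max-parallelism 
and its value are taken from this report; the other two keys are assumptions 
about how batch mode and the adaptive batch scheduler were enabled:

```yaml
# Assumption: batch execution mode is enabled through this option.
execution.runtime-mode: BATCH
# Assumption: the adaptive batch scheduler is selected explicitly.
jobmanager.scheduler: AdaptiveBatch
# From the report: a very large max parallelism.
pipeline.max-parallelism: 32768
```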




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
