Gary Yao created FLINK-8488:
-------------------------------

             Summary: Dispatcher does not recover jobs
                 Key: FLINK-8488
                 URL: https://issues.apache.org/jira/browse/FLINK-8488
             Project: Flink
          Issue Type: Bug
          Components: Distributed Coordination
    Affects Versions: 1.5.0
         Environment: 776af4a882c85926fc0764b702fec717c675e34c
            Reporter: Gary Yao
             Fix For: 1.5.0

Dispatcher does not recover jobs on failover.

*Steps to reproduce*:
# {{bin/start-cluster.sh flip6}}
# {{bin/flink run -p1 -flip6 examples/batch/WordCount.jar --input /path/to/largefile.txt}}
# Wait until the job is running, then run {{bin/jobmanager.sh stop flip6 && bin/jobmanager.sh start flip6}}
# Wait until a leader is elected and verify that no jobs are running.

*Analysis*
* On {{submitJob}}, the Dispatcher checks whether the job scheduling status is {{PENDING}} and only then allows resubmission of the job. However, the job is still marked as {{RUNNING}} in ZooKeeper after the failover, so the job is not resubmitted.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
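
The check described in the analysis can be illustrated with a minimal sketch. This is not the Flink implementation: {{DispatcherRecoverySketch}}, {{StatusRegistry}} and the simplified {{Dispatcher}} are hypothetical names, the in-memory map only stands in for the status stored in ZooKeeper, and the {{DONE}} value is an assumption added for completeness.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model only; all class and method names here are assumptions.
class DispatcherRecoverySketch {

    // PENDING and RUNNING are the statuses named in the report; DONE is assumed.
    enum JobSchedulingStatus { PENDING, RUNNING, DONE }

    // Hypothetical stand-in for the status kept in ZooKeeper.
    // It outlives any single Dispatcher instance, like state surviving a restart.
    static final class StatusRegistry {
        private final Map<String, JobSchedulingStatus> statuses = new HashMap<>();

        JobSchedulingStatus get(String jobId) {
            // A job that was never registered is treated as PENDING.
            return statuses.getOrDefault(jobId, JobSchedulingStatus.PENDING);
        }

        void set(String jobId, JobSchedulingStatus status) {
            statuses.put(jobId, status);
        }
    }

    // Simplified dispatcher: a job is accepted only while its status is PENDING.
    static final class Dispatcher {
        private final StatusRegistry registry;

        Dispatcher(StatusRegistry registry) {
            this.registry = registry;
        }

        boolean submitJob(String jobId) {
            if (registry.get(jobId) != JobSchedulingStatus.PENDING) {
                // After a failover the status is still RUNNING, so the job is rejected.
                return false;
            }
            registry.set(jobId, JobSchedulingStatus.RUNNING);
            return true;
        }
    }
}
```

In this model, a first {{submitJob}} call is accepted and sets the status to {{RUNNING}}. A second {{Dispatcher}} created over the same registry, which plays the role of the restarted JobManager, rejects the same job ID because the status is no longer {{PENDING}}.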