[ https://issues.apache.org/jira/browse/FLINK-8488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gary Yao updated FLINK-8488:
----------------------------
Description:
Dispatcher does not recover jobs on failover (FLIP-6 mode).
*Steps to reproduce*:
# {{bin/start-cluster.sh flip6}}
# {{bin/flink run -p1 -flip6 examples/batch/WordCount.jar --input /path/to/largefile.txt}}
# Wait until the job is running, then run {{bin/jobmanager.sh stop flip6 && bin/jobmanager.sh start flip6}}
# Wait until a leader is elected and verify that no jobs are running.
*Analysis*
* On {{submitJob}}, the Dispatcher checks whether the job scheduling status is
{{PENDING}} and only then allows resubmission of the job. However, the job is
already marked as {{RUNNING}} in ZooKeeper, so resubmission after a failover
is not allowed.
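
The check described in the analysis can be modelled with a minimal sketch. This is not the actual Dispatcher code: the class {{DispatcherRecoverySketch}}, the in-memory registry, and the method names are hypothetical stand-ins, and the map only imitates the status that ZooKeeper keeps across a JobManager restart.

```java
import java.util.HashMap;
import java.util.Map;

public class DispatcherRecoverySketch {

    // Simplified stand-in for the job scheduling status stored in ZooKeeper.
    enum JobSchedulingStatus { PENDING, RUNNING, DONE }

    // Hypothetical in-memory registry; it outlives the "failover" the same
    // way the ZooKeeper entry outlives a JobManager restart.
    static final class InMemoryRegistry {
        private final Map<String, JobSchedulingStatus> statuses = new HashMap<>();

        JobSchedulingStatus getStatus(String jobId) {
            // Jobs that were never registered count as PENDING.
            return statuses.getOrDefault(jobId, JobSchedulingStatus.PENDING);
        }

        void setRunning(String jobId) {
            statuses.put(jobId, JobSchedulingStatus.RUNNING);
        }
    }

    // Models the check from the analysis: only PENDING jobs are accepted.
    static boolean submitJob(InMemoryRegistry registry, String jobId) {
        if (registry.getStatus(jobId) != JobSchedulingStatus.PENDING) {
            return false; // job is RUNNING or DONE, so submission is rejected
        }
        registry.setRunning(jobId);
        return true;
    }

    public static void main(String[] args) {
        InMemoryRegistry registry = new InMemoryRegistry();
        // Initial submission: status is PENDING, so the job is accepted.
        System.out.println("initial submission accepted: " + submitJob(registry, "job-1"));
        // After a failover the status is still RUNNING, so recovery is rejected.
        System.out.println("resubmission after failover accepted: " + submitJob(registry, "job-1"));
    }
}
```

In this model the second call returns false because the status was left as {{RUNNING}}, which matches the observation that no jobs are running after the new leader is elected.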
> Dispatcher does not recover jobs
> --------------------------------
>
> Key: FLINK-8488
> URL: https://issues.apache.org/jira/browse/FLINK-8488
> Project: Flink
> Issue Type: Bug
> Components: Distributed Coordination
> Affects Versions: 1.5.0
> Environment: 776af4a882c85926fc0764b702fec717c675e34c
> Reporter: Gary Yao
> Priority: Blocker
> Labels: flip-6
> Fix For: 1.5.0