[ https://issues.apache.org/jira/browse/FLINK-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maximilian Michels resolved FLINK-4141.
---------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.1.0

Resolved with f722b73772eb66cdb79a288300e38ff7026c7e1f

> TaskManager failures not always recover when killed during an ApplicationMaster failure in HA mode on Yarn
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-4141
>                 URL: https://issues.apache.org/jira/browse/FLINK-4141
>             Project: Flink
>          Issue Type: Bug
>    Affects Versions: 1.0.3
>            Reporter: Stefan Richter
>            Assignee: Maximilian Michels
>             Fix For: 1.1.0
>
>
> High availability on Yarn often fails to recover in the following test scenario:
> 1. Kill the application master process.
> 2. Then, while the application master is recovering, randomly kill several task managers (with some delay).
> After the application master has recovered, not all of the killed task managers are brought back, and no further attempts are made to restart them.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)