[ https://issues.apache.org/jira/browse/TEZ-1559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeff Zhang updated TEZ-1559:
----------------------------
    Summary: Add system tests for AM recovery  (was: System tests for AM recovery)

> Add system tests for AM recovery
> --------------------------------
>
>                 Key: TEZ-1559
>                 URL: https://issues.apache.org/jira/browse/TEZ-1559
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>
> * [Fine-grained task-level recovery] In a vertex, task 0 is done and task 1 is
> running. A history flush happens, then the AM dies. Once the AM is recovered,
> task 0 is not re-run; task 1 is re-run.
> * [Data movement types] Test AM recovery with all data movement types,
> including 1-1, broadcast, and scatter-gather with/without shuffle. The AM should
> die in 2 scenarios: after the first vertex's tasks have all finished, and after
> only some of them have finished.
> * [Kill AM many times] Set the AM max attempts to a high number and kill many
> attempts. The last AM attempt can still be recovered from the latest AM history
> data.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
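
For illustration only, the expected outcome of the first scenario (task-level recovery) can be sketched as a toy model in plain Java. This is not Tez code: `RecoverySketch`, `HistoryLog`, and `tasksToRerun` are hypothetical names, and the assumption that only flushed history events survive an AM crash is a simplification made for this sketch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of task-level AM recovery. None of these names are Tez APIs. */
public class RecoverySketch {

    /** Stand-in for the recovery log: only flushed events survive an AM crash (assumption). */
    static final class HistoryLog {
        private final Set<Integer> buffered = new HashSet<>();
        private final Set<Integer> flushed = new HashSet<>();

        /** Record a task-finished event in the in-memory buffer. */
        void taskFinished(int taskId) {
            buffered.add(taskId);
        }

        /** Persist all buffered events, as a history flush would. */
        void flush() {
            flushed.addAll(buffered);
            buffered.clear();
        }

        /** Finished tasks that a new AM attempt can still see after a crash. */
        Set<Integer> survivingFinishedTasks() {
            return new HashSet<>(flushed);
        }
    }

    /** Tasks a recovered AM attempt must re-run: every task not recorded as finished. */
    static List<Integer> tasksToRerun(int numTasks, HistoryLog log) {
        Set<Integer> done = log.survivingFinishedTasks();
        List<Integer> rerun = new ArrayList<>();
        for (int t = 0; t < numTasks; t++) {
            if (!done.contains(t)) {
                rerun.add(t);
            }
        }
        return rerun;
    }

    public static void main(String[] args) {
        HistoryLog log = new HistoryLog();
        log.taskFinished(0); // task 0 is done; task 1 is still running
        log.flush();         // history flush happens
        // The AM dies here; the next attempt reads only the flushed history.
        System.out.println(tasksToRerun(2, log)); // prints [1]
    }
}
```

A real system test would assert the same property against an actual DAG run: after recovery, task 0 is not re-run and task 1 is. Because the surviving history does not change between attempts, killing the recovered attempt again leaves the model with the same answer, which mirrors the "kill AM many times" scenario.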