[ https://issues.apache.org/jira/browse/MAPREDUCE-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne resolved MAPREDUCE-3286.
-----------------------------------
          Resolution: Invalid
    Target Version/s:   (was: )
        Release Note:   (was: New Yarn configuration property:

Name: yarn.app.mapreduce.am.scheduler.connection.retries
Description: Number of times AM should retry to contact RM if connection is lost.)

RM has been refactored and restructured a few times in the past 3.5 years. Closing as invalid.

> Unit tests for MAPREDUCE-3186 - User jobs are getting hanged if the Resource
> manager process goes down and comes up while job is getting executed.
> --------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3286
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3286
>             Project: Hadoop Map/Reduce
>          Issue Type: Test
>          Components: mrv2
>    Affects Versions: 0.23.0
>         Environment: linux
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>              Labels: test
>
> If the resource manager is restarted while a job is executing, the job hangs.
> The UI still shows the job as running.
> The RM log shows the error "ERROR
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
> AppAttemptId doesnt exist in cache appattempt_1318579738195_0004_000001"
> On the console, the MRAppMaster and RunJar processes are not killed.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)