[ https://issues.apache.org/jira/browse/MAPREDUCE-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Joseph Evans updated MAPREDUCE-3711:
-------------------------------------------

    Attachment: MR-3711.txt

OK, I figured out the issue. It appears that fs.rename on the local file system will create parent directories for you, whereas on HDFS it does not.

Also, I found out that if there is an error during recovery, it can cause the AM to fail, which does not result in another retry. I will try to reproduce the issue again and file a JIRA for it.

> AppMaster recovery for Medium to large jobs take long time
> ----------------------------------------------------------
>
>                 Key: MAPREDUCE-3711
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3711
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Siddharth Seth
>            Assignee: Robert Joseph Evans
>            Priority: Blocker
>         Attachments: MR-3711.txt, MR-3711.txt
>
>
> Reported by [~karams]
> yarn.resourcemanager.am.max-retries=2
> Ran test cases with a sort job on 350 scale having 16800 maps and 680 reduces:
> 1. After 70 secs of job submission, the AM was killed using kill -9; around 3900
> maps were completed and 680 reduces were scheduled. The second AM was restarted
> and the job completed in 980 secs. The AM took very little time to recover.
> 2. After 150 secs of job submission, the AM was killed using kill -9; around 90%
> of maps were completed and 680 reduces were scheduled. The second AM was
> restarted and the job completed in 1000 secs. The AM recovered.
> 3. After 150 secs of job submission, the AM was killed using kill -9; almost all
> maps were completed and only 680 reduces were running. Recovery was too slow:
> the AM was still recovering after 1 hr 40 min when I killed the run.
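
To illustrate the fs.rename difference noted in the comment above, here is a minimal sketch of the defensive pattern: create the destination's parent directories explicitly before renaming, so the code does not depend on the file system doing it implicitly. It uses java.nio.file as a stand-in so that it runs without Hadoop on the classpath. The class and method names (SafeRename, renameCreatingParents) are hypothetical, and this is not taken from the attached MR-3711.txt patch. In Hadoop code the analogous calls would presumably be FileSystem.mkdirs(dst.getParent()) followed by FileSystem.rename(src, dst); that mapping is an assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: not part of Hadoop and not from the MR-3711 patch.
public class SafeRename {

    /**
     * Moves src to dst, creating dst's parent directories first.
     * Without the explicit createDirectories call, Files.move fails
     * when the parent of dst does not exist, which mirrors the HDFS
     * behavior described in the comment.
     */
    public static Path renameCreatingParents(Path src, Path dst) throws IOException {
        Path parent = dst.getParent();
        if (parent != null) {
            // No-op when the directory already exists.
            Files.createDirectories(parent);
        }
        return Files.move(src, dst);
    }
}
```

Calling SafeRename.renameCreatingParents(src, dst) with a dst whose parent directory is missing should then succeed rather than throw, regardless of whether the underlying file system creates parents on rename.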