[ https://issues.apache.org/jira/browse/MAPREDUCE-6277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367549#comment-14367549 ]
Jason Lowe commented on MAPREDUCE-6277:
---------------------------------------

Thanks for updating the patch, Chang. One last nit with the test: there's a bunch of code in the catch clause that doesn't need to be there. The only thing that I would expect to be there is the exception message assert. Everything else can be moved outside of the catch clause.

> Job can post multiple history files if attempt loses connection to the RM
> -------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6277
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6277
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am
>    Affects Versions: 2.7.0
>            Reporter: Chang Li
>            Assignee: Chang Li
>         Attachments: MAPREDUCE-6277.patch, YARN-3335.1.patch, YARN-3335.2.patch
>
>
> Related to the fixed issue MAPREDUCE-6230, which causes a job to get into an
> error state. In that situation the job's second or a later attempt could
> succeed, but those later attempts' history files will all be lost. This happens
> because the first attempt, in the error state, copies its history file to the
> intermediate dir while mistakenly thinking of itself as the last attempt. The
> JobHistory server later moves the history file of that errored attempt from the
> intermediate dir to the done dir while ignoring the later attempts' history
> files for that job in the intermediate dir.
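
For illustration, the test layout requested in the review comment above might look roughly like the following. This is only a sketch of the general pattern, not code from the patch: `FakeAttempt`, `unregister`, `cleanedUp`, the `check` helper, and the exception message are all hypothetical names made up for the example, and plain Java checks stand in for whatever assertion library the real test uses.

```java
public class CatchClauseSketch {

    // Hypothetical stand-in for the object under test; not taken from the patch.
    static class FakeAttempt {
        boolean cleanedUp = false;

        void unregister() {
            cleanedUp = true;
            // Simulates the failure the test expects to see.
            throw new IllegalStateException("Could not contact RM");
        }
    }

    // Minimal assertion helper so the sketch needs no test framework.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    // Returns true when every check passes; throws AssertionError otherwise.
    static boolean runTest() {
        FakeAttempt attempt = new FakeAttempt();
        boolean sawException = false;
        try {
            attempt.unregister();
        } catch (IllegalStateException e) {
            // Only the exception-specific check lives in the catch clause.
            check(e.getMessage().contains("Could not contact RM"),
                  "unexpected exception message");
            sawException = true;
        }
        // Everything else sits outside the catch clause, so these checks
        // run whether or not the exception was actually thrown.
        check(sawException, "expected an exception");
        check(attempt.cleanedUp, "expected cleanup to have run");
        return true;
    }

    public static void main(String[] args) {
        System.out.println(runTest() ? "passed" : "failed");
    }
}
```

Keeping the state checks outside the catch clause means they cannot be silently skipped if the call unexpectedly stops throwing, which is presumably the motivation behind the nit.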