[ https://issues.apache.org/jira/browse/YARN-1469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
qus-jiawei resolved YARN-1469.
------------------------------
    Resolution: Duplicate

> ApplicationMaster crash cause the TaskAttemptImpl couldn't handle the TA_TOO_MANY_FETCH_FAILURE at KILLED
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-1469
>                 URL: https://issues.apache.org/jira/browse/YARN-1469
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: qus-jiawei
>         Attachments: job_1384857622207_222215-amlog.txt
>
>
> This bug can occur when the decommission command is used to decommission a NodeManager. The details are below:
> 1. A job is running normally on the YARN cluster. Some MapTasks finish on machine A, and the reduce tasks begin to be scheduled. At this point the MapTasks' state is SUCCEEDED.
> 2. The Hadoop admin decommissions machine A's NodeManager.
> 3. The ApplicationMaster finds that some MapTasks had finished on a decommissioned NodeManager and changes their state to KILLED.
> 4. Some running ReduceTasks cannot fetch the data from those MapTasks and send a TA_TOO_MANY_FETCH_FAILURE event to TaskAttemptImpl.
> 5. TaskAttemptImpl cannot handle TA_TOO_MANY_FETCH_FAILURE in the KILLED state, so it throws an exception, which puts the ApplicationMaster into the ERROR state.
> I think TaskAttemptImpl could simply ignore the TA_TOO_MANY_FETCH_FAILURE event in the KILLED state.
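
The failure sequence and the proposed fix can be illustrated with a small standalone sketch. It does not use Hadoop's own state machine classes and is not the actual patch. The class TaskAttemptSketch, its reduced two-state enum, the TA_KILL event used here to reach KILLED, and the ignoreFetchFailureWhenKilled flag are hypothetical names chosen for illustration. Only TA_TOO_MANY_FETCH_FAILURE and the SUCCEEDED and KILLED states are taken from the report.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, simplified model of a task attempt's state machine.
// Not Hadoop code: it only mimics "an event with no registered
// transition in the current state raises an exception".
public class TaskAttemptSketch {
    public enum State { SUCCEEDED, KILLED }
    public enum Event { TA_KILL, TA_TOO_MANY_FETCH_FAILURE }

    // For each state: the events it accepts and the state each one leads to.
    private final Map<State, Map<Event, State>> transitions = new EnumMap<>(State.class);
    private State current = State.SUCCEEDED;

    public TaskAttemptSketch(boolean ignoreFetchFailureWhenKilled) {
        for (State s : State.values()) {
            transitions.put(s, new EnumMap<>(Event.class));
        }
        // Step 3 of the report: a finished map attempt is moved to KILLED.
        transitions.get(State.SUCCEEDED).put(Event.TA_KILL, State.KILLED);
        if (ignoreFetchFailureWhenKilled) {
            // Proposed fix: accept the event in KILLED and stay in KILLED.
            transitions.get(State.KILLED).put(Event.TA_TOO_MANY_FETCH_FAILURE, State.KILLED);
        }
    }

    public State handle(Event event) {
        State next = transitions.get(current).get(event);
        if (next == null) {
            // Stands in for the unhandled-event exception described in step 5.
            throw new IllegalStateException("Invalid event " + event + " at " + current);
        }
        current = next;
        return current;
    }

    public static void main(String[] args) {
        TaskAttemptSketch withFix = new TaskAttemptSketch(true);
        withFix.handle(Event.TA_KILL);
        System.out.println("with fix: " + withFix.handle(Event.TA_TOO_MANY_FETCH_FAILURE));

        TaskAttemptSketch withoutFix = new TaskAttemptSketch(false);
        withoutFix.handle(Event.TA_KILL);
        try {
            withoutFix.handle(Event.TA_TOO_MANY_FETCH_FAILURE);
        } catch (IllegalStateException e) {
            System.out.println("without fix: " + e.getMessage());
        }
    }
}
```

Registering a self-transition keeps the attempt in KILLED and discards the late event, which is what "just ignore" amounts to in a table-driven state machine. Whether ignoring is the right behavior for the real TaskAttemptImpl is a design question for the issue this one duplicates.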