[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sultan Alamro updated MAPREDUCE-7245:
-------------------------------------
    Description: 
When we set *mapreduce.map.maxattempts* to 1, the reduce tasks should ignore
the output of failed map tasks, as stated in the EventFetcher class. However,
it turns out that this only happens when a map task transitions from RUNNING
to FAILED, not from SCHEDULED to FAILED.
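
For context, a minimal driver sketch of the setup where we hit this (the class
and job names are just placeholders; mapper, reducer, input and output are
wired up as in any normal job):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.MRJobConfig;

public class SingleAttemptRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // every map task gets exactly one attempt, so a failed attempt is final
    conf.setInt(MRJobConfig.MAP_MAX_ATTEMPTS, 1); // mapreduce.map.maxattempts = 1
    Job job = Job.getInstance(conf, "single-attempt-repro");
    // ... set mapper, reducer, input and output paths here ...
    // expected: reducers ignore the output of a failed map and finish;
    // observed: they hang when the attempt fails while still SCHEDULED
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}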

 

I think this problem can be solved in TaskImpl.java by adding an else branch
for the case where no container was assigned (the attempt then has no node
HTTP address):

 

if (attempt.getNodeHttpAddress() != null) {
  TaskAttemptCompletionEvent tce = recordFactory
      .newRecordInstance(TaskAttemptCompletionEvent.class);
  tce.setEventId(-1);
  String scheme = (encryptedShuffle) ? "https://" : "http://";
  tce.setMapOutputServerAddress(StringInterner.weakIntern(scheme
      + attempt.getNodeHttpAddress().split(":")[0] + ":"
      + attempt.getShufflePort()));
  tce.setStatus(status);
  tce.setAttemptId(attempt.getID());
  int runTime = 0;
  if (attempt.getFinishTime() != 0 && attempt.getLaunchTime() != 0)
    runTime = (int) (attempt.getFinishTime() - attempt.getLaunchTime());
  tce.setAttemptRunTime(runTime);

  // raise the event to the job so that it adds the completion event to its
  // data structures
  eventHandler.handle(new JobTaskAttemptCompletedEvent(tce));
} else {
  // the attempt never got a container, so there is no map output server to
  // report; still publish the completion event so reducers learn the attempt
  // failed instead of waiting for its output forever
  TaskAttemptCompletionEvent tce = recordFactory
      .newRecordInstance(TaskAttemptCompletionEvent.class);
  tce.setEventId(-1);
  tce.setStatus(status);
  tce.setAttemptId(attempt.getID());
  eventHandler.handle(new JobTaskAttemptCompletedEvent(tce));
}
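
For reference, this is a rough sketch (not the actual Hadoop source;
registerMapOutput and ignoreMapOutput are placeholder helpers) of the dispatch
the reduce-side event consumer is expected to perform: only SUCCEEDED events
ever use the map output server address, so a completion event without one
should be safe to publish.

import org.apache.hadoop.mapred.TaskAttemptID;
import org.apache.hadoop.mapred.TaskCompletionEvent;

abstract class CompletionEventSketch {

  void onCompletionEvent(TaskCompletionEvent event) {
    switch (event.getTaskStatus()) {
      case SUCCEEDED:
        // only successful attempts need the HTTP address of the map output
        registerMapOutput(event.getTaskAttemptId(), event.getTaskTrackerHttp());
        break;
      case FAILED:
      case KILLED:
      case OBSOLETE:
      case TIPFAILED:
        // with mapreduce.map.maxattempts = 1 a failed map is final; the
        // reducer has to see this event or it keeps waiting for that output
        ignoreMapOutput(event.getTaskAttemptId());
        break;
      default:
        break;
    }
  }

  // placeholder helpers, not real Hadoop APIs
  abstract void registerMapOutput(TaskAttemptID attemptId, String httpAddress);
  abstract void ignoreMapOutput(TaskAttemptID attemptId);
}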

  was:When we set *mapreduce.map.maxattempts* to 1, the reduce tasks should
ignore the output of failed map tasks, as stated in the EventFetcher class.
However, it turns out that this only happens when a map task transitions from
RUNNING to FAILED, not from SCHEDULED to FAILED


> Reduce phase does not continue processing with failed SCHEDULED Map tasks
> -------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7245
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7245
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.7.2, 3.2.1
>            Reporter: Sultan Alamro
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

