[ https://issues.apache.org/jira/browse/TEZ-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057021#comment-14057021 ]
Bikas Saha edited comment on TEZ-657 at 7/10/14 1:53 AM:
---------------------------------------------------------
TaskSchedulerEventHandler sets the appropriate info in
AMContainerEventCompleted.
When AMContainer receives the event, it checks for the preempted and disk-failed
statuses and sends the appropriate events to TaskAttempt.
Renamed TaskAttemptEventType.PREEMPTED to TERMINATED_BY_SYSTEM so that one event
type covers generic system terminations, instead of duplicating it as PREEMPTION
and DISK_FAILED.
TaskAttemptImpl already handles this event for our internal preemption, so no
changes are needed there. External preemption and disk failures are now handled
the same way.
If needed, the actual status can be passed via the event later on.
Added tests.
There is no need to fail nodes on disk failure because the NM will remain
usable with the remaining disks. If too many disks fail, the NM will mark itself
unhealthy, and we already handle that.
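For illustration, the flow described above can be sketched roughly as follows. This is a simplified sketch and not the code in the patch. The class ContainerExitSketch, the enums, the CONTAINER_TERMINATED event and the methods eventFor/outcomeFor are hypothetical names. The integer constants are assumed to mirror YARN's ContainerExitStatus values. Treating every other exit as a FAILED attempt is a simplification made for the example.

```java
// Hypothetical sketch: names are illustrative and do not match Tez's classes.
final class ContainerExitSketch {
    // Assumed to mirror YARN's ContainerExitStatus constants.
    static final int DISKS_FAILED = -101;
    static final int PREEMPTED = -102;

    // TERMINATED_BY_SYSTEM is the renamed event from the comment above;
    // CONTAINER_TERMINATED is a made-up stand-in for any other container exit.
    enum AttemptEvent { TERMINATED_BY_SYSTEM, CONTAINER_TERMINATED }

    enum AttemptOutcome { KILLED, FAILED }

    // What the container-side check might do: map the exit status reported
    // with the completion event to the event sent to the task attempt.
    static AttemptEvent eventFor(int exitStatus) {
        switch (exitStatus) {
            case PREEMPTED:
            case DISKS_FAILED:
                // Both are system-initiated, so they share one event type.
                return AttemptEvent.TERMINATED_BY_SYSTEM;
            default:
                return AttemptEvent.CONTAINER_TERMINATED;
        }
    }

    // What the attempt-side handling might do: system terminations count as
    // KILLED rather than as task failures.
    static AttemptOutcome outcomeFor(AttemptEvent event) {
        return event == AttemptEvent.TERMINATED_BY_SYSTEM
                ? AttemptOutcome.KILLED
                : AttemptOutcome.FAILED;
    }

    public static void main(String[] args) {
        System.out.println(outcomeFor(eventFor(PREEMPTED)));    // prints KILLED
        System.out.println(outcomeFor(eventFor(DISKS_FAILED))); // prints KILLED
        System.out.println(outcomeFor(eventFor(1)));            // prints FAILED
    }
}
```

Nothing in the sketch marks the node as failed on DISKS_FAILED, in line with the reasoning above that the NM stays usable with its remaining disks.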
[~sseth] please review.
was (Author: bikassaha):
TaskSchedulerEventHandler sets the appropriate info in
AMContainerEventCompleted.
When AMContainer gets the event then it checks for preempted and diskfailed and
sends appropriate events to TaskAttempt.
Renamed TaskAttemptEventType.PREEMPTED to TERMINATED_BY_SYSTEM for generic
system terminations instead of duplicating PREEMPTION and DISK_FAILED. If
needed the actual status can be passed via the event later on.
Added tests.
There is no need to fail nodes on disk failure because the NM will remain
usable with the remaining disks. If too many disks fail, NM will mark itself
unhealthy and we handle that already.
[~sseth] please review.
> Tez should process the Container exit status - specifically when the RM
> preempts a container
> --------------------------------------------------------------------------------------------
>
> Key: TEZ-657
> URL: https://issues.apache.org/jira/browse/TEZ-657
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Siddharth Seth
> Assignee: Bikas Saha
> Attachments: TEZ-657.1.patch
>
>
> Containers preempted by the RM will currently register as task failures -
> these tasks should be considered to be KILLED instead.
> Handling the entire preemption hint logic would be a separate jira.