[ 
https://issues.apache.org/jira/browse/TEZ-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15062896#comment-15062896
 ] 

Jason Lowe commented on TEZ-3009:
---------------------------------

Sample container log showing the problem:
{noformat}
2015-12-11 18:53:23,832 [INFO] [TezChild] |task.ContainerReporter|: Attempting to fetch new task for container container_e06_1449209941524_271349_01_002271
2015-12-11 18:53:23,879 [INFO] [main] |task.TezChild|: Shutdown invoked for container container_e06_1449209941524_271349_01_002271
2015-12-11 18:53:23,880 [INFO] [main] |task.TezChild|: Shutting down container container_e06_1449209941524_271349_01_002271
{noformat}

There's straight-line code between the "Attempting to fetch new task ..." log 
message and a later "Got TaskUpdate for ..." log message. However, we never see 
the latter message, so something must have thrown an exception in between. 
Unfortunately, the code that catches the exception stashes it in a return 
result, and the main code subsequently ignores that result without logging it.
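
To illustrate the pattern, here is a minimal, self-contained sketch. It is not 
the actual Tez code: the class, the Result type, and the method names are all 
hypothetical, and the task is reduced to a plain String. It only shows how an 
exception captured in a return value disappears when the caller never inspects 
it, and one possible way to surface it.

```java
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch of the failure mode; none of these names come from Tez.
public class TaskAcquisitionSketch {
    private static final Logger LOG =
        Logger.getLogger(TaskAcquisitionSketch.class.getName());

    /** Hypothetical result: holds either a task or the error that prevented fetching one. */
    public static final class Result {
        public final String task;      // null when no task was obtained
        public final Throwable error;  // null when the fetch did not throw

        public Result(String task, Throwable error) {
            this.task = task;
            this.error = error;
        }
    }

    /** Catches any failure and folds it into the result instead of propagating it. */
    public static Result acquire(Callable<String> fetcher) {
        try {
            return new Result(fetcher.call(), null);
        } catch (Exception e) {
            return new Result(null, e);
        }
    }

    /** The problematic pattern: decide to shut down without ever looking at the error. */
    public static String describeSilently(Result r) {
        return r.task != null ? "run " + r.task : "shutdown";
    }

    /** One possible fix: log the captured error before shutting down. */
    public static String describeWithLogging(Result r) {
        if (r.error != null) {
            LOG.log(Level.SEVERE, "Error while acquiring task, shutting down", r.error);
            return "shutdown: " + r.error.getMessage();
        }
        return r.task != null ? "run " + r.task : "shutdown";
    }

    public static void main(String[] args) {
        Result failed = acquire(() -> {
            throw new IllegalStateException("simulated fetch failure");
        });
        System.out.println(describeSilently(failed));    // cause is lost
        System.out.println(describeWithLogging(failed)); // cause is reported
    }
}
```

With the silent variant, the only visible outcome is the shutdown itself, which 
matches the container log above. The logging variant keeps the same control 
flow but records the exception before the container goes away.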

> Errors that occur during container task acquisition are not logged
> ------------------------------------------------------------------
>
>                 Key: TEZ-3009
>                 URL: https://issues.apache.org/jira/browse/TEZ-3009
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Jason Lowe
>
> If TezChild encounters an error while trying to obtain a task, the error will 
> be silently handled.  This results in a mysterious shutdown of containers 
> with no logged cause.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
