[ https://issues.apache.org/jira/browse/TEZ-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15062896#comment-15062896 ]
Jason Lowe commented on TEZ-3009:
---------------------------------

Sample container log showing the problem:
{noformat}
2015-12-11 18:53:23,832 [INFO] [TezChild] |task.ContainerReporter|: Attempting to fetch new task for container container_e06_1449209941524_271349_01_002271
2015-12-11 18:53:23,879 [INFO] [main] |task.TezChild|: Shutdown invoked for container container_e06_1449209941524_271349_01_002271
2015-12-11 18:53:23,880 [INFO] [main] |task.TezChild|: Shutting down container container_e06_1449209941524_271349_01_002271
{noformat}

There's straight-line code between the "Attempting to fetch new task ..." log message and a later log of "Got TaskUpdate for ...". However, we don't see the latter log message, so something threw an exception. Unfortunately, the code that catches the exception squirrels it into a return result that is subsequently ignored by the main code without logging it.

> Errors that occur during container task acquisition are not logged
> ------------------------------------------------------------------
>
>                 Key: TEZ-3009
>                 URL: https://issues.apache.org/jira/browse/TEZ-3009
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Jason Lowe
>
> If TezChild encounters an error while trying to obtain a task the error will
> be silently handled. This results in a mysterious shutdown of containers
> with no cause.
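
The failure mode described in the comment can be illustrated with a small, self-contained sketch. This is not the Tez code: `TaskAcquisitionSketch`, `FetchResult`, `fetchTask`, `shutdownSilently`, and `shutdownWithCause` are hypothetical names invented for illustration, and `java.util.logging` stands in for whatever logger the real code uses. It only shows how an exception captured in a return value disappears when the caller never inspects it, and how logging it before shutdown would surface the cause.

```java
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch; none of these names come from the Tez code base.
public class TaskAcquisitionSketch {
    private static final Logger LOG =
            Logger.getLogger(TaskAcquisitionSketch.class.getName());

    /** Outcome of one fetch attempt: a task name, or the error that prevented it. */
    static final class FetchResult {
        final String task;      // null when the fetch failed
        final Throwable error;  // null when the fetch succeeded

        FetchResult(String task, Throwable error) {
            this.task = task;
            this.error = error;
        }
    }

    /** Runs the fetch and stores any exception in the result instead of rethrowing. */
    static FetchResult fetchTask(Callable<String> fetcher) {
        try {
            return new FetchResult(fetcher.call(), null);
        } catch (Exception e) {
            return new FetchResult(null, e);
        }
    }

    /** Problematic caller: sees no task, shuts down, and never looks at the error. */
    static String shutdownSilently(FetchResult result) {
        return result.task == null ? "Shutdown invoked" : "Running " + result.task;
    }

    /** Caller that logs the captured error before shutting down. */
    static String shutdownWithCause(FetchResult result) {
        if (result.error != null) {
            LOG.log(Level.SEVERE, "Error while fetching new task", result.error);
            return "Shutdown invoked: " + result.error;
        }
        return result.task == null ? "Shutdown invoked" : "Running " + result.task;
    }

    public static void main(String[] args) {
        // Simulate a fetch that fails partway through.
        FetchResult failed = fetchTask(() -> {
            throw new IllegalStateException("simulated fetch failure");
        });
        System.out.println(shutdownSilently(failed));
        System.out.println(shutdownWithCause(failed));
    }
}
```

In the first caller the log would look like the sample above: a fetch attempt followed directly by a shutdown, with nothing in between. The second caller leaves a stack trace in the log, which is the kind of evidence the issue asks for.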