[ 
https://issues.apache.org/jira/browse/HIVE-12577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15056438#comment-15056438
 ] 

Siddharth Seth commented on HIVE-12577:
---------------------------------------

EntityTracker tracks the relationship between containers and tasks, along with 
the nodes they run on. This is used for various bits of accounting - including 
telling unknown fragments to die, processing heartbeats for fragments which are 
in the wait queue of an llap instance.

There were some discrepancies in this tracking, the most important one being 
the null check which causes the exception. The patch fixes these and adds some 
unit tests for verification.

> NPE in LlapTaskCommunicator when unregistering containers
> ---------------------------------------------------------
>
>                 Key: HIVE-12577
>                 URL: https://issues.apache.org/jira/browse/HIVE-12577
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>    Affects Versions: 2.0.0
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>            Priority: Critical
>         Attachments: HIVE-12577.1.review.txt, HIVE-12577.1.txt, 
> HIVE-12577.1.wip.txt
>
>
> {code}
> 2015-12-02 13:29:00,160 [ERROR] [Dispatcher thread {Central}] 
> |common.AsyncDispatcher|: Error in dispatcher thread
> java.lang.NullPointerException
>         at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskCommunicator$EntityTracker.unregisterContainer(LlapTaskCommunicator.java:586)
>         at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskCommunicator.registerContainerEnd(LlapTaskCommunicator.java:188)
>         at 
> org.apache.tez.dag.app.TaskCommunicatorManager.unregisterRunningContainer(TaskCommunicatorManager.java:389)
>         at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl.unregisterFromTAListener(AMContainerImpl.java:1121)
>         at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtLaunchingTransition.transition(AMContainerImpl.java:699)
>         at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtIdleTransition.transition(AMContainerImpl.java:805)
>         at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:892)
>         at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:887)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:415)
>         at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:72)
>         at 
> org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:60)
>         at 
> org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:36)
>         at 
> org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
>         at 
> org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
>         at java.lang.Thread.run(Thread.java:745)
> 2015-12-02 13:29:00,167 [ERROR] [Dispatcher thread {Central}] 
> |common.AsyncDispatcher|: Error in dispatcher thread
> java.lang.NullPointerException
>         at 
> org.apache.tez.dag.app.TaskCommunicatorManager.unregisterRunningContainer(TaskCommunicatorManager.java:386)
>         at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl.unregisterFromTAListener(AMContainerImpl.java:1121)
>         at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtLaunchingTransition.transition(AMContainerImpl.java:699)
>         at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtIdleTransition.transition(AMContainerImpl.java:805)
>         at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:892)
>         at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:887)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:415)
>         at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:72)
>         at 
> org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:60)
>         at 
> org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:36)
>         at 
> org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
>         at 
> org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
>         at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to