[jira] [Updated] (TEZ-2855) NPE while routing events
[ https://issues.apache.org/jira/browse/TEZ-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated TEZ-2855:

    Attachment: TEZ-2855.3.txt

Thanks for the reviews. Updated the patch to send one more event after moving into the INITED state.
Committing in a bit, will post patches for 0.7 and 0.6 in a while as well.

> NPE while routing events
>
> Key: TEZ-2855
> URL: https://issues.apache.org/jira/browse/TEZ-2855
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.5.0
> Reporter: Siddharth Seth
> Assignee: Siddharth Seth
> Priority: Critical
> Attachments: 2855log.gz, TEZ-2855.1.txt, TEZ-2855.2.txt, TEZ-2855.3.txt
>
> Observed while running against 0.8.0-alpha. This will likely affect 0.7 as well - that'll be known after debugging.
> {code}
> 2015-09-24T12:13:42,675 ERROR [Dispatcher thread: Central] common.AsyncDispatcher: Error in dispatcher thread
> java.lang.NullPointerException
>   at org.apache.tez.dag.app.dag.impl.VertexImpl.handleRoutedTezEvents(VertexImpl.java:4429) ~[TezAppJar.jar:0.8.0-alpha]
>   at org.apache.tez.dag.app.dag.impl.VertexImpl.access$4000(VertexImpl.java:203) ~[TezAppJar.jar:0.8.0-alpha]
>   at org.apache.tez.dag.app.dag.impl.VertexImpl$RouteEventTransition.transition(VertexImpl.java:4175) ~[TezAppJar.jar:0.8.0-alpha]
>   at org.apache.tez.dag.app.dag.impl.VertexImpl$RouteEventTransition.transition(VertexImpl.java:4167) ~[TezAppJar.jar:0.8.0-alpha]
>   at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) ~[hadoop-yarn-common-2.6.0.jar:?]
>   at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) ~[hadoop-yarn-common-2.6.0.jar:?]
>   at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) ~[hadoop-yarn-common-2.6.0.jar:?]
>   at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) ~[hadoop-yarn-common-2.6.0.jar:?]
>   at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:57) ~[TezAppJar.jar:0.8.0-alpha]
>   at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1906) ~[TezAppJar.jar:0.8.0-alpha]
>   at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:202) ~[TezAppJar.jar:0.8.0-alpha]
>   at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2069) ~[TezAppJar.jar:0.8.0-alpha]
>   at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2055) ~[TezAppJar.jar:0.8.0-alpha]
>   at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183) [tez-common-0.8.0-alpha.jar:0.8.0-alpha]
>   at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114) [tez-common-0.8.0-alpha.jar:0.8.0-alpha]
>   at java.lang.Thread.run(Thread.java:745) [?:1.8.0_40]
> 2015-09-24T12:13:42,681 INFO [HistoryEventHandlingThread] impl.SimpleHistoryLoggingService: Writing event TASK_ATTEMPT_FINISHED to history file
> {code}
> Looks like the VertexManager was null.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (TEZ-2855) NPE while routing events
[ https://issues.apache.org/jira/browse/TEZ-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated TEZ-2855:

    Attachment: TEZ-2855.2.txt

> NPE while routing events
>
> Key: TEZ-2855
> URL: https://issues.apache.org/jira/browse/TEZ-2855
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.5.0
> Reporter: Siddharth Seth
> Assignee: Siddharth Seth
> Priority: Critical
> Attachments: 2855log.gz, TEZ-2855.1.txt, TEZ-2855.2.txt
[jira] [Updated] (TEZ-2855) NPE while routing events
[ https://issues.apache.org/jira/browse/TEZ-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated TEZ-2855:

    Attachment: TEZ-2855.1.txt

Patch for master to fix the VM NPE.
On the logging changes: that's a bigger problem, since we aren't handling RuntimeExceptions. Created TEZ-2862 to track this. For the exceptions we do handle, the vertex name and ID are already logged.

[~bikassaha], [~hitesh], [~zjffdu] - please review.

> NPE while routing events
>
> Key: TEZ-2855
> URL: https://issues.apache.org/jira/browse/TEZ-2855
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.5.0
> Reporter: Siddharth Seth
> Assignee: Siddharth Seth
> Priority: Critical
> Attachments: 2855log.gz, TEZ-2855.1.txt
[jira] [Updated] (TEZ-2855) NPE while routing events
[ https://issues.apache.org/jira/browse/TEZ-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated TEZ-2855:

    Assignee: Siddharth Seth
    Affects Version/s: 0.5.0 (was: 0.8.0-alpha)
    Target Version/s: 0.7.1, 0.6.3, 0.8.1 (was: 0.8.1)

This goes all the way back to 0.5.
If initialization of a vertex is delayed (likely due to a large number of upstream vertices), and a task from an already started vertex finishes very quickly and generates an event for the uninitialized vertex, we try to handle that event before the VertexManager (VM) is set up.
InputInitializerEvents are not affected, since these events are cached while a vertex is in state NEW.

This was hit running LLAP unit tests, where task assignment and execution can be faster. The faster assignment and execution allows the condition to be hit. It is possible to hit this in regular jobs as well, but less likely, since there's generally a delay in a container getting work. Hitting it in local mode is possible though.

Targeting the fix as far back as 0.6.

> NPE while routing events
>
> Key: TEZ-2855
> URL: https://issues.apache.org/jira/browse/TEZ-2855
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.5.0
> Reporter: Siddharth Seth
> Assignee: Siddharth Seth
> Priority: Critical
> Attachments: 2855log.gz
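
The race described in this update (an event reaching a vertex before its VertexManager exists) can be illustrated with a small standalone sketch. This is not the TEZ-2855 patch and uses no Tez classes; `PendingEventVertex`, `RecordingManager`, and the string events are hypothetical names chosen for illustration. It assumes one possible mitigation, similar in spirit to the caching of InputInitializerEvents mentioned above: hold events that arrive while the vertex is still NEW and replay them once initialization completes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch; none of these names come from the Tez code base.
class PendingEventVertex {
  enum State { NEW, INITED }

  /** Stand-in for a vertex manager: it only records what it is asked to route. */
  static final class RecordingManager {
    final List<String> routed = new ArrayList<>();

    void route(String event) {
      routed.add(event);
    }
  }

  private State state = State.NEW;
  private RecordingManager manager; // null until init() runs
  private final Queue<String> pending = new ArrayDeque<>();

  /** Routes the event if the vertex is initialized; otherwise holds it. */
  void handle(String event) {
    if (state == State.NEW) {
      // Calling manager.route(event) here would throw a NullPointerException.
      pending.add(event);
      return;
    }
    manager.route(event);
  }

  /** Sets the manager, moves to INITED, then replays held events in arrival order. */
  void init(RecordingManager m) {
    manager = m;
    state = State.INITED;
    String event;
    while ((event = pending.poll()) != null) {
      manager.route(event);
    }
  }

  State state() {
    return state;
  }

  public static void main(String[] args) {
    PendingEventVertex vertex = new PendingEventVertex();
    RecordingManager manager = new RecordingManager();
    vertex.handle("event-from-fast-upstream-task"); // arrives before init
    vertex.init(manager);
    vertex.handle("event-after-init");
    System.out.println(manager.routed);
  }
}
```

In this sketch the early event is delivered only after `init` has run, so the manager is never dereferenced while null. The sketch is single-threaded, matching the single dispatcher thread seen in the stack trace; it says nothing about how the actual patch orders its state transitions.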
[jira] [Updated] (TEZ-2855) NPE while routing events
[ https://issues.apache.org/jira/browse/TEZ-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated TEZ-2855:

    Attachment: 2855log.gz

Logs.

> NPE while routing events
>
> Key: TEZ-2855
> URL: https://issues.apache.org/jira/browse/TEZ-2855
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.8.0-alpha
> Reporter: Siddharth Seth
> Priority: Critical
> Attachments: 2855log.gz