Re: [jira] [Commented] (TEZ-917) NPE when executing running via a custom edge

Siddharth Seth Mon, 10 Mar 2014 19:37:19 -0700

Minimal details - Vikram / Gunther should be able to provide more.
At the moment Hive is using this to implement Bucketed Map Joins, where
one side of the join does not need to be pre-bucketed.


A simple 2 table example:
Table 1 is pre-bucketed.
Table 2 is not - so it will be bucketed dynamically during execution.

Table 1 determines the number of tasks, and the distribution of work to
individual tasks. A single bucket may span multiple tasks. Depending on
the task distribution, buckets generated by Table 2 are routed to the
correct set of tasks (belonging to the appropriate bucket). Custom
Edge/VertexManagers are used since this isn't a standard routing pattern.
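
To make the routing concrete, here is a rough sketch of the idea only. It is
not the Hive CustomPartitionEdge code and does not use any Tez API; the class
name, the method names, and the hash-mod bucketing are all assumptions made
for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the Hive CustomPartitionEdge or any Tez API.
public class BucketRoutingSketch {
    // bucket id -> indices of the Table 1 tasks that read that bucket
    private final Map<Integer, List<Integer>> bucketToTasks = new HashMap<>();
    private final int numBuckets;

    public BucketRoutingSketch(int numBuckets) {
        this.numBuckets = numBuckets;
    }

    // Record that a task processes (part of) a pre-bucketed Table 1 bucket.
    // A bucket may be registered with more than one task.
    public void assignTask(int bucket, int taskIndex) {
        bucketToTasks.computeIfAbsent(bucket, b -> new ArrayList<>()).add(taskIndex);
    }

    // Bucket a Table 2 key at run time (plain hash-mod, an assumption).
    public int bucketFor(Object key) {
        return Math.floorMod(key.hashCode(), numBuckets);
    }

    // All tasks that must receive a Table 2 row with this key.
    public List<Integer> destinationTasks(Object key) {
        List<Integer> tasks = bucketToTasks.get(bucketFor(key));
        return tasks == null ? Collections.<Integer>emptyList() : tasks;
    }

    public static void main(String[] args) {
        BucketRoutingSketch router = new BucketRoutingSketch(2);
        router.assignTask(0, 0); // bucket 0 spans tasks 0 and 1
        router.assignTask(0, 1);
        router.assignTask(1, 2); // bucket 1 is handled by task 2
        System.out.println(router.destinationTasks(4)); // prints [0, 1]
        System.out.println(router.destinationTasks(7)); // prints [2]
    }
}
```

The point of the sketch is that a Table 2 row can have several destination
tasks when its bucket spans multiple tasks, which is why a standard
one-partition-to-one-task routing does not fit.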

Thanks
- Sid


On 3/7/14, 11:41 PM, "Rohini Palaniswamy" <[email protected]> wrote:

>Hi,
>   Could you guys tell us what is the hive team using custom edges for?
>
>Regards,
>Rohini
>
>---------- Forwarded message ----------
>From: Siddharth Seth (JIRA) <[email protected]>
>Date: Thu, Mar 6, 2014 at 10:24 AM
>Subject: [jira] [Commented] (TEZ-917) NPE when executing running via a
>custom edge
>To: [email protected]
>
>
>
>    [
>https://issues.apache.org/jira/browse/TEZ-917?page=com.atlassian.jira.plug
>in.system.issuetabpanels:comment-tabpanel&focusedCommentId=13922841#commen
>t-13922841]
>
>Siddharth Seth commented on TEZ-917:
>------------------------------------
>
>Scratch that. Looking at the trace, this is likely a race in the Hive
>custom edge plugin..
>
>> NPE when executing running via a custom edge
>> --------------------------------------------
>>
>>                 Key: TEZ-917
>>                 URL: https://issues.apache.org/jira/browse/TEZ-917
>>             Project: Apache Tez
>>          Issue Type: Bug
>>            Reporter: Siddharth Seth
>>
>> Reported by [~vikram.dixit]. Likely a race in event routing.
>> {code}
>> java.lang.NullPointerException
>>   at
>org.apache.hadoop.hive.ql.exec.tez.CustomPartitionEdge.getNumSourceTaskPhy
>sicalOutputs(CustomPartitionEdge.java:55)
>>   at org.apache.tez.dag.app.dag.impl.Edge.getSourceSpec(Edge.java:183)
>>   at
>org.apache.tez.dag.app.dag.impl.VertexImpl.getOutputSpecList(VertexImpl.ja
>va:2371)
>>   at
>org.apache.tez.dag.app.dag.impl.TaskAttemptImpl.createRemoteTaskSpec(TaskA
>ttemptImpl.java:518)
>>   at
>org.apache.tez.dag.app.dag.impl.TaskAttemptImpl$ScheduleTaskattemptTransit
>ion.transition(TaskAttemptImpl.java:1038)
>>   at
>org.apache.tez.dag.app.dag.impl.TaskAttemptImpl$ScheduleTaskattemptTransit
>ion.transition(TaskAttemptImpl.java:1027)
>>   at
>org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTrans
>ition(StateMachineFactory.java:362)
>>   at
>org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachine
>Factory.java:302)
>>   at
>org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFa
>ctory.java:46)
>>   at
>org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTr
>ansition(StateMachineFactory.java:448)
>>   at
>org.apache.tez.dag.app.dag.impl.TaskAttemptImpl.handle(TaskAttemptImpl.jav
>a:721)
>>   at
>org.apache.tez.dag.app.dag.impl.TaskAttemptImpl.handle(TaskAttemptImpl.jav
>a:105)
>>   at
>org.apache.tez.dag.app.DAGAppMaster$TaskAttemptEventDispatcher.handle(DAGA
>ppMaster.java:1432)
>>   at
>org.apache.tez.dag.app.DAGAppMaster$TaskAttemptEventDispatcher.handle(DAGA
>ppMaster.java:1417)
>>   at
>org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java
>:173)
>>   at
>org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:10
>6)
>>   at java.lang.Thread.run(Thread.java:695)
>> 2014-03-03 14:55:56,519 INFO [AsyncDispatcher event handler]
>org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye
>> {code}
>
>
>
>--
>This message was sent by Atlassian JIRA
>(v6.2#6252)
