[ 
https://issues.apache.org/jira/browse/TEZ-391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282355#comment-14282355
 ] 

Jeff Zhang commented on TEZ-391:
--------------------------------

Attached a patch for SharedEdge:
* Adds a new API in Edge to create a shared edge:
{code}
public Edge createSharedEdge(Vertex outputVertex) 
{code}
* Currently it only supports One-to-One and Broadcast. (ScatterGather requires 
the 2 downstream vertices to have the same parallelism, otherwise the shuffle 
breaks. I made some changes to get ScatterGather working, but it still needs 
more work, especially for reducer auto-parallelism.)
* Adds one example in tez-example to show the usage (SharedEdgeExample).
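
To make the intended semantics concrete, here is a minimal, self-contained sketch of what a shared edge means: one physical output from the upstream vertex, read by two downstream vertices. It is a toy model and not the Tez API or the code in the patch. The classes SharedEdgeSketch, Vertex, Edge and Movement, and the helper outputCopiesWritten, are hypothetical names introduced only for illustration; only the shape of createSharedEdge(Vertex outputVertex) is taken from the comment above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: these classes are NOT the Tez API.
class SharedEdgeSketch {

    enum Movement { ONE_TO_ONE, BROADCAST, SCATTER_GATHER }

    static final class Vertex {
        final String name;
        final int parallelism;
        Vertex(String name, int parallelism) {
            this.name = name;
            this.parallelism = parallelism;
        }
    }

    static final class Edge {
        final Vertex input;          // upstream (producer) vertex
        final Vertex output;         // downstream (consumer) vertex
        final Movement movement;
        final Object physicalOutput; // token for the data written by the producer

        private Edge(Vertex input, Vertex output, Movement movement, Object physicalOutput) {
            this.input = input;
            this.output = output;
            this.movement = movement;
            this.physicalOutput = physicalOutput;
        }

        // An ordinary edge: the producer writes a fresh output for it.
        static Edge create(Vertex input, Vertex output, Movement movement) {
            return new Edge(input, output, movement, new Object());
        }

        // Same shape as the proposed API: a second edge that reuses this
        // edge's physical output instead of writing it again.
        Edge createSharedEdge(Vertex outputVertex) {
            if (movement == Movement.SCATTER_GATHER) {
                // Assumption from the comment: sharing a partitioned output is
                // not supported yet, since it depends on downstream parallelism.
                throw new UnsupportedOperationException(
                    "shared edge supports only ONE_TO_ONE and BROADCAST in this sketch");
            }
            return new Edge(input, outputVertex, movement, physicalOutput);
        }
    }

    // Counts how many distinct physical outputs the given edges require.
    static int outputCopiesWritten(List<Edge> edges) {
        Set<Object> outputs = new HashSet<>();
        for (Edge e : edges) {
            outputs.add(e.physicalOutput);
        }
        return outputs.size();
    }

    public static void main(String[] args) {
        Vertex a = new Vertex("A", 2);
        Vertex b = new Vertex("B", 2);
        Vertex c = new Vertex("C", 2);

        Edge ab = Edge.create(a, b, Movement.BROADCAST);
        Edge acSeparate = Edge.create(a, c, Movement.BROADCAST);
        System.out.println("separate edges, outputs written: "
            + outputCopiesWritten(Arrays.asList(ab, acSeparate))); // 2

        Edge acShared = ab.createSharedEdge(c);
        System.out.println("shared edge, outputs written: "
            + outputCopiesWritten(Arrays.asList(ab, acShared)));   // 1
    }
}
```

The sketch rejects ScatterGather outright, which matches the current limitation described above; a real implementation would presumably need to check that both downstream vertices end up with the same parallelism instead.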

Although this patch works, after more thought I think using VertexGroup may 
be more natural and easier to understand: we just need to make the 2 downstream 
vertices a vertex group and connect the upstream vertex to this vertex 
group. VertexGroup is already used for shared output, so it is natural to make 
it support shared input as well. I will attach a new patch that uses VertexGroup 
later.
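
As a rough illustration of the VertexGroup idea, the following self-contained sketch groups the two downstream vertices and connects the upstream vertex to the group with a single edge. Again this is a toy model, not the Tez API and not the future patch: VertexGroupSketch, Vertex, VertexGroup and GroupOutputEdge are hypothetical names chosen only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: none of these types are the Tez API.
class VertexGroupSketch {

    static final class Vertex {
        final String name;
        Vertex(String name) { this.name = name; }
    }

    // A named set of downstream vertices that should read the same input.
    static final class VertexGroup {
        final String name;
        final List<Vertex> members;
        VertexGroup(String name, Vertex... members) {
            this.name = name;
            this.members = Collections.unmodifiableList(
                new ArrayList<>(Arrays.asList(members)));
        }
    }

    // One logical edge from a producer to every member of a group, so the
    // producer's output only has to be described (and written) once.
    static final class GroupOutputEdge {
        final Vertex producer;
        final VertexGroup consumers;
        GroupOutputEdge(Vertex producer, VertexGroup consumers) {
            this.producer = producer;
            this.consumers = consumers;
        }

        // Names of all vertices that read the producer's single output.
        List<String> consumerNames() {
            List<String> names = new ArrayList<>();
            for (Vertex v : consumers.members) {
                names.add(v.name);
            }
            return names;
        }
    }

    public static void main(String[] args) {
        Vertex upstream = new Vertex("A");
        VertexGroup group = new VertexGroup("B_and_C", new Vertex("B"), new Vertex("C"));
        GroupOutputEdge edge = new GroupOutputEdge(upstream, group);
        System.out.println(edge.producer.name + " -> " + edge.consumerNames()); // A -> [B, C]
    }
}
```

Compared with createSharedEdge, the grouping is declared once up front, so there is a single edge object to configure rather than a primary edge plus edges derived from it.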




> SharedEdge - Support for passing same output from a vertex as input to two 
> different vertices
> ---------------------------------------------------------------------------------------------
>
>                 Key: TEZ-391
>                 URL: https://issues.apache.org/jira/browse/TEZ-391
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Rohini Palaniswamy
>            Assignee: Jeff Zhang
>         Attachments: TEZ-391-WIP-1.patch
>
>
>   We need this for a lot of use cases, for example when multi-query is turned 
> off and when optimizing unions. Currently those are BROADCAST or ONE-ONE edges 
> and we write the output multiple times.



