[ 
https://issues.apache.org/jira/browse/PIG-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14484292#comment-14484292
 ] 

Daniel Dai commented on PIG-4495:
---------------------------------

+1

> Better multi-query planning in case of multiple edges
> -----------------------------------------------------
>
>                 Key: PIG-4495
>                 URL: https://issues.apache.org/jira/browse/PIG-4495
>             Project: Pig
>          Issue Type: Sub-task
>          Components: tez
>    Affects Versions: 0.14.0
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.15.0
>
>         Attachments: PIG-4495-1.patch, PIG-4495-2.patch
>
>
> Details in 
> https://issues.apache.org/jira/browse/TEZ-1190?focusedCommentId=14393033&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14393033
> A common pattern is to split the data, apply foreach transformations or
> filters, union the results, and then run an operation such as group by or
> join with other data. This creates multiple edges from the same Split, so we
> do not merge them. Instead, we write the data out to an additional dummy
> vertex to avoid multiple edges, which adds overhead and hurts performance.
> Vertex groups accept multiple edges from the same vertex. So if the multiple
> edges end up in a vertex group (and not in a vertex, as in a self join), we
> can avoid the dummy vertex.
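
The rule described in the issue can be sketched as follows. This is a minimal
illustration in Python, not Pig's actual planner code: the function name, the
edge-list representation of the plan, and the vertex names are all hypothetical
and were chosen only to show when a dummy vertex would still be needed.

```python
from collections import Counter


def destinations_needing_dummy(edges, vertex_groups, split):
    """Return destinations that receive more than one edge from `split`
    and are plain vertices rather than vertex groups.

    edges         -- list of (source, destination) pairs (hypothetical model)
    vertex_groups -- set of destination names that are vertex groups
    split         -- name of the Split vertex to inspect
    """
    # Count how many edges leave `split` for each destination.
    counts = Counter(dst for src, dst in edges if src == split)
    # Only repeated edges into a plain vertex still need the dummy vertex.
    return sorted(
        dst for dst, n in counts.items()
        if n > 1 and dst not in vertex_groups
    )


# Split -> union (a vertex group) twice, then group by: no dummy vertex needed.
union_plan = [("split", "union"), ("split", "union"), ("union", "group_by")]

# Self join: both edges from the Split end in the same plain vertex.
self_join_plan = [("split", "join"), ("split", "join")]

print(destinations_needing_dummy(union_plan, {"union"}, "split"))
print(destinations_needing_dummy(self_join_plan, set(), "split"))
```

In this sketch the union case yields an empty list, because the repeated edges
end in a vertex group, while the self-join case still reports the join vertex.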



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
