[ 
https://issues.apache.org/jira/browse/PIG-3849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated PIG-3849:
----------------------------
    Description: 
  This can be done in one vertex with multiple inputs instead of having an 
extra vertex to do the join. Currently the plan is Vertex 1 (load relation 1) 
-> Vertex 2 (group by) -> Vertex 4 (join) <- Vertex 3 (load relation 2). This 
could be changed to Vertex 1 (load relation 1) -> Vertex 2 (group by and join) 
<- Vertex 3 (load relation 2).

This could also be generalized into a broader query correlation 
optimization.
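
The rewrite described above can be sketched as a small graph 
transformation. This is only an illustration, not Pig or Tez code: the 
Vertex class, the "key" field, and fuse_group_by_into_join are hypothetical 
names, and the sketch assumes a vertex can be fused only when it does 
nothing but the group by and feeds nothing but the join.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str]  # (source vertex, destination vertex)


@dataclass
class Vertex:
    ops: List[str]             # operators run inside this vertex, in order
    key: Optional[str] = None  # hypothetical: key the vertex's input is partitioned on


def fuse_group_by_into_join(vertices: Dict[str, Vertex], edges: List[Edge]):
    """Fold a join vertex into the upstream group-by vertex when both use the same key.

    Returns a new (vertices, edges) pair; the inputs are left unmodified.
    """
    vertices = {name: Vertex(list(v.ops), v.key) for name, v in vertices.items()}
    edges = list(edges)
    for src, dst in list(edges):
        group, join = vertices.get(src), vertices.get(dst)
        if group is None or join is None:
            continue  # one end was already merged away
        feeds_only_join = sum(1 for s, _ in edges if s == src) == 1
        same_key = group.key is not None and group.key == join.key
        if not (group.ops == ["group by"] and join.ops == ["join"]
                and same_key and feeds_only_join):
            continue
        group.ops = ["group by", "join"]
        rewired = []
        for s, d in edges:
            if (s, d) == (src, dst):
                continue  # the group by -> join edge disappears
            # Other inputs and outputs of the join now attach to the fused vertex.
            rewired.append((src if s == dst else s, src if d == dst else d))
        edges = rewired
        del vertices[dst]
    return vertices, edges


# The plan from the description, before the optimization.
plan_vertices = {
    "Vertex 1": Vertex(["load relation 1"]),
    "Vertex 2": Vertex(["group by"], key="k"),
    "Vertex 3": Vertex(["load relation 2"]),
    "Vertex 4": Vertex(["join"], key="k"),
}
plan_edges = [
    ("Vertex 1", "Vertex 2"),
    ("Vertex 2", "Vertex 4"),
    ("Vertex 3", "Vertex 4"),
]

fused_vertices, fused_edges = fuse_group_by_into_join(plan_vertices, plan_edges)
for name, vertex in fused_vertices.items():
    print(name, vertex.ops)
print(fused_edges)
```

In this sketch Vertex 4 is removed and Vertex 3 feeds Vertex 2 directly, 
which matches the three-vertex plan proposed in the description.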


  was:  This can be done in one vertex with multiple inputs instead of having 
an extra vertex to do the join. Currently the plan is Vertex 1 (load 
relation 1) -> Vertex 2 (group by) -> Vertex 4 (join) <- Vertex 3 (load 
relation 2). This could be changed to Vertex 1 (load relation 1) -> Vertex 2 
(group by and join) <- Vertex 3 (load relation 2).


> Optimize group by followed by join on the same key
> --------------------------------------------------
>
>                 Key: PIG-3849
>                 URL: https://issues.apache.org/jira/browse/PIG-3849
>             Project: Pig
>          Issue Type: Sub-task
>          Components: tez
>            Reporter: Rohini Palaniswamy
>
>   This can be done in one vertex with multiple inputs instead of having an 
> extra vertex to do the join. Currently the plan is Vertex 1 (load 
> relation 1) -> Vertex 2 (group by) -> Vertex 4 (join) <- Vertex 3 (load 
> relation 2). This could be changed to Vertex 1 (load relation 1) -> 
> Vertex 2 (group by and join) <- Vertex 3 (load relation 2).
> This could also be generalized into a broader query correlation 
> optimization.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
