[ https://issues.apache.org/jira/browse/PIG-3849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeff Zhang updated PIG-3849:
----------------------------
Description:
This can be done in one vertex with multiple inputs instead of using an
extra vertex for the join. Currently the plan is: Vertex 1 (load relation 1)
-> Vertex 2 (group by) -> Vertex 4 (join) <- Vertex 3 (load relation 2).
This could be changed to: Vertex 1 (load relation 1) -> Vertex 2 (group by
and join) <- Vertex 3 (load relation 2).
This could also be extended into a more general query correlation
optimization.
was: This can be done in one vertex with multiple inputs instead of using
an extra vertex for the join. Currently the plan is: Vertex 1 (load
relation 1) -> Vertex 2 (group by) -> Vertex 4 (join) <- Vertex 3 (load
relation 2). This could be changed to: Vertex 1 (load relation 1) ->
Vertex 2 (group by and join) <- Vertex 3 (load relation 2).
> Optimize group by followed by join on the same key
> --------------------------------------------------
>
> Key: PIG-3849
> URL: https://issues.apache.org/jira/browse/PIG-3849
> Project: Pig
> Issue Type: Sub-task
> Components: tez
> Reporter: Rohini Palaniswamy
>
> This can be done in one vertex with multiple inputs instead of using an
> extra vertex for the join. Currently the plan is: Vertex 1 (load
> relation 1) -> Vertex 2 (group by) -> Vertex 4 (join) <- Vertex 3 (load
> relation 2). This could be changed to: Vertex 1 (load relation 1) ->
> Vertex 2 (group by and join) <- Vertex 3 (load relation 2).
> This could also be extended into a more general query correlation
> optimization.
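
The proposed plan change can be illustrated with a small, self-contained
sketch. This is not Pig's Tez compiler or the Tez DAG API: the plan is
modelled as plain dictionaries and edge tuples, and the names VERTICES,
EDGES, fuse_group_by_join and the key "k" are all hypothetical, chosen only
to mirror the vertex numbering in the description. It assumes every vertex
has at least one operator and that a join vertex can be folded into the
group-by vertex whenever both shuffle on the same key.

```python
# Hypothetical model of the current plan from the description:
# Vertex 1 (load relation 1) -> Vertex 2 (group by) -> Vertex 4 (join)
#                                Vertex 3 (load relation 2) -> Vertex 4
VERTICES = {
    "v1": {"ops": ["load relation 1"], "key": None},
    "v2": {"ops": ["group by"], "key": "k"},
    "v3": {"ops": ["load relation 2"], "key": None},
    "v4": {"ops": ["join"], "key": "k"},
}
EDGES = {("v1", "v2"), ("v2", "v4"), ("v3", "v4")}


def fuse_group_by_join(vertices, edges):
    """Return a new (vertices, edges) plan in which a join vertex is folded
    into the preceding group-by vertex when both use the same key.

    The inputs are not modified. Assumes every vertex has a non-empty
    "ops" list.
    """
    # Copy the plan so the caller's structures stay untouched.
    vertices = {name: dict(v, ops=list(v["ops"])) for name, v in vertices.items()}
    edges = set(edges)

    # sorted() takes a snapshot, so rebinding `edges` inside the loop is safe.
    for src, dst in sorted(edges):
        if src not in vertices or dst not in vertices:
            continue  # one end was already fused away
        group_v, join_v = vertices[src], vertices[dst]
        same_key = group_v["key"] is not None and group_v["key"] == join_v["key"]
        if group_v["ops"][-1] == "group by" and join_v["ops"][0] == "join" and same_key:
            # Run the join inside the group-by vertex.
            group_v["ops"].extend(join_v["ops"])
            # Redirect every edge that touched the join vertex to the fused one.
            edges = {
                (src if a == dst else a, src if b == dst else b)
                for a, b in edges
            }
            edges.discard((src, src))  # the old group-by -> join edge
            del vertices[dst]
    return vertices, edges
```

Applied to the plan above, the sketch leaves three vertices, with v2 running
both "group by" and "join" and taking v1 and v3 as its inputs. If the join
used a different key than the group by, the plan would be returned unchanged.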