[ https://issues.apache.org/jira/browse/PIG-3849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeff Zhang updated PIG-3849:
----------------------------
    Description:
This can be done in one vertex with multiple inputs instead of having an extra vertex to do the join. i.e Currently Vertex 1 (load relation1) -> Vertex 2 (group by) -> Vertex 4 (join) <- Vertex 3 (load relation 2). This could be changed to Vertex 1 (load relation1) -> Vertex 2 (group by and join) <- Vertex 3 (load relation 2)

And this could be extended into a more general way to do query correlation optimization.

  was:
This can be done in one vertex with multiple inputs instead of having an extra vertex to do the join. i.e Currently Vertex 1 (load relation1) -> Vertex 2 (group by) -> Vertex 4 (join) <- Vertex 3 (load relation 2). This could be changed to Vertex 1 (load relation1) -> Vertex 2 (group by and join) <- Vertex 3 (load relation 2)

> Optimize group by followed by join on the same key
> --------------------------------------------------
>
>                 Key: PIG-3849
>                 URL: https://issues.apache.org/jira/browse/PIG-3849
>             Project: Pig
>          Issue Type: Sub-task
>          Components: tez
>            Reporter: Rohini Palaniswamy
>
> This can be done in one vertex with multiple inputs instead of having an
> extra vertex to do the join. i.e Currently Vertex 1 (load relation1) ->
> Vertex 2 (group by) -> Vertex 4 (join) <- Vertex 3 (load relation 2). This
> could be changed to Vertex 1 (load relation1) -> Vertex 2 (group by and join)
> <- Vertex 3 (load relation 2)
> And this could be extended into a more general way to do query correlation
> optimization.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
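
For illustration only, the vertex rewrite described in the issue can be sketched as a small graph transformation. This is a toy model in Python, not Pig or Tez code: the dictionary layout, the "key" field, and the function name merge_group_into_join are all hypothetical, and the sketch assumes a join vertex may be folded into its group-by input only when both operate on the same key.

```python
# Toy model of the plan as a DAG. All names here are hypothetical and
# only mirror the wording of the issue; this is not the Pig or Tez API.

def merge_group_into_join(vertices, edges):
    """Fold a join vertex into its group-by input when both use the same key.

    vertices: dict mapping name -> {"ops": [str, ...], "key": str or None}
    edges:    set of (source, target) name pairs
    Returns a new (vertices, edges) pair; the inputs are left unmodified.
    """
    vertices = {n: {"ops": list(v["ops"]), "key": v["key"]}
                for n, v in vertices.items()}
    for src, dst in sorted(edges):
        s, d = vertices[src], vertices[dst]
        same_key = s["key"] is not None and s["key"] == d["key"]
        if s["ops"] == ["group by"] and d["ops"] == ["join"] and same_key:
            # The group-by vertex now performs the join as well.
            s["ops"] = ["group by", "join"]
            # Drop the group-by -> join edge and redirect every other
            # edge that touched the join vertex to the merged vertex.
            new_edges = set()
            for a, b in edges:
                if (a, b) == (src, dst):
                    continue
                new_edges.add((src if a == dst else a,
                               src if b == dst else b))
            del vertices[dst]
            return vertices, new_edges
    # No eligible pair found: return the plan unchanged.
    return vertices, set(edges)


# The "currently" plan from the issue text; the key name "k" is assumed.
before_vertices = {
    "Vertex 1": {"ops": ["load relation1"], "key": None},
    "Vertex 2": {"ops": ["group by"], "key": "k"},
    "Vertex 3": {"ops": ["load relation 2"], "key": None},
    "Vertex 4": {"ops": ["join"], "key": "k"},
}
before_edges = {
    ("Vertex 1", "Vertex 2"),
    ("Vertex 2", "Vertex 4"),
    ("Vertex 3", "Vertex 4"),
}

after_vertices, after_edges = merge_group_into_join(before_vertices, before_edges)
```

Under these assumptions, after_edges contains only Vertex 1 -> Vertex 2 and Vertex 3 -> Vertex 2, which matches the "could be changed to" shape in the description. If the group by and the join used different keys, the sketch leaves the plan unchanged.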