Suppose we have two simple tables:

A
    id int
    value string

B
    id

When Hive translates the following query:

    select max(A.value), A.id from A join B on A.id = B.id group by A.id;

it launches two stages, one for the join and one for the group by. My understanding is that if the join key set is a subset of the group-by key set, both can be done in the same map-reduce job: rows that share a group-by key also share a join key, so the shuffle for the join already brings every row of a group to the same reducer. If that is correct in theory, could it be a feature in Hive?

Chen
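
P.S. To make the idea concrete, here is a minimal sketch in Python. It is not Hive code and does not reflect how Hive plans queries; the function name, the in-memory "shuffle", and the sample rows are all made up for illustration. It assumes an inner join on id and a max aggregate, as in the query above.

```python
from collections import defaultdict


def join_and_group_in_one_reduce(table_a, table_b):
    """Simulate a single map-reduce job that joins A and B on id
    and computes max(A.value) per id in the same reduce step.

    table_a: iterable of (id, value) rows (hypothetical layout for A)
    table_b: iterable of (id,) rows (hypothetical layout for B)
    """
    # Map + shuffle: route rows from both tables to a "reducer" keyed by id.
    # Because the group-by key is the same as the join key, one shuffle is enough.
    shuffled = defaultdict(lambda: {"A": [], "B": []})
    for row_id, value in table_a:
        shuffled[row_id]["A"].append(value)
    for (row_id,) in table_b:
        shuffled[row_id]["B"].append(row_id)

    # Reduce: each key sees all of its A rows and B rows together.
    # An inner join emits a row only if both sides are non-empty, and
    # max() is unaffected by the duplicates the join would produce, so
    # the aggregate can be computed directly from the A values.
    result = {}
    for row_id, rows in shuffled.items():
        if rows["A"] and rows["B"]:
            result[row_id] = max(rows["A"])
    return result


if __name__ == "__main__":
    a_rows = [(1, "x"), (1, "z"), (2, "y"), (3, "q")]
    b_rows = [(1,), (1,), (2,)]
    print(join_and_group_in_one_reduce(a_rows, b_rows))
```

Aggregates such as count or sum would need to account for the number of matching B rows per key, but they could still be computed in the same reduce step.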