[ https://issues.apache.org/jira/browse/HIVE-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596690#comment-13596690 ]
Lianhui Wang commented on HIVE-4137:
------------------------------------

In addition, for bucketed/sorted tables, a query with a single group by operator needs only the map-side group by operator; it does not need the reduce-side group by operator.
Example: select key, aggr() from T1 group by key;
The current plan is
TS-SEL-GBY-RS-GBY-SEL-FS
but it can be changed to the following plan:
TS-SEL-GBY-SEL-FS

> optimize group by followed by joins for bucketed/sorted tables
> --------------------------------------------------------------
>
>                 Key: HIVE-4137
>                 URL: https://issues.apache.org/jira/browse/HIVE-4137
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>
> Consider the following scenario:
> create table T1 (...) clustered by (key) sorted by (key) into 2 buckets;
> create table T2 (...) clustered by (key) sorted by (key) into 2 buckets;
> create table T3 (...) clustered by (key) sorted by (key) into 2 buckets;
> SET hive.enforce.sorting=true;
> SET hive.enforce.bucketing=true;
> insert overwrite table T3
> select ..
> from
> (select key, aggr() from T1 group by key) s1
> full outer join
> (select key, aggr() from T2 group by key) s2
> on s1.key=s2.key;
> Ideally, this query can be performed in a single map-only job.
> Group By -> SortMerge Join.
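
As a rough illustration of why the issue expects a single map-only job, the sketch below models each bucket as a key-sorted list in Python. It is not Hive code: the names map_side_group_by, full_outer_merge_join, and run_map_only are hypothetical, aggr() is assumed to be a sum, the sample rows are made up, and buckets are assumed to be assigned by key % 2.

```python
from itertools import groupby
from operator import itemgetter


def map_side_group_by(sorted_rows):
    """Aggregate (key, value) rows that are already sorted by key.

    Equal keys are adjacent, so each group can be closed as soon as the key
    changes. No shuffle (RS) or reduce-side group by is needed.
    Assumption: aggr() is modeled as sum().
    """
    for key, rows in groupby(sorted_rows, key=itemgetter(0)):
        yield key, sum(value for _, value in rows)


def full_outer_merge_join(left, right):
    """Full outer sort-merge join of two key-sorted (key, agg) streams.

    Keys are unique within each stream because both sides were grouped.
    """
    left, right = iter(left), iter(right)
    l, r = next(left, None), next(right, None)
    while l is not None or r is not None:
        if r is None or (l is not None and l[0] < r[0]):
            yield l[0], l[1], None          # key only on the left side
            l = next(left, None)
        elif l is None or r[0] < l[0]:
            yield r[0], None, r[1]          # key only on the right side
            r = next(right, None)
        else:
            yield l[0], l[1], r[1]          # key present on both sides
            l, r = next(left, None), next(right, None)


def run_map_only(t1_buckets, t2_buckets):
    """Process bucket i of T1 with bucket i of T2, one 'mapper' per pair."""
    output = []
    for b1, b2 in zip(t1_buckets, t2_buckets):
        s1 = map_side_group_by(b1)
        s2 = map_side_group_by(b2)
        output.append(list(full_outer_merge_join(s1, s2)))
    return output


# Hypothetical data: 2 buckets per table (key % 2), each sorted by key.
t1_buckets = [[(2, 10), (2, 5), (4, 1)], [(1, 7), (3, 2), (3, 3)]]
t2_buckets = [[(2, 1), (6, 4)], [(3, 9), (3, 1), (5, 8)]]

result = run_map_only(t1_buckets, t2_buckets)
```

In this model each bucket's output stays sorted by key and contains only keys that belong to that bucket, so it could be written directly as the corresponding bucket of T3 without a reduce phase.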