[ https://issues.apache.org/jira/browse/HIVE-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13607830#comment-13607830 ]
Namit Jain commented on HIVE-4206: ---------------------------------- set hive.optimize.bucketmapjoin = true; set hive.optimize.bucketmapjoin.sortedmerge = true; set hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat; set hive.enforce.bucketing=true; set hive.enforce.sorting=true; set hive.exec.reducers.max = 1; set hive.merge.mapfiles=false; set hive.merge.mapredfiles=false; -- Create bucketed and sorted tables CREATE TABLE test_table1 (key INT, value STRING) CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS; CREATE TABLE test_table2 (key INT, value STRING) CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS; CREATE TABLE test_table3 (key INT, value STRING) CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS; CREATE TABLE test_table4 (key INT, value STRING) CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS; CREATE TABLE test_table5 (key INT, value STRING) CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS; CREATE TABLE test_table6 (key INT, value STRING) CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS; CREATE TABLE test_table7 (key INT, value STRING) CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS; FROM src INSERT OVERWRITE TABLE test_table1 SELECT *; FROM src INSERT OVERWRITE TABLE test_table2 SELECT *; FROM src INSERT OVERWRITE TABLE test_table3 SELECT *; FROM src INSERT OVERWRITE TABLE test_table4 SELECT *; FROM src INSERT OVERWRITE TABLE test_table5 SELECT *; FROM src INSERT OVERWRITE TABLE test_table6 SELECT *; FROM src INSERT OVERWRITE TABLE test_table7 SELECT *; -- Mapjoin followed by a aggregation should be performed in a single MR job EXPLAIN SELECT /*+mapjoin(b)*/ count(*) FROM test_table1 a JOIN test_table2 b ON a.key = b.key JOIN test_table3 c ON a.key = c.key JOIN test_table4 d ON a.key = d.key JOIN test_table5 e ON a.key = e.key JOIN test_table6 f ON a.key = f.key JOIN test_table6 g ON a.key = g.key; The above query does not use sort-merge join. > Sort merge join does not work for more than 7 inputs > ---------------------------------------------------- > > Key: HIVE-4206 > URL: https://issues.apache.org/jira/browse/HIVE-4206 > Project: Hive > Issue Type: Improvement > Components: Query Processor > Reporter: Namit Jain > Assignee: Namit Jain > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira