[ https://issues.apache.org/jira/browse/HIVE-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain updated HIVE-4096:
-----------------------------
    Status: Patch Available  (was: Open)

Tests passed

> problem in hive.map.groupby.sorted with distincts
> -------------------------------------------------
>
>                 Key: HIVE-4096
>                 URL: https://issues.apache.org/jira/browse/HIVE-4096
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>         Attachments: hive.4096.1.patch
>
>
> set hive.enforce.bucketing = true;
> set hive.enforce.sorting = true;
> set hive.exec.reducers.max = 10;
> set hive.map.groupby.sorted=true;
> CREATE TABLE T1(key STRING, val STRING) PARTITIONED BY (ds string)
> CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH '../data/files/T1.txt' INTO TABLE T1 PARTITION (ds='1');
> -- perform an insert to make sure there are 2 files
> INSERT OVERWRITE TABLE T1 PARTITION (ds='1') select key, val from T1 where ds = '1';
> CREATE TABLE outputTbl1(cnt INT);
> -- The plan should be converted to a map-side group by, since the
> -- sorting columns and grouping columns match, and all the bucketing columns
> -- are part of sorting columns
> EXPLAIN
> select count(distinct key) from T1;
> select count(distinct key) from T1;
> explain
> INSERT OVERWRITE TABLE outputTbl1
> select count(distinct key) from T1;
> INSERT OVERWRITE TABLE outputTbl1
> select count(distinct key) from T1;
> SELECT * FROM outputTbl1;
> DROP TABLE T1;
> The above query gives wrong results
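
One possible way to cross-check the reported result, offered only as a sketch: rerun the same aggregate with hive.map.groupby.sorted turned off and compare the two counts. This assumes that disabling the setting makes Hive fall back to its regular group-by plan, and that the statements are run before the DROP TABLE T1 step in the script above. The subquery alias t is illustrative and not part of the original report.

```sql
-- Assumption: with the optimization off, the query takes the regular plan.
set hive.map.groupby.sorted=false;

-- Same aggregate as in the report; compare against the value
-- returned when hive.map.groupby.sorted=true.
select count(distinct key) from T1;

-- Equivalent formulation without DISTINCT, as a second reference value.
select count(*) from (select key from T1 group by key) t;
```

If the counts differ between the two settings, the difference points at the sorted map-side group-by path rather than at the data.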