Fengdong Yu created HIVE-4891:
---------------------------------

             Summary: Distinct includes duplicate records
                 Key: HIVE-4891
                 URL: https://issues.apache.org/jira/browse/HIVE-4891
             Project: Hive
          Issue Type: Bug
          Components: HiveServer2
    Affects Versions: 0.10.0
            Reporter: Fengdong Yu

I have two partitions: one is stored as a sequence file and the other as an RCFile, but they contain the same data (only the file format differs). I run the following SQL:

{code}
select distinct uid from pv where (dt ='20130718' or dt ='20130718_1') and cur_url like '%cq.aa.com%';
{code}

dt ='20130718' is a sequence file (the default input format, specified when the table was created).
dt ='20130718_1' is an RCFile:

ALTER TABLE test PARTITION(dt='20130718_1') SET FILEFORMAT RCFILE;

The result contains duplicate records. If both partitions use the same input format, there are no duplicate records.