[jira] [Commented] (HIVE-15685) count(distinct) generates different result than expected
[ https://issues.apache.org/jira/browse/HIVE-15685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15833329#comment-15833329 ]

Lefty Leverenz commented on HIVE-15685:
---------------------------------------

That was quick! Thanks.

> count(distinct) generates different result than expected
> ---------------------------------------------------------
>
>                 Key: HIVE-15685
>                 URL: https://issues.apache.org/jira/browse/HIVE-15685
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>             Fix For: 2.2.0
>
>         Attachments: HIVE-15685.01.patch
>
>
> Following query with count(distinct) generates different result than expected on hive master:
> {noformat}
> select count(distinct ss_ticket_number), count(distinct ss_sold_date_sk) from store_sales;
> {noformat}
> Expected output generated using postgres:
> {noformat}
> select count(distinct ss_ticket_number), count(distinct ss_sold_date_sk) from store_sales;
>  count | count
> -------+-------
>     24 |  1823
> (1 row)
> {noformat}
> Actual output
> {noformat}
> 241824
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-15685) count(distinct) generates different result than expected
[ https://issues.apache.org/jira/browse/HIVE-15685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15833328#comment-15833328 ]

Pengcheng Xiong commented on HIVE-15685:
----------------------------------------

Pushed to master. Thanks [~ashutoshc] for the review.
[jira] [Commented] (HIVE-15685) count(distinct) generates different result than expected
[ https://issues.apache.org/jira/browse/HIVE-15685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15833326#comment-15833326 ]

Lefty Leverenz commented on HIVE-15685:
---------------------------------------

[~pxiong] please update the status on this issue, now that you've committed it.
[jira] [Commented] (HIVE-15685) count(distinct) generates different result than expected
[ https://issues.apache.org/jira/browse/HIVE-15685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832987#comment-15832987 ]

Hive QA commented on HIVE-15685:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12848637/HIVE-15685.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10961 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159)
org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[cascade_dbdrop] (batchId=226)
org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[generatehfiles_require_family_path] (batchId=226)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=136)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[limit_pushdown3] (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=140)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver (batchId=119)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3097/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3097/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3097/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12848637 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-15685) count(distinct) generates different result than expected
[ https://issues.apache.org/jira/browse/HIVE-15685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832613#comment-15832613 ]

Ashutosh Chauhan commented on HIVE-15685:
-----------------------------------------

+1
[jira] [Commented] (HIVE-15685) count(distinct) generates different result than expected
[ https://issues.apache.org/jira/browse/HIVE-15685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15832506#comment-15832506 ]

Pengcheng Xiong commented on HIVE-15685:
----------------------------------------

[~ashutoshc], could you take a look? Thanks.