[ https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593595#comment-14593595 ]
Hive QA commented on HIVE-10996: -------------------------------- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12740656/HIVE-10996.05.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9011 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_having org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4321/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4321/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4321/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12740656 - PreCommit-HIVE-TRUNK-Build > Aggregation / Projection over Multi-Join Inner Query producing incorrect > results > -------------------------------------------------------------------------------- > > Key: HIVE-10996 > URL: https://issues.apache.org/jira/browse/HIVE-10996 > Project: Hive > Issue Type: Bug > Components: Hive > Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0 > Reporter: Gautam Kowshik > Assignee: Jesus Camacho Rodriguez > Priority: Critical > Attachments: HIVE-10996.01.patch, HIVE-10996.02.patch, > HIVE-10996.03.patch, HIVE-10996.04.patch, HIVE-10996.05.patch, > HIVE-10996.patch, explain_q1.txt, explain_q2.txt > > > We see the following problem on 1.1.0 and 1.2.0 but not 0.13 which seems like > a regression. > The following query (Q1) produces no results: > {code} > select s > from ( > select last.*, action.st2, action.n > from ( > select purchase.s, purchase.timestamp, max (mevt.timestamp) as > last_stage_timestamp > from (select * from purchase_history) purchase > join (select * from cart_history) mevt > on purchase.s = mevt.s > where purchase.timestamp > mevt.timestamp > group by purchase.s, purchase.timestamp > ) last > join (select * from events) action > on last.s = action.s and last.last_stage_timestamp = action.timestamp > ) list; > {code} > While this one (Q2) does produce results : > {code} > select * > from ( > select last.*, action.st2, action.n > from ( > select purchase.s, purchase.timestamp, max (mevt.timestamp) as > last_stage_timestamp > from (select * from purchase_history) purchase > join (select * from cart_history) mevt > on purchase.s = mevt.s > where purchase.timestamp > mevt.timestamp > group by purchase.s, purchase.timestamp > ) last > join (select * from events) action > on last.s = action.s and last.last_stage_timestamp = action.timestamp > ) list; > 1 21 20 Bob 1234 > 1 31 30 Bob 1234 > 3 51 50 Jeff 1234 > {code} > The setup to test this is: > {code} > create table purchase_history (s string, product string, price double, > timestamp int); > insert into purchase_history values ('1', 'Belt', 20.00, 21); > insert into purchase_history values ('1', 'Socks', 3.50, 31); > insert into purchase_history values ('3', 'Belt', 20.00, 51); > insert into purchase_history values ('4', 'Shirt', 15.50, 59); > create table cart_history (s string, cart_id int, timestamp int); > insert into cart_history values ('1', 1, 10); > insert into cart_history values ('1', 2, 20); > insert into cart_history values ('1', 3, 30); > insert into cart_history values ('1', 4, 40); > insert into cart_history values ('3', 5, 50); > insert into cart_history values ('4', 6, 60); > create table events (s string, st2 string, n int, timestamp int); > insert into events values ('1', 'Bob', 1234, 20); > insert into events values ('1', 'Bob', 1234, 30); > insert into events values ('1', 'Bob', 1234, 25); > insert into events values ('2', 'Sam', 1234, 30); > insert into events values ('3', 'Jeff', 1234, 50); > insert into events values ('4', 'Ted', 1234, 60); > {code} > I realize select * and select s are not all that interesting in this context > but what lead us to this issue was select count(distinct s) was not returning > results. The above queries are the simplified queries that produce the issue. > I will note that if I convert the inner join to a table and select from that > the issue does not appear. > Update: Found that turning off hive.optimize.remove.identity.project fixes > this issue. This optimization was introduced in > https://issues.apache.org/jira/browse/HIVE-8435 -- This message was sent by Atlassian JIRA (v6.3.4#6332)