Re: Review Request 29878: Bug with max() together with rank() and grouping sets

Navis Ryu Sun, 18 Jan 2015 17:11:37 -0800

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29878/
-----------------------------------------------------------


(Updated Jan. 19, 2015, 1:10 a.m.)


Review request for hive.


Changes
-------

Addressed comments


Bugs: HIVE-9347
    https://issues.apache.org/jira/browse/HIVE-9347


Repository: hive-git


Description
-------

The query below returns incorrect results on Hive 0.13.1, but it worked
correctly on Hive 0.11.

I have the following table:
CREATE  TABLE `t`(
  `category` int, 
  `live` int, 
  `comments` int)

with the following data:
hive> select * from t;
OK
3       0       2
2       0       2
8       0       2

The query:
hive> select category, max(live) live, max(comments) comments,
  rank() OVER (PARTITION BY category ORDER BY comments) rank1
FROM t
GROUP BY category
GROUPING SETS ((), (category))
HAVING max(comments) > 0;

returns the following results:

NULL    1       48      1
2       1       49      1
3       1       49      1
8       1       49      1

When grouping sets are used together with the rank() function, the max()
function returns incorrect results. Everything works fine if I remove the
grouping sets clause and split the query into two independent queries, or if I
remove the rank() function.
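
For comparison, here is a minimal sketch of what the query would be expected to
return for the three rows shown above. It is plain Python, not Hive code, and
it assumes standard GROUPING SETS and rank() semantics; the function name
expected_result is hypothetical and used only for illustration. With this data,
max(live) should be 0 and max(comments) should be 2 in every output row, rather
than the 1 and 48/49 reported.

```python
# Rows of table t as (category, live, comments), copied from the report.
rows = [(3, 0, 2), (2, 0, 2), (8, 0, 2)]


def expected_result(rows):
    """Hand-evaluate the reported query (illustrative model, not Hive code)."""
    # GROUPING SETS ((), (category)):
    #   ()         -> one group over all rows, category reported as NULL (None)
    #   (category) -> one group per distinct category
    # Assumes the data contains no NULL category, as in the report.
    groups = {None: list(rows)}
    for row in rows:
        groups.setdefault(row[0], []).append(row)

    # Aggregate each group, then apply HAVING max(comments) > 0.
    aggregated = []
    for category, members in groups.items():
        max_live = max(r[1] for r in members)
        max_comments = max(r[2] for r in members)
        if max_comments > 0:
            aggregated.append((category, max_live, max_comments))

    # rank() OVER (PARTITION BY category ORDER BY comments):
    # 1 + number of rows in the same partition with a smaller sort key.
    result = []
    for category, max_live, max_comments in aggregated:
        rank = 1 + sum(
            1 for c, _, m in aggregated if c == category and m < max_comments
        )
        result.append((category, max_live, max_comments, rank))

    # Show the NULL row first, then categories in ascending order.
    return sorted(result, key=lambda r: (r[0] is not None, r[0] or 0))


for row in expected_result(rows):
    print(row)
```

Each output row forms its own window partition because category is distinct
per row, so rank1 is 1 everywhere, which matches the reported output; only the
max() columns differ.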

This looks like a bug to me, but please review. That said, I'm not sure whether
it is just an Amazon issue or a general Hive issue.


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 4632f08
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java 90b4b12
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java afd1738
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/BucketingSortingInferenceOptimizer.java 87fba2d
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/BucketingSortingOpProcFactory.java 82f4243
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java b93a293
  ql/src/java/org/apache/hadoop/hive/ql/plan/GroupByDesc.java 7a0b0da

Diff: https://reviews.apache.org/r/29878/diff/


Testing
-------


Thanks,

Navis Ryu
