[ https://issues.apache.org/jira/browse/HIVE-9347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282238#comment-14282238 ]
Ashutosh Chauhan commented on HIVE-9347:
----------------------------------------
+1
> Bug with max() together with rank() and grouping sets
> -----------------------------------------------------
>
> Key: HIVE-9347
> URL: https://issues.apache.org/jira/browse/HIVE-9347
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 0.14.0, 0.13.1
> Environment: Amazon Elastic Map Reduce, AMI 3.3.1, Hadoop Amazon
> 2.4.0, Hive 0.13.1
> Reporter: Michal Krawczyk
> Assignee: Navis
> Attachments: HIVE-9347.1.patch.txt, HIVE-9347.2.patch.txt,
> HIVE-9347.3.patch.txt
>
>
> It looks like the query below returns incorrect results on Hive 0.13.1, but
> it was working fine on Hive 0.11.
> I have the following table:
> CREATE TABLE `t`(
> `category` int,
> `live` int,
> `comments` int)
> with the following data:
> hive> select * from t;
> OK
> 3 0 2
> 2 0 2
> 8 0 2
> The query:
> hive> select category, max(live) live, max(comments) comments, rank() OVER
> (PARTITION BY category ORDER BY comments) rank1
> FROM t
> GROUP BY category
> GROUPING SETS ((), (category))
> HAVING max(comments) > 0;
> returns the following results:
> NULL 1 48 1
> 2 1 49 1
> 3 1 49 1
> 8 1 49 1
> When grouping sets are used together with the rank() function, the max()
> function returns incorrect results. Everything works fine if I remove the
> grouping sets clause and split the query into two independent queries, or if
> I remove the rank() function.
> This looks like a bug to me, but please review. That said, I'm not sure
> whether it's just an Amazon issue or a general Hive issue.
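
For reference, the result the query should produce can be worked out by hand from the three rows in the report. The sketch below is plain Python, not Hive code. It emulates the two grouping sets, the HAVING filter, and rank() under the assumption that ORDER BY comments in the window refers to the aggregated max(comments). The name expected_rows is hypothetical.

```python
# Rows of table t as listed in the report: (category, live, comments)
rows = [(3, 0, 2), (2, 0, 2), (8, 0, 2)]

def expected_rows(rows):
    """Emulate GROUP BY category GROUPING SETS ((), (category))
    HAVING max(comments) > 0, plus rank() per category.
    Assumes no input row has a NULL category."""
    # () grouping set: one grand-total group whose category is NULL (None)
    groups = {None: list(rows)}
    # (category) grouping set: one group per distinct category
    for row in rows:
        groups.setdefault(row[0], []).append(row)

    aggregated = []
    for category, members in groups.items():
        max_live = max(m[1] for m in members)
        max_comments = max(m[2] for m in members)
        if max_comments > 0:  # HAVING max(comments) > 0
            aggregated.append((category, max_live, max_comments))

    result = []
    for category, live, comments in aggregated:
        # rank() = 1 + number of rows in the same partition with a smaller key
        rank = 1 + sum(
            1 for c, _, k in aggregated if c == category and k < comments
        )
        result.append((category, live, comments, rank))

    # Sort with the NULL (grand-total) row first, then by category
    return sorted(result, key=lambda r: (r[0] is not None, r[0] or 0))

for row in expected_rows(rows):
    print(row)
```

With the data above this gives max(live) = 0 and max(comments) = 2 in every row, where the reported output shows 1 and 48/49.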
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)