[
https://issues.apache.org/jira/browse/HIVE-5358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779645#comment-13779645
]
Chun Chen commented on HIVE-5358:
---------------------------------
Sorry for misunderstanding the intention of checkExprs in
ReduceSinkDeDuplication.
[~ashutoshc] I will try to preserve the order of the key columns on the RS in
those test cases.
{code}
select c3, c2 from (select c1, c2, c3 from t1 order by c1, c2, c3) t group by
c3, c2;
{code}
[~yhuai] I don't understand what you mean regarding the SQL above. If we use
[c3, c2] as the key columns, what is the problem with that?
> ReduceSinkDeDuplication should ignore column orders when check overlapping
> part of keys between parent and child
> ----------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-5358
> URL: https://issues.apache.org/jira/browse/HIVE-5358
> Project: Hive
> Issue Type: Improvement
> Components: Query Processor
> Reporter: Chun Chen
> Assignee: Chun Chen
> Attachments: D13113.1.patch, HIVE-5358.2.patch, HIVE-5358.patch
>
>
> {code}
> select key, value from (select key, value from src group by key, value) t
> group by key, value;
> {code}
> The query above can be optimized by ReduceSinkDeDuplication.
> {code}
> select key, value from (select key, value from src group by key, value) t
> group by value, key;
> {code}
> However, the SQL above currently can't be optimized by
> ReduceSinkDeDuplication because the parent and child operators list the key
> columns in different orders.
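
For illustration only, the difference between an order-sensitive and an order-insensitive key comparison might be sketched as below. This is not Hive's actual ReduceSinkDeDuplication code: the class KeyOverlapSketch and both method names are hypothetical, key columns are modeled as plain strings, and the sketch assumes a key list contains no duplicate column names.

```java
import java.util.HashSet;
import java.util.List;

// Hypothetical helper for illustration; not part of Hive.
final class KeyOverlapSketch {
    private KeyOverlapSketch() {}

    // Strict check: the same columns in the same positions.
    static boolean sameKeysInOrder(List<String> parentKeys, List<String> childKeys) {
        return parentKeys.equals(childKeys);
    }

    // Relaxed check: the same set of columns, with position ignored.
    // Assumes neither list contains duplicate column names.
    static boolean sameKeysIgnoringOrder(List<String> parentKeys, List<String> childKeys) {
        return parentKeys.size() == childKeys.size()
            && new HashSet<>(parentKeys).equals(new HashSet<>(childKeys));
    }
}
```

With parent keys [key, value] and child keys [value, key], the strict check fails while the relaxed check passes, which corresponds to the second query in the description. Whether the relaxed check is safe when the parent also imposes a sort order, as in the order by example discussed in the comments, is a separate question that this sketch does not settle.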