[ https://issues.apache.org/jira/browse/TAJO-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897773#comment-13897773 ]
Hudson commented on TAJO-593:
-----------------------------
SUCCESS: Integrated in Tajo-master-build #61 (See
[https://builds.apache.org/job/Tajo-master-build/61/])
TAJO-593: outer groupby and groupby in derived table causes only one shuffle
output number. (hyunsik:
https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git&a=commit&h=8e859fb103b753ba6c9379bc7625164341c84b25)
* tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/SubQuery.java
* tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PlannerUtil.java
* CHANGES.txt
> outer groupby and groupby in derived table causes only one shuffle output
> number
> --------------------------------------------------------------------------------
>
> Key: TAJO-593
> URL: https://issues.apache.org/jira/browse/TAJO-593
> Project: Tajo
> Issue Type: Bug
> Components: distributed query plan
> Reporter: Hyunsik Choi
> Assignee: Hyunsik Choi
> Fix For: 0.8-incubating
>
> Attachments: TAJO-593.patch
>
>
> See the following query case:
> {code:sql}
> select count(*) from (select l_orderkey, l_partkey, count(*) from lineitem
> group by l_orderkey, l_partkey) t1;
> {code}
> In this case, SubQuery::calculateShuffleOutputNum() is called twice to
> choose the number of shuffle outputs. Each time, the method looks up a
> GroupByNode to find the number of grouping keys. Here is the bug:
> SubQuery::calculateShuffleOutputNum() always uses the topmost GroupByNode.
> In most cases this works well, but a query with an outer group-by over a
> group-by in a derived table causes a problem. In this case, we must use the
> bottommost GroupByNode. Actually, that is always the correct choice.
> This patch fixes SubQuery::calculateShuffleOutputNum() to use the bottommost
> GroupByNode.
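
For illustration only, the difference between the topmost and bottommost lookup can be sketched as follows. This is not the code from the patch: PlanNode and GroupByNode here are simplified stand-ins for Tajo's plan classes, the plan shape for the example query is assumed, and findTopmostGroupBy / findBottomGroupBy are hypothetical helper names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class BottomGroupBySketch {
    /** Hypothetical, simplified plan node with at most one child (not Tajo's class). */
    static class PlanNode {
        final String label;
        final PlanNode child; // null for a leaf such as a scan

        PlanNode(String label, PlanNode child) {
            this.label = label;
            this.child = child;
        }
    }

    /** Hypothetical stand-in for a group-by node carrying its grouping keys. */
    static class GroupByNode extends PlanNode {
        final List<String> groupingKeys;

        GroupByNode(List<String> groupingKeys, PlanNode child) {
            super("GROUP_BY", child);
            this.groupingKeys = groupingKeys;
        }
    }

    /** Returns the first GroupByNode met when walking down from the root, or null. */
    static GroupByNode findTopmostGroupBy(PlanNode root) {
        for (PlanNode n = root; n != null; n = n.child) {
            if (n instanceof GroupByNode) {
                return (GroupByNode) n;
            }
        }
        return null;
    }

    /** Returns the last GroupByNode met when walking down from the root, or null. */
    static GroupByNode findBottomGroupBy(PlanNode root) {
        GroupByNode found = null;
        for (PlanNode n = root; n != null; n = n.child) {
            if (n instanceof GroupByNode) {
                found = (GroupByNode) n; // keep overwriting; the deepest one wins
            }
        }
        return found;
    }

    /** Assumed plan shape for the query above: outer count(*) over derived table t1. */
    static PlanNode exampleQueryPlan() {
        PlanNode scan = new PlanNode("SCAN lineitem", null);
        GroupByNode inner = new GroupByNode(Arrays.asList("l_orderkey", "l_partkey"), scan);
        PlanNode derived = new PlanNode("TABLE_SUBQUERY t1", inner);
        // The outer count(*) has no grouping keys.
        return new GroupByNode(Collections.<String>emptyList(), derived);
    }

    public static void main(String[] args) {
        PlanNode plan = exampleQueryPlan();
        System.out.println("topmost keys: " + findTopmostGroupBy(plan).groupingKeys);
        System.out.println("bottom keys: " + findBottomGroupBy(plan).groupingKeys);
    }
}
```

In this sketch the topmost lookup sees the outer count(*) with no grouping keys, which is consistent with the single shuffle output named in the issue title, while the bottommost lookup sees the two keys l_orderkey and l_partkey.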
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)