[jira] [Commented] (TAJO-593) outer groupby and groupby in derived table causes only one shuffle output number

Hudson (JIRA) Tue, 11 Feb 2014 03:57:43 -0800

    [ 
https://issues.apache.org/jira/browse/TAJO-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13897773#comment-13897773
 ]


Hudson commented on TAJO-593:
-----------------------------

SUCCESS: Integrated in Tajo-master-build #61 (See 
[https://builds.apache.org/job/Tajo-master-build/61/])
TAJO-593: outer groupby and groupby in derived table causes only one shuffle 
output number. (hyunsik: 
https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git&a=commit&h=8e859fb103b753ba6c9379bc7625164341c84b25)
* tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/SubQuery.java
* tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PlannerUtil.java
* CHANGES.txt


> outer groupby and groupby in derived table causes only one shuffle output 
> number
> --------------------------------------------------------------------------------
>
>                 Key: TAJO-593
>                 URL: https://issues.apache.org/jira/browse/TAJO-593
>             Project: Tajo
>          Issue Type: Bug
>          Components: distributed query plan
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>             Fix For: 0.8-incubating
>
>         Attachments: TAJO-593.patch
>
>
> See the following query case:
> {code:sql}
> select count(*) from (select l_orderkey, l_partkey, count(*) from lineitem 
> group by l_orderkey, l_partkey) t1;
> {code}
> In this case, SubQuery::calculateShuffleOutputNum() is called twice to 
> choose the number of shuffle outputs. Each time, 
> SubQuery::calculateShuffleOutputNum() looks for a GroupByNode to determine 
> the number of grouping keys. Here is the bug: 
> SubQuery::calculateShuffleOutputNum() always uses the topmost GroupByNode. In 
> most cases this works well, but an outer groupby combined with a groupby in a 
> derived table causes the problem. In this case, we must use the bottommost 
> groupby node. Actually, that is always the correct choice.
> This patch fixes SubQuery::calculateShuffleOutputNum() to use the bottommost 
> groupby node.
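
To make the topmost vs. bottommost distinction concrete, here is a minimal sketch of the two lookups on a simplified single-child plan. It is not the code from the patch: ShuffleKeySketch, PlanNode, this GroupByNode, findTopmostGroupBy, findBottommostGroupBy and examplePlan are hypothetical names invented for illustration and do not mirror Tajo's real planner classes. The plan shape for the query above is likewise an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified plan nodes; these are NOT Tajo's real classes.
final class ShuffleKeySketch {

    /** A node in a single-child (linear) logical plan; child is null at the leaf. */
    static class PlanNode {
        final String name;
        final PlanNode child;

        PlanNode(String name, PlanNode child) {
            this.name = name;
            this.child = child;
        }
    }

    /** A group-by node carrying its grouping keys (empty for a plain count(*)). */
    static final class GroupByNode extends PlanNode {
        final List<String> groupingKeys;

        GroupByNode(List<String> groupingKeys, PlanNode child) {
            super("GROUP_BY", child);
            this.groupingKeys = groupingKeys;
        }
    }

    /** The lookup described as buggy: stops at the first group-by met from the root. */
    static GroupByNode findTopmostGroupBy(PlanNode root) {
        for (PlanNode n = root; n != null; n = n.child) {
            if (n instanceof GroupByNode) {
                return (GroupByNode) n;
            }
        }
        return null;
    }

    /** The lookup described as the fix: walks to the leaf and keeps the last group-by seen. */
    static GroupByNode findBottommostGroupBy(PlanNode root) {
        GroupByNode last = null;
        for (PlanNode n = root; n != null; n = n.child) {
            if (n instanceof GroupByNode) {
                last = (GroupByNode) n;
            }
        }
        return last;
    }

    /** Assumed plan shape for the query in the issue: outer count(*) over a grouped derived table. */
    static PlanNode examplePlan() {
        PlanNode scan = new PlanNode("SCAN(lineitem)", null);
        PlanNode inner = new GroupByNode(Arrays.asList("l_orderkey", "l_partkey"), scan);
        PlanNode derived = new PlanNode("TABLE_SUBQUERY(t1)", inner);
        // The outer count(*) has no GROUP BY clause, so its key list is empty.
        return new GroupByNode(new ArrayList<String>(), derived);
    }

    public static void main(String[] args) {
        PlanNode plan = examplePlan();
        System.out.println("topmost keys:    " + findTopmostGroupBy(plan).groupingKeys);
        System.out.println("bottommost keys: " + findBottommostGroupBy(plan).groupingKeys);
    }
}
```

In this sketch the topmost lookup returns the outer count(*) node, which has no grouping keys, while the bottommost lookup returns the inner node keyed on l_orderkey and l_partkey. That an empty key list is what leads to a single shuffle output is an inference from the issue title, not something the sketch demonstrates.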



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
