-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17905/
-----------------------------------------------------------
Review request for Tajo.
Bugs: TAJO-593
https://issues.apache.org/jira/browse/TAJO-593
Repository: tajo
Description
-------
See the following query case:
{code:sql}
select count(*) from (select l_orderkey, l_partkey, count(*) from lineitem
group by l_orderkey, l_partkey) t1;
{code}
In this case, SubQuery::calculateShuffleOutputNum() is called twice to choose
the number of shuffle outputs. Each time, the method looks up a GroupByNode to
determine the number of grouping keys. Here is the bug:
SubQuery::calculateShuffleOutputNum() always picks the topmost GroupByNode. In
most cases this works well, but an outer groupby combined with a groupby in a
derived table can cause a problem, because the topmost node is then not the one
whose grouping keys matter. In this case, we must use the bottom-most groupby
node. In fact, that is always the correct choice.
This patch fixes SubQuery::calculateShuffleOutputNum() to use the bottom-most
groupby node.
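To illustrate the difference between the two lookups, here is a minimal,
self-contained sketch. It is not the code in this patch: PlanNode, NodeType,
findTopMostGroupBy() and findBottomMostGroupBy() are hypothetical names, and the
plan is reduced to a single-child chain, which is a simplification of Tajo's
actual logical plan.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class BottomGroupBySketch {
    // Hypothetical, simplified node types; not Tajo's real LogicalNode hierarchy.
    enum NodeType { GROUP_BY, TABLE_SUBQUERY, SCAN }

    // Hypothetical plan node with at most one child (assumption for brevity).
    static final class PlanNode {
        final NodeType type;
        final List<String> groupingKeys;
        final PlanNode child; // null for a leaf

        PlanNode(NodeType type, List<String> groupingKeys, PlanNode child) {
            this.type = type;
            this.groupingKeys = groupingKeys;
            this.child = child;
        }
    }

    // Old behavior as described above: return the first GROUP_BY seen from the root.
    static PlanNode findTopMostGroupBy(PlanNode root) {
        for (PlanNode n = root; n != null; n = n.child) {
            if (n.type == NodeType.GROUP_BY) {
                return n;
            }
        }
        return null;
    }

    // Behavior the description asks for: keep walking down and remember the last GROUP_BY.
    static PlanNode findBottomMostGroupBy(PlanNode root) {
        PlanNode found = null;
        for (PlanNode n = root; n != null; n = n.child) {
            if (n.type == NodeType.GROUP_BY) {
                found = n;
            }
        }
        return found;
    }

    // Plan shape for the query in the description:
    // outer count(*) (no grouping keys) -> derived table t1 -> inner group by -> scan.
    static PlanNode examplePlan() {
        PlanNode scan = new PlanNode(NodeType.SCAN, Collections.<String>emptyList(), null);
        PlanNode inner = new PlanNode(NodeType.GROUP_BY,
            Arrays.asList("l_orderkey", "l_partkey"), scan);
        PlanNode derived = new PlanNode(NodeType.TABLE_SUBQUERY,
            Collections.<String>emptyList(), inner);
        return new PlanNode(NodeType.GROUP_BY, Collections.<String>emptyList(), derived);
    }

    public static void main(String[] args) {
        PlanNode plan = examplePlan();
        // Topmost lookup sees the outer aggregation, which has 0 grouping keys.
        System.out.println(findTopMostGroupBy(plan).groupingKeys.size());
        // Bottom-most lookup sees the inner groupby, which has 2 grouping keys.
        System.out.println(findBottomMostGroupBy(plan).groupingKeys.size());
    }
}
```

With the query above, the topmost lookup reports zero grouping keys while the
bottom-most lookup reports the two keys (l_orderkey, l_partkey) that the
description says should be used.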
Diffs
-----
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PlannerUtil.java
b59cddafadd0c254aaef97c482cacab6ca4742c1
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/SubQuery.java
83a593a3cd858c96bdde935306a51f545f8971cf
Diff: https://reviews.apache.org/r/17905/diff/
Testing
-------
Thanks,
Hyunsik Choi