-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18210/#review34987
-----------------------------------------------------------
Thank you for the review. I've fixed everything you mentioned and committed the patch to the master branch.

- Hyunsik Choi


On Feb. 20, 2014, 2:28 p.m., Hyunsik Choi wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18210/
> -----------------------------------------------------------
>
> (Updated Feb. 20, 2014, 2:28 p.m.)
>
>
> Review request for Tajo.
>
>
> Bugs: TAJO-601
>     https://issues.apache.org/jira/browse/TAJO-601
>
>
> Repository: tajo
>
>
> Description
> -------
>
> Currently, distinct aggregation queries are executed as follows:
>  * the first stage: it just shuffles tuples by hashing the grouping keys.
>  * the second stage: it sorts them and executes sort aggregation.
>
> This approach executes queries that include distinct aggregation functions
> in only two stages, but it produces large intermediate data during the
> shuffle phase.
>
> This kind of query can be rewritten as a query over an aggregating subquery:
>
> [Original query]
> ----------
> SELECT grp1, grp2, count(*) as total, count(distinct grp3) as distinct_col
> from rel1 group by grp1, grp2;
> ----------
>
> [Rewritten query]
> ----------
> SELECT grp1, grp2, sum(cnt) as total, count(grp3) as distinct_col from (
>   SELECT grp1, grp2, grp3, count(*) as cnt from rel1 group by grp1, grp2,
>   grp3
> ) tmp1 group by grp1, grp2;
> ----------
>
> I expect this rewrite to significantly reduce the intermediate data volume
> and the query response time in most cases.
>
>
> Diffs
> -----
>
>   tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/catalog/SortSpec.java 3ef73d5c5385b40fcfb3b0ecbbc35b783224c760
>   tajo-common/src/main/java/org/apache/tajo/util/TUtil.java cc694d43f42f68945cf53a7b8b9bbdca97a4f205
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/eval/EvalTreeUtil.java da05739b8feff0e04b1762f8000b1f3818c773a2
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/eval/FunctionEval.java 0555bdec8aff6fa79c02b640c81ad55d4666b90a
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumDoubleDistinct.java PRE-CREATION
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumFloat.java 10fd7205f29c82adf87816737598ce762ee0ebc9
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumFloatDistinct.java PRE-CREATION
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumIntDistinct.java PRE-CREATION
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumLongDistinct.java PRE-CREATION
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/ExprsVerifier.java b14c448ee5b3ce0dfca67c6a9b942f1803cc91f9
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/LogicalPlanner.java f7c0bfab78cb3416e7a2ed263cc362917023e3ca
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PhysicalPlannerImpl.java 67f56303e04787bf950c4a9a703faec58fb74cd4
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PlannerUtil.java 7d5e2fc7e085cc36527383a208277384035263e7
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PreLogicalPlanVerifier.java 6dac031218c650b9c1c86811b4552fe6d82da0c1
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/enforce/Enforcer.java dd46996eca7eb9c38f87d97813f5dcc7220429ed
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/DataChannel.java 9f5c6bf9dd7b549308724ce1e8044aff1630cef1
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java f390b52f378a2d7e84e40876df4a4b416af912ef
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/MasterPlan.java 91f658dab395620f5a891f51407b3676b07a8fa5
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/ExternalSortExec.java 791781e526c54f216152e935682bc2c3147a9e0c
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/SeqScanExec.java 53a1c24197c40c77153f79f90c05882c90aae957
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/FilterPushDownRule.java 399903c66bb8a62074facd0bbbe9b3b8e891c067
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/PartitionedTableRewriter.java e5f7fb40414e0b2e2e40bccebe24069ee4d9301b
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/ProjectionPushDownRule.java 633d0c1857533b02c4ecc6913c740fd2e3722845
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryMaster.java ae6d5ebb97f8c4287ffd11262b2932d2f8b1250c
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryMasterManagerService.java 3c30e3854abaa891f72b368144942164e5dffab7
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java 56c26797aad1dbe95945567961e9425fef72fa96
>   tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/eval/TestEvalTreeUtil.java d7562426647a6a9d6aae5207a67ddcdd03d0ee3a
>   tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/query/TestGroupByQuery.java 1f80bce23c74e3abdcbf9bc0553ec30244d6bd93
>   tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/master/TestExecutionBlockCursor.java 053c02833e80dd931807fa6314965e687d7b26c0
>   tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/master/TestGlobalPlanner.java 2d3124d7e9d7853b0f872eee1016cbae504c9c6b
>   tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testCountDistinct.sql
>   tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testCountDistinct2.sql
>   tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation3.sql PRE-CREATION
>   tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation4.sql PRE-CREATION
>   tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation5.sql PRE-CREATION
>   tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregationWithHaving1.sql PRE-CREATION
>   tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregationWithUnion1.sql PRE-CREATION
>   tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testCountDistinct.result
>   tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testCountDistinct2.result
>   tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation3.result PRE-CREATION
>   tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation4.result PRE-CREATION
>   tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation5.result PRE-CREATION
>   tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregationWithHaving1.result PRE-CREATION
>   tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregationWithUnion1.result PRE-CREATION
>   tajo-storage/src/main/java/org/apache/tajo/storage/RawFile.java c3a7525154e0f36d51dcca211949f21f57a9f1c8
>
> Diff: https://reviews.apache.org/r/18210/diff/
>
>
> Testing
> -------
>
> mvn clean install
>
>
> Thanks,
>
> Hyunsik Choi
>
>
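
For readers who want to check the query rewrite from the quoted description, here is a minimal sketch that runs both forms of the query and compares the results. It is not Tajo code: it uses Python's built-in sqlite3 module as a stand-in SQL engine, the sample rows and the run_both helper are made up for illustration, and an ORDER BY is added to both queries only so the two result lists can be compared directly. The rewritten query is written with balanced parentheses and a single subquery alias (tmp1).

```python
import sqlite3

# Original form: distinct aggregation evaluated directly.
ORIGINAL = """
SELECT grp1, grp2, count(*) AS total, count(DISTINCT grp3) AS distinct_col
FROM rel1 GROUP BY grp1, grp2
ORDER BY grp1, grp2
"""

# Rewritten form: the inner query groups by (grp1, grp2, grp3), so each
# distinct grp3 value appears once per (grp1, grp2); the outer query then
# sums the partial counts and counts the non-NULL grp3 values.
REWRITTEN = """
SELECT grp1, grp2, sum(cnt) AS total, count(grp3) AS distinct_col FROM (
  SELECT grp1, grp2, grp3, count(*) AS cnt
  FROM rel1 GROUP BY grp1, grp2, grp3
) tmp1 GROUP BY grp1, grp2
ORDER BY grp1, grp2
"""

# Hypothetical sample data, including a duplicate and a NULL in grp3.
SAMPLE_ROWS = [
    ("a", "x", 1),
    ("a", "x", 1),
    ("a", "x", 2),
    ("a", "y", 3),
    ("b", "x", None),
    ("b", "x", 4),
]


def run_both(rows):
    """Load rows into an in-memory table and run both query forms."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rel1 (grp1 TEXT, grp2 TEXT, grp3 INTEGER)")
    conn.executemany("INSERT INTO rel1 VALUES (?, ?, ?)", rows)
    original = conn.execute(ORIGINAL).fetchall()
    rewritten = conn.execute(REWRITTEN).fetchall()
    conn.close()
    return original, rewritten


if __name__ == "__main__":
    original, rewritten = run_both(SAMPLE_ROWS)
    print(original)
    print(rewritten)
    print("equivalent:", original == rewritten)
```

This only confirms that the two forms return the same rows on a small data set. It says nothing about intermediate data volume or response time in Tajo, which depend on the shuffle and aggregation stages described above.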
