> On Feb. 26, 2014, 2 p.m., Hyunsik Choi wrote:
> > I think that this solution is too ad hoc. Still, I suggest using terminal
> > execution blocks or some extended execution blocks to express a union.
> >
> > The essential problem is that Query assumes that there is only one final
> > execution block. In contrast, in the design of Tajo, a simple union clause
> > results in two or more final execution blocks. So, some execution blocks
> > can be missed in the statistics computation phase.
> >
> > The better solution is to force there to be only one final execution block.
> > Fortunately, we already add a terminal execution block to all MasterPlans.
> > You can use a terminal execution block for the final statistics computation.
> > If you do so, you can also reuse the existing method
> > SubQuery::computeStatFromUnionBlock.

Hi Hyunsik.

Thank you for your review.
I also thought that this patch was not the ultimate solution, so after I uploaded it I tried to find a better one. I'll update this patch following your advice.

Cheers.

- Jung


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17787/#review35518
-----------------------------------------------------------


On Feb. 25, 2014, 7:57 a.m., Jung JaeHwa wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17787/
> -----------------------------------------------------------
>
> (Updated Feb. 25, 2014, 7:57 a.m.)
>
>
> Review request for Tajo.
>
>
> Bugs: TAJO-585
>     https://issues.apache.org/jira/browse/TAJO-585
>
>
> Repository: tajo
>
>
> Description
> -------
>
> I found a bug where TajoCli prints unexpected results for union queries.
>
> First, TajoCli prints the wrong row count (5 rows reported, 10 rows returned):
> {code:xml}
> tajo> select id from table1 union all select id from table2;
> result:
> hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0004/RESULT,
> 5 rows (21 B)
> id
> -------------------------------
> 6
> 7
> 8
> 9
> 10
> 1
> 2
> 3
> 4
> 5
> {code}
>
> And if an empty table is located in the last phase, it just prints zero rows:
> {code:xml}
>
> tajo> select id from table1 union all select id from table3;
> result:
> hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0005/RESULT,
> 0 rows (10 B)
> id
> -------------------------------
> {code}
>
> For reference, I created the test tables as follows:
> {code:xml}
> CREATE EXTERNAL TABLE table1 (id INT4, name TEXT, score FLOAT4, type TEXT)
> USING CSV WITH ('csvfile.delimiter'='|') LOCATION
> 'hdfs://localhost:9010/tajo/warehouse/table1';
>
> CREATE EXTERNAL TABLE table2 (id INT4, name TEXT, score FLOAT4, type TEXT)
> USING CSV WITH ('csvfile.delimiter'='|') LOCATION
> 'hdfs://localhost:9010/tajo/warehouse/table2';
>
> CREATE EXTERNAL TABLE table3 (id INT4, name TEXT, score FLOAT4, type TEXT)
> USING CSV WITH ('csvfile.delimiter'='|') LOCATION
> 'hdfs://localhost:9010/tajo/warehouse/table3';
>
> tajo> select * from table1;
> result:
> hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0001/RESULT,
> 5 rows (60 B)
> id, name, score, type
> -------------------------------
> 1, ooo, 1.1, a
> 2, ppp, 2.3, b
> 3, qqq, 3.4, c
> 4, rrr, 4.5, d
> 5, xxx, 5.6, e
>
> tajo> select * from table2;
> result:
> hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0002/RESULT,
> 5 rows (61 B)
> id, name, score, type
> -------------------------------
> 6, ooo, 1.1, a
> 7, ppp, 2.3, b
> 8, qqq, 3.4, c
> 9, rrr, 4.5, d
> 10, xxx, 5.6, e
>
> tajo> select * from table3;
> result:
> hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0003/RESULT,
> 0 rows (0 B)
> id, name, score, type
> -------------------------------
> {code}
>
>
> Diffs
> -----
>
>   tajo-client/src/main/java/org/apache/tajo/client/TajoClient.java d9c511e
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/ExecutionBlock.java 7df6b43
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java 461c5d5
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Query.java 5fafe51
>   tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/LocalTajoTestingUtility.java 144ca1b
>   tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/QueryTestCaseBase.java e1a231a
>   tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/TpchTestBase.java cb1805d
>   tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/query/TestUnionQuery.java 22830bf
>   tajo-core/tajo-core-backend/src/test/resources/queries/TestUnionQuery/testUnion11.sql PRE-CREATION
>   tajo-core/tajo-core-backend/src/test/resources/queries/TestUnionQuery/testUnion12.sql PRE-CREATION
>   tajo-core/tajo-core-backend/src/test/resources/results/TestUnionQuery/testUnion11.result PRE-CREATION
>   tajo-core/tajo-core-backend/src/test/resources/results/TestUnionQuery/testUnion12.result PRE-CREATION
>
> Diff: https://reviews.apache.org/r/17787/diff/
>
>
> Testing
> -------
>
> mvn clean install
> mvn clean install -Phcatalog-0.12.0
>
>
> Thanks,
>
> Jung JaeHwa
>
>
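
For illustration only, the row counts reported in TAJO-585 are consistent with reading the statistics of just the last final execution block, whereas a single terminal block could sum the counts of every block feeding it. The sketch below is a toy model of that idea. The class and method names (UnionStatSketch, lastBlockRows, totalRows) are hypothetical and are not Tajo's actual API; in particular this is not the body of SubQuery::computeStatFromUnionBlock. The per-block row counts are taken from the table listings in the description.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names are Tajo's actual classes or methods.
public class UnionStatSketch {

    // Mimics the reported behavior: only the last final block's row count is used.
    static long lastBlockRows(List<Long> rowsPerFinalBlock) {
        if (rowsPerFinalBlock.isEmpty()) {
            return 0L;
        }
        return rowsPerFinalBlock.get(rowsPerFinalBlock.size() - 1);
    }

    // What a single terminal block could do: sum the counts of all blocks feeding it.
    static long totalRows(List<Long> rowsPerFinalBlock) {
        long total = 0L;
        for (long rows : rowsPerFinalBlock) {
            total += rows;
        }
        return total;
    }

    public static void main(String[] args) {
        // table1 and table2 each hold 5 rows; table3 is empty.
        List<Long> table1UnionTable2 = Arrays.asList(5L, 5L);
        List<Long> table1UnionTable3 = Arrays.asList(5L, 0L);

        System.out.println("table1 UNION ALL table2: last block = "
                + lastBlockRows(table1UnionTable2)
                + ", summed = " + totalRows(table1UnionTable2));
        System.out.println("table1 UNION ALL table3: last block = "
                + lastBlockRows(table1UnionTable3)
                + ", summed = " + totalRows(table1UnionTable3));
    }
}
```

Under these assumptions the "last block" figures come out as 5 and 0, matching the counts TajoCli printed, while the summed figures are 10 and 5, matching the rows actually returned.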
