-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17787/#review35518
-----------------------------------------------------------

I think that this solution is too ad-hoc. I still suggest using terminal execution blocks or some extended execution blocks to express a union.

The essential problem is that Query assumes there is only one final execution block. In contrast, in the design of Tajo, a simple union clause results in two or more final execution blocks. As a result, some execution blocks can be missed in the statistics computation phase.

A better solution is to force there to be only one final execution block. Fortunately, we already add a terminal execution block to all MasterPlans. You can use the terminal execution block for the final statistics computation. If you do so, you can also reuse the existing method SubQuery::computeStatFromUnionBlock.


tajo-client/src/main/java/org/apache/tajo/client/TajoClient.java
<https://reviews.apache.org/r/17787/#comment66108>

If you want to use the statistics of the result table, you should use CTAS and then get the TableDesc, rather than adding new client APIs. Client API changes have to be discussed seriously. In addition, the results of a non-CTAS query are not always kept persistently.


tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/ExecutionBlock.java
<https://reviews.apache.org/r/17787/#comment66109>


- Hyunsik Choi


On Feb. 25, 2014, 4:57 p.m., Jung JaeHwa wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17787/
> -----------------------------------------------------------
>
> (Updated Feb. 25, 2014, 4:57 p.m.)
>
>
> Review request for Tajo.
>
>
> Bugs: TAJO-585
> https://issues.apache.org/jira/browse/TAJO-585
>
>
> Repository: tajo
>
>
> Description
> -------
>
> I found a bug which TajoCli prints unexpected results in union query.
>
> For the first, TajoCli prints wrong row numbers as follows:
> {code:xml}
> tajo> select id from table1 union all select id from table2;
> result:
> hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0004/RESULT,
> 5 rows (21 B)
> id
> -------------------------------
> 6
> 7
> 8
> 9
> 10
> 1
> 2
> 3
> 4
> 5
> {code}
>
> And if empty table located on last phase, it just prints zero as follows:
> {code:xml}
>
> tajo> select id from table1 union all select id from table3;
> result:
> hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0005/RESULT,
> 0 rows (10 B)
> id
> -------------------------------
> {code}
>
> For reference, I created test tables as follows:
> {code:xml}
> CREATE EXTERNAL TABLE table1 (id INT4, name TEXT, score FLOAT4, type TEXT)
> USING CSV WITH ('csvfile.delimiter'='|') LOCATION
> 'hdfs://localhost:9010/tajo/warehouse/table1';
>
> CREATE EXTERNAL TABLE table2 (id INT4, name TEXT, score FLOAT4, type TEXT)
> USING CSV WITH ('csvfile.delimiter'='|') LOCATION
> 'hdfs://localhost:9010/tajo/warehouse/table2';
>
> CREATE EXTERNAL TABLE table3 (id INT4, name TEXT, score FLOAT4, type TEXT)
> USING CSV WITH ('csvfile.delimiter'='|') LOCATION
> 'hdfs://localhost:9010/tajo/warehouse/table3';
>
> tajo> select * from table1;
> result:
> hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0001/RESULT,
> 5 rows (60 B)
> id, name, score, type
> -------------------------------
> 1, ooo, 1.1, a
> 2, ppp, 2.3, b
> 3, qqq, 3.4, c
> 4, rrr, 4.5, d
> 5, xxx, 5.6, e
>
> tajo> select * from table2;
> result:
> hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0002/RESULT,
> 5 rows (61 B)
> id, name, score, type
> -------------------------------
> 6, ooo, 1.1, a
> 7, ppp, 2.3, b
> 8, qqq, 3.4, c
> 9, rrr, 4.5, d
> 10, xxx, 5.6, e
>
> tajo> select * from table3;
> result:
> hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0003/RESULT,
> 0 rows (0 B)
> id, name, score, type
> -------------------------------
> {code}
>
>
> Diffs
> -----
>
> tajo-client/src/main/java/org/apache/tajo/client/TajoClient.java d9c511e
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/ExecutionBlock.java 7df6b43
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java 461c5d5
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Query.java 5fafe51
> tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/LocalTajoTestingUtility.java 144ca1b
> tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/QueryTestCaseBase.java e1a231a
> tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/TpchTestBase.java cb1805d
> tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/query/TestUnionQuery.java 22830bf
> tajo-core/tajo-core-backend/src/test/resources/queries/TestUnionQuery/testUnion11.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/queries/TestUnionQuery/testUnion12.sql PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestUnionQuery/testUnion11.result PRE-CREATION
> tajo-core/tajo-core-backend/src/test/resources/results/TestUnionQuery/testUnion12.result PRE-CREATION
>
> Diff: https://reviews.apache.org/r/17787/diff/
>
>
> Testing
> -------
>
> mvn clean install
> mvn clean install -Phcatalog-0.12.0
>
>
> Thanks,
>
> Jung JaeHwa
>
>
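
To illustrate the suggestion at the top of this review (compute the final statistics once, from a single terminal execution block that sits above every union branch), here is a minimal sketch. All types and method names in it (UnionStatSketch, Stat, Block, sumStats, computeFinalStat) are hypothetical stand-ins and are not Tajo's ExecutionBlock, MasterPlan, or SubQuery::computeStatFromUnionBlock. The byte sizes are assumed values for illustration; only the row counts of 5 and 5 come from the table1 and table2 example in the request.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; none of these are Tajo classes.
class UnionStatSketch {

    /** Row and byte counts reported by one execution block. */
    static final class Stat {
        final long numRows;
        final long numBytes;

        Stat(long numRows, long numBytes) {
            this.numRows = numRows;
            this.numBytes = numBytes;
        }
    }

    /** A simplified execution block: an optional stat plus its child blocks. */
    static final class Block {
        final Stat stat; // null for the terminal block, which produces no data itself
        final List<Block> children = new ArrayList<>();

        Block(Stat stat) {
            this.stat = stat;
        }
    }

    /** Sums rows and bytes over every stat in the list. */
    static Stat sumStats(List<Stat> stats) {
        long rows = 0;
        long bytes = 0;
        for (Stat s : stats) {
            rows += s.numRows;
            bytes += s.numBytes;
        }
        return new Stat(rows, bytes);
    }

    /** Computes the query-level stat from all children of the single terminal block. */
    static Stat computeFinalStat(Block terminal) {
        List<Stat> childStats = new ArrayList<>();
        for (Block child : terminal.children) {
            childStats.add(child.stat);
        }
        return sumStats(childStats);
    }

    public static void main(String[] args) {
        // Two union branches with 5 rows each; byte sizes are assumed values.
        Block terminal = new Block(null);
        terminal.children.add(new Block(new Stat(5, 11)));
        terminal.children.add(new Block(new Stat(5, 10)));

        Stat total = computeFinalStat(terminal);
        System.out.println(total.numRows + " rows (" + total.numBytes + " B)");
    }
}
```

Because every branch of the union is a child of the one terminal block, no branch can be skipped during the summation, and an empty last branch contributes zero rows instead of replacing the total.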
