-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17787/#review35518
-----------------------------------------------------------


I think this solution is too ad hoc. I still suggest using terminal execution 
blocks, or some extended execution blocks, to express a union.

The essential problem is that Query assumes there is only one final execution 
block. In contrast, in Tajo's design, a simple union clause results in two or 
more final execution blocks. So, some execution blocks can be missed in the 
statistics computation phase.

A better solution is to force there to be only one final execution block. 
Fortunately, we already add a terminal execution block to every MasterPlan. You 
can use the terminal execution block for the final statistics computation. If 
you do so, you can also reuse the existing method 
SubQuery::computeStatFromUnionBlock.
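
To illustrate the idea only, here is a minimal, self-contained sketch of 
summing the statistics of every child block at a single terminal block. The 
class and method names (UnionStatsSketch, Stats, sumChildStats) are 
hypothetical and are not Tajo's actual classes or the body of 
SubQuery::computeStatFromUnionBlock. The row counts mirror the example in the 
quoted description, and the byte sizes are illustrative.

```java
import java.util.Arrays;
import java.util.List;

public class UnionStatsSketch {
    /** Hypothetical stand-in for per-block result statistics; not a Tajo class. */
    static final class Stats {
        final long numRows;
        final long numBytes;

        Stats(long numRows, long numBytes) {
            this.numRows = numRows;
            this.numBytes = numBytes;
        }
    }

    /**
     * Sums the statistics of all child blocks that feed one terminal block,
     * so that no union branch is missed (assumed behavior, for illustration).
     */
    static Stats sumChildStats(List<Stats> childStats) {
        long rows = 0;
        long bytes = 0;
        for (Stats s : childStats) {
            rows += s.numRows;
            bytes += s.numBytes;
        }
        return new Stats(rows, bytes);
    }

    public static void main(String[] args) {
        // Two union branches of 5 rows each; byte sizes are made up.
        Stats total = sumChildStats(Arrays.asList(new Stats(5, 10), new Stats(5, 11)));
        System.out.println(total.numRows + " rows (" + total.numBytes + " B)");
    }
}
```

With both branches counted, the union of two 5-row inputs reports 10 rows 
instead of the row count of only the last block.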


tajo-client/src/main/java/org/apache/tajo/client/TajoClient.java
<https://reviews.apache.org/r/17787/#comment66108>

    If you want to exploit the statistics of a result table, you should use 
    CTAS and then get the TableDesc, rather than adding new client APIs. 
    Client API changes have to be discussed seriously. In addition, query 
    results are not always kept persistently unless the query is a CTAS.
    



tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/ExecutionBlock.java
<https://reviews.apache.org/r/17787/#comment66109>

    


- Hyunsik Choi


On Feb. 25, 2014, 4:57 p.m., Jung JaeHwa wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17787/
> -----------------------------------------------------------
> 
> (Updated Feb. 25, 2014, 4:57 p.m.)
> 
> 
> Review request for Tajo.
> 
> 
> Bugs: TAJO-585
>     https://issues.apache.org/jira/browse/TAJO-585
> 
> 
> Repository: tajo
> 
> 
> Description
> -------
> 
> I found a bug in which TajoCli prints unexpected results for union queries.
> 
> First, TajoCli prints the wrong number of rows, as follows:
> {code:xml}
> tajo> select id from table1 union all select id from table2;
> result: 
> hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0004/RESULT, 
> 5 rows (21 B)
> id
> -------------------------------
> 6
> 7
> 8
> 9
> 10
> 1
> 2
> 3
> 4
> 5
> {code}
> 
> And if an empty table is in the last phase, it just prints zero rows, as follows:
> {code:xml}
> 
> tajo> select id from table1 union all select id from table3;
> result: 
> hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0005/RESULT, 
> 0 rows (10 B)
> id
> -------------------------------
> {code}
> 
> For reference, I created test tables as follows:
> {code:xml}
> CREATE EXTERNAL TABLE table1 (id INT4, name TEXT, score FLOAT4, type TEXT) 
> USING CSV WITH ('csvfile.delimiter'='|') LOCATION 
> 'hdfs://localhost:9010/tajo/warehouse/table1';
> 
> CREATE EXTERNAL TABLE table2 (id INT4, name TEXT, score FLOAT4, type TEXT) 
> USING CSV WITH ('csvfile.delimiter'='|') LOCATION 
> 'hdfs://localhost:9010/tajo/warehouse/table2';
> 
> CREATE EXTERNAL TABLE table3 (id INT4, name TEXT, score FLOAT4, type TEXT) 
> USING CSV WITH ('csvfile.delimiter'='|') LOCATION 
> 'hdfs://localhost:9010/tajo/warehouse/table3';
> 
> tajo> select * from table1;
> result: 
> hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0001/RESULT, 
> 5 rows (60 B)
> id,  name,  score,  type
> -------------------------------
> 1,  ooo,  1.1,  a
> 2,  ppp,  2.3,  b
> 3,  qqq,  3.4,  c
> 4,  rrr,  4.5,  d
> 5,  xxx,  5.6,  e
> 
> tajo> select * from table2;
> result: 
> hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0002/RESULT, 
> 5 rows (61 B)
> id,  name,  score,  type
> -------------------------------
> 6,  ooo,  1.1,  a
> 7,  ppp,  2.3,  b
> 8,  qqq,  3.4,  c
> 9,  rrr,  4.5,  d
> 10,  xxx,  5.6,  e
> 
> tajo> select * from table3;
> result: 
> hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0003/RESULT, 
> 0 rows (0 B)
> id,  name,  score,  type
> -------------------------------
> {code}
> 
> 
> Diffs
> -----
> 
>   tajo-client/src/main/java/org/apache/tajo/client/TajoClient.java d9c511e 
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/ExecutionBlock.java 7df6b43 
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java 461c5d5 
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Query.java 5fafe51 
>   tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/LocalTajoTestingUtility.java 144ca1b 
>   tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/QueryTestCaseBase.java e1a231a 
>   tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/TpchTestBase.java cb1805d 
>   tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/query/TestUnionQuery.java 22830bf 
>   tajo-core/tajo-core-backend/src/test/resources/queries/TestUnionQuery/testUnion11.sql PRE-CREATION 
>   tajo-core/tajo-core-backend/src/test/resources/queries/TestUnionQuery/testUnion12.sql PRE-CREATION 
>   tajo-core/tajo-core-backend/src/test/resources/results/TestUnionQuery/testUnion11.result PRE-CREATION 
>   tajo-core/tajo-core-backend/src/test/resources/results/TestUnionQuery/testUnion12.result PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/17787/diff/
> 
> 
> Testing
> -------
> 
> mvn clean install
> mvn clean install -Phcatalog-0.12.0
> 
> 
> Thanks,
> 
> Jung JaeHwa
> 
>
