-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18764/#review36233
-----------------------------------------------------------
Thank you for the bug fix. The patch looks good to me.
However, I think it needs more unit tests, because there are more cases
than we might expect.
For this kind of bug, I usually test the following cases:
case 1. some shuffle-required operator (e.g., groupby, join, sort) and then groupby
case 2. some shuffle-required operator and then orderby
case 3. some shuffle-required operator and then outer union
case 4. consecutive shuffle-required operators; e.g., groupby and outer groupby
Currently, you have added a test for left outer join followed by sort. It would
be great if the patch also covered the following combinations:
Case 1: empty left outer join and then groupby
select
  c_custkey,
  sum(empty_orders.o_orderkey),
  sum(empty_orders.o_orderstatus),
  sum(empty_orders.o_orderdate)
from
  customer left outer join empty_orders on c_custkey = o_orderkey
group by
  c_custkey
Case 2: empty left outer join and groupby, and outer groupby
select count(*) from (
  select
    c_custkey,
    sum(empty_orders.o_orderkey) as total1,
    sum(empty_orders.o_orderstatus) as total2,
    sum(empty_orders.o_orderdate) as total3
  from
    customer left outer join empty_orders on c_custkey = o_orderkey
  group by
    c_custkey
) t1
group by
  c_custkey
Case 3: empty left outer join and union
select * from (
  select
    c_custkey,
    sum(empty_orders.o_orderkey),
    sum(empty_orders.o_orderstatus),
    sum(empty_orders.o_orderdate)
  from
    customer left outer join empty_orders on c_custkey = o_orderkey
  group by
    c_custkey
  union
  select
    c_custkey,
    sum(empty_orders.o_orderkey),
    sum(empty_orders.o_orderstatus),
    sum(empty_orders.o_orderdate)
  from
    customer left outer join empty_orders on c_custkey = o_orderkey
  group by
    c_custkey
) t1
Case 4: right empty outer join and then order by
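
As a rough way to work out the expected rows for the .result files, the shapes above can be run against an in-memory SQLite database. This is only a sketch under several assumptions: the cut-down `customer` and `empty_orders` schemas, the sample rows, and the helper name `run_empty_outer_join_cases` are all hypothetical, "right empty" is read as the empty table sitting on the right side of a left outer join, and SQLite's NULL handling may not match Tajo's in every detail.

```python
import sqlite3


def run_empty_outer_join_cases():
    """Run three outer-join shapes where the right-hand table is empty."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    # Hypothetical, cut-down schemas; not the real Tajo test tables.
    cur.execute("CREATE TABLE customer (c_custkey INTEGER, c_name TEXT)")
    cur.execute("CREATE TABLE empty_orders (o_orderkey INTEGER, o_totalprice REAL)")
    cur.executemany(
        "INSERT INTO customer VALUES (?, ?)",
        [(3, "c3"), (1, "c1"), (2, "c2")],
    )

    join = "customer LEFT OUTER JOIN empty_orders ON c_custkey = o_orderkey"
    queries = {
        # Outer join followed by a group-by (Case 1 shape).
        "join_then_groupby": (
            f"SELECT c_custkey, sum(o_orderkey) FROM {join} "
            "GROUP BY c_custkey ORDER BY c_custkey"
        ),
        # Outer join, group-by, then an outer aggregation (Case 2 shape,
        # without the outer group-by so a single count comes back).
        "join_groupby_then_count": (
            "SELECT count(*) FROM ("
            f"SELECT c_custkey, sum(o_orderkey) AS total1 FROM {join} "
            "GROUP BY c_custkey) t1"
        ),
        # Outer join followed by an order-by (one reading of Case 4).
        "join_then_orderby": (
            f"SELECT c_custkey, o_orderkey FROM {join} ORDER BY c_custkey DESC"
        ),
    }
    results = {name: cur.execute(sql).fetchall() for name, sql in queries.items()}
    conn.close()
    return results


if __name__ == "__main__":
    for name, rows in run_empty_outer_join_cases().items():
        print(name, rows)
```

In each shape every customer row should survive the join, with NULL in the columns that come from the empty table; that is the property the new tests would need to assert on the Tajo side.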
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/logical/ShuffleFileWriteNode.java
<https://reviews.apache.org/r/18764/#comment67173>
You probably meant "zero" rather than "zeo".
- Hyunsik Choi
On March 5, 2014, 5:39 p.m., Jung JaeHwa wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18764/
> -----------------------------------------------------------
>
> (Updated March 5, 2014, 5:39 p.m.)
>
>
> Review request for Tajo.
>
>
> Bugs: TAJO-427
> https://issues.apache.org/jira/browse/TAJO-427
>
>
> Repository: tajo
>
>
> Description
> -------
>
> If an empty table is used in a LEFT OUTER JOIN clause, it causes an
> IndexOutOfBoundsException as follows:
> {code:xml}
> tajo> select * from table1;
> Progress: 100%, response time: 0.146 sec
> final state: QUERY_SUCCEEDED, response time: 0.146 sec
> result:
> hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1387348492386_0013/RESULT, 5 rows (60 B)
> id, name, score, type
> -------------------------------
> 1, ooo, 1.1, a
> 2, ppp, 2.3, b
> 3, qqq, 3.4, c
> 4, rrr, 4.5, d
> 5, xxx, 5.6, e
> tajo> select * from table3;
> Progress: 100%, response time: 0.053 sec
> final state: QUERY_SUCCEEDED, response time: 0.053 sec
> result:
> hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1387348492386_0014/RESULT, 0 rows (0 B)
> id, name, score, type
> -------------------------------
> tajo> select a.id, a.name, a.score, case when b.name is null then 'zzz' else
> b.name end as name2 from table1 a left outer join table3 b on a.id = b.id;
> Internal error!
> {code}
>
> The Tajo master logs the following:
> {code:xml}
>
> 2013-12-18 15:39:25,678 INFO service.AbstractService (AbstractService.java:start(94)) - Service:org.apache.tajo.worker.AbstractResourceAllocator is started.
> 2013-12-18 15:39:25,678 INFO service.AbstractService (AbstractService.java:start(94)) - Service:org.apache.tajo.master.TajoAsyncDispatcher is started.
> 2013-12-18 15:39:25,679 INFO master.TajoAsyncDispatcher (TajoAsyncDispatcher.java:start(101)) - AsyncDispatcher started:q_1387348492386_0015
> 2013-12-18 15:39:25,679 INFO service.AbstractService (AbstractService.java:start(94)) - Service:org.apache.tajo.master.querymaster.QueryMasterTask is started.
> 2013-12-18 15:39:25,679 INFO querymaster.Query (Query.java:handle(452)) - Processing q_1387348492386_0015 of type START
> 2013-12-18 15:39:25,682 INFO storage.AbstractStorageManager (AbstractStorageManager.java:listStatus(384)) - Total input paths to process : 0
> 2013-12-18 15:39:25,682 INFO storage.AbstractStorageManager (AbstractStorageManager.java:getSplits(612)) - Total # of splits: 0
> 2013-12-18 15:39:25,682 ERROR querymaster.SubQuery (SubQuery.java:transition(529)) - SubQuery (eb_1387348492386_0015_000003) ERROR:
> java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
>     at java.util.ArrayList.RangeCheck(ArrayList.java:547)
>     at java.util.ArrayList.get(ArrayList.java:322)
>     at org.apache.tajo.master.querymaster.Repartitioner.createJoinTasks(Repartitioner.java:96)
>     at org.apache.tajo.master.querymaster.SubQuery$InitAndRequestContainer.createTasks(SubQuery.java:663)
>     at org.apache.tajo.master.querymaster.SubQuery$InitAndRequestContainer.transition(SubQuery.java:517)
>     at org.apache.tajo.master.querymaster.SubQuery$InitAndRequestContainer.transition(SubQuery.java:499)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:382)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:299)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
>     at org.apache.tajo.master.querymaster.SubQuery.handle(SubQuery.java:476)
>     at org.apache.tajo.master.querymaster.Query$StartTransition.transition(Query.java:288)
>     at org.apache.tajo.master.querymaster.Query$StartTransition.transition(Query.java:277)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:359)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:299)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
>     at org.apache.tajo.master.querymaster.Query.handle(Query.java:457)
>     at org.apache.tajo.master.querymaster.Query.handle(Query.java:54)
>     at org.apache.tajo.master.TajoAsyncDispatcher.dispatch(TajoAsyncDispatcher.java:137)
>     at org.apache.tajo.master.TajoAsyncDispatcher$1.run(TajoAsyncDispatcher.java:79)
>     at java.lang.Thread.run(Thread.java:680)
> {code}
>
>
> Diffs
> -----
>
>
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/logical/ShuffleFileWriteNode.java
> 5399357
>
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/utils/TupleUtil.java
> 6d801dd
>
> tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/query/TestJoinQuery.java
> 0e925f1
>
> tajo-core/tajo-core-backend/src/test/resources/queries/TestJoinQuery/testLeftOuterJoin1.sql
> f946e1d
>
> tajo-core/tajo-core-backend/src/test/resources/queries/TestJoinQuery/testLeftOuterJoinWithEmptyTable.sql
> PRE-CREATION
>
> tajo-core/tajo-core-backend/src/test/resources/results/TestJoinQuery/testLeftOuterJoin1.result
> 81dc055
>
> tajo-core/tajo-core-backend/src/test/resources/results/TestJoinQuery/testLeftOuterJoinWithEmptyTable.result
> PRE-CREATION
>
> Diff: https://reviews.apache.org/r/18764/diff/
>
>
> Testing
> -------
>
> mvn clean install
>
>
> Thanks,
>
> Jung JaeHwa
>
>