[ 
https://issues.apache.org/jira/browse/TAJO-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340095#comment-14340095
 ] 

Jihoon Son commented on TAJO-1361:
----------------------------------

Here is Hive's execution plan for the same query.
{noformat}
hive> explain select * from                      
    > (select * from test_a  where status='regist')a 
    > left outer join ( select * from test_a where status='start')b 
    > on a.id=b.id and a.id_detail =b.id_detail 
    > where b.id is null and b.id_detail is null;
OK
STAGE DEPENDENCIES:
  Stage-4 is a root stage
  Stage-3 depends on stages: Stage-4
  Stage-0 depends on stages: Stage-3

STAGE PLANS:
  Stage: Stage-4
    Map Reduce Local Work
      Alias -> Map Local Tables:
        b:test_a 
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        b:test_a 
          TableScan
            alias: test_a
            Statistics: Num rows: 2 Data size: 32 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: (status = 'start') (type: boolean)
              Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                expressions: id (type: int), id_detail (type: int), 'start' (type: string)
                outputColumnNames: _col0, _col1, _col2
                Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
                HashTable Sink Operator
                  condition expressions:
                    0 {_col0} {_col1} {_col2}
                    1 {_col2}
                  keys:
                    0 _col0 (type: int), _col1 (type: int)
                    1 _col0 (type: int), _col1 (type: int)

  Stage: Stage-3
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: test_a
            Statistics: Num rows: 2 Data size: 32 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: (status = 'regist') (type: boolean)
              Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                expressions: id (type: int), id_detail (type: int), 'regist' (type: string)
                outputColumnNames: _col0, _col1, _col2
                Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
                Map Join Operator
                  condition map:
                       Left Outer Join0 to 1
                  condition expressions:
                    0 {_col0} {_col1} {_col2}
                    1 {_col0} {_col1} {_col2}
                  keys:
                    0 _col0 (type: int), _col1 (type: int)
                    1 _col0 (type: int), _col1 (type: int)
                  outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
                  Statistics: Num rows: 1 Data size: 17 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: (_col3 is null and _col4 is null) (type: boolean)
                    Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
                    Select Operator
                      expressions: _col0 (type: int), _col1 (type: int), _col2 (type: string), null (type: void), null (type: void), _col5 (type: string)
                      outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
                      Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
                      File Output Operator
                        compressed: false
                        Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
                        table:
                            input format: org.apache.hadoop.mapred.TextInputFormat
                            output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                            serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
      Local Work:
        Map Reduce Local Work

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
{noformat}
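
In the plan above, the Filter Operator on (_col3 is null and _col4 is null) sits below the Map Join Operator, so Hive evaluates the IS NULL predicate on the join output, after unmatched left rows have been padded with NULLs. The following is only a rough sketch of that evaluation order in plain Python, not Hive or Tajo code. The function name and the in-memory row list are assumptions for illustration; the rows are the three lines of /test/test.tbl from the issue description.

```python
from collections import defaultdict

# Rows of test_a taken from the issue description: (id, id_detail, status).
TEST_A = [(1, 1, "regist"), (1, 2, "regist"), (1, 1, "start")]


def left_outer_join_then_filter(rows):
    """Hypothetical helper: hash join first, IS NULL filter afterwards."""
    # Stage-4 analogue: scan b, keep status = 'start',
    # and build a hash table keyed on (id, id_detail).
    build = defaultdict(list)
    for r in rows:
        if r[2] == "start":
            build[(r[0], r[1])].append(r)

    # Stage-3 analogue: scan a, keep status = 'regist', probe the hash table.
    joined = []
    for r in rows:
        if r[2] != "regist":
            continue
        matches = build.get((r[0], r[1]))
        if matches:
            joined.extend(r + m for m in matches)
        else:
            # Left outer join: an unmatched left row is padded with NULLs.
            joined.append(r + (None, None, None))

    # The WHERE clause is applied to the join output, after NULL padding.
    return [row for row in joined if row[3] is None and row[4] is None]


result = left_outer_join_then_filter(TEST_A)
print(result)
```

With this ordering, only the row (1, 2, regist) survives, because the row (1, 1, regist) finds a match on the right side and therefore has non-NULL b.id and b.id_detail.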

> Unexpected outer join behaviours
> --------------------------------
>
>                 Key: TAJO-1361
>                 URL: https://issues.apache.org/jira/browse/TAJO-1361
>             Project: Tajo
>          Issue Type: Bug
>          Components: planner/optimizer
>            Reporter: Jihoon Son
>            Assignee: Jihoon Son
>            Priority: Critical
>
> This bug was reported on the Apache Tajo Korea User Group
> (https://groups.google.com/forum/#!topic/tajo-user-kr/srFllmbThG0).
> The bug can be reproduced as follows.
> {noformat}
> default> \dfs -cat /test/test.tbl
> 1,1,regist
> 1,2,regist
> 1,1,start
> default> create external table test_a ( id int , id_detail int , status text) using text with ('csvfile.delimiter'=',') location '/test';
> OK
> default> select * from 
> > (select * from test_a  where status='regist')a 
> > left outer join ( select * from test_a where status='start')b 
> > on a.id=b.id and a.id_detail =b.id_detail 
> > where b.id is null and b.id_detail is null;
> Progress: 100%, response time: 1.57 sec
> id,  id_detail,  status,  id,  id_detail,  status
> -------------------------------
> 1,  1,  regist,  1,  1,  start
> 1,  2,  regist,  ,  ,  
> (2 rows, 1.57 sec, 37 B selected)
> {noformat}
> The expected query result is:
> {noformat}
> id,  id_detail,  status,  id,  id_detail,  status
> 1,  2,  regist,  ,  ,  
> {noformat}
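
The expected result quoted above can be cross-checked against another SQL engine. The snippet below is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in (an assumption for illustration; it is neither Tajo nor Hive, and the in-memory table replaces the external table over /test). It loads the same three rows and runs the same query.

```python
import sqlite3

# In-memory stand-in for the external table test_a (assumption: SQLite
# is used here only as an independent reference engine).
conn = sqlite3.connect(":memory:")
conn.execute("create table test_a (id int, id_detail int, status text)")
conn.executemany(
    "insert into test_a values (?, ?, ?)",
    [(1, 1, "regist"), (1, 2, "regist"), (1, 1, "start")],
)

# The query from the issue description, unchanged apart from whitespace.
QUERY = """
select * from
  (select * from test_a where status = 'regist') a
  left outer join (select * from test_a where status = 'start') b
  on a.id = b.id and a.id_detail = b.id_detail
where b.id is null and b.id_detail is null
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
conn.close()
```

SQLite returns a single row, (1, 2, 'regist', NULL, NULL, NULL), which agrees with the expected result in the issue description.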



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
