Chao created HIVE-8908:
--------------------------

             Summary: Investigate test failure on join34.q
                 Key: HIVE-8908
                 URL: https://issues.apache.org/jira/browse/HIVE-8908
             Project: Hive
          Issue Type: Sub-task
          Components: Spark
    Affects Versions: spark-branch
            Reporter: Chao
            Assignee: Chao


For this query, the plan doesn't look correct:

{noformat}
OK
STAGE DEPENDENCIES:
  Stage-4 is a root stage
  Stage-1 depends on stages: Stage-5, Stage-4
  Stage-2 depends on stages: Stage-1
  Stage-0 depends on stages: Stage-2
  Stage-3 depends on stages: Stage-0
  Stage-5 is a root stage

STAGE PLANS:
  Stage: Stage-4
    Spark
      DagName: chao_20141118150101_a47a2d7b-e750-4764-be66-5ba95ebbe433:6
      Vertices:
        Map 4 
            Map Operator Tree:
                TableScan
                  alias: x
                  Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: key is not null (type: boolean)
                    Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
                    Spark HashTable Sink Operator
                      condition expressions:
                        0 {_col1}
                        1 {value}
                      keys:
                        0 _col0 (type: string)
                        1 key (type: string)
                    Reduce Output Operator
                      key expressions: key (type: string)
                      sort order: +
                      Map-reduce partition columns: key (type: string)
                      Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
                      value expressions: value (type: string)
            Local Work:
              Map Reduce Local Work

  Stage: Stage-1
    Spark
      Edges:
        Union 2 <- Map 1 (NONE, 0), Map 3 (NONE, 0)
      DagName: chao_20141118150101_a47a2d7b-e750-4764-be66-5ba95ebbe433:4
      Vertices:
        Map 1 
            Map Operator Tree:
                TableScan
                  alias: x
                  Filter Operator
                    predicate: (key < 20) (type: boolean)
                    Select Operator
                      expressions: key (type: string), value (type: string)
                      outputColumnNames: _col0, _col1
                      Map Join Operator
                        condition map:
                             Inner Join 0 to 1
                        condition expressions:
                          0 {_col1}
                          1 {key} {value}
                        keys:
                          0 _col0 (type: string)
                          1 key (type: string)
                        outputColumnNames: _col1, _col2, _col3
                        input vertices:
                          1 Map 4
                        Select Operator
                          expressions: _col2 (type: string), _col3 (type: string), _col1 (type: string)
                          outputColumnNames: _col0, _col1, _col2
                          File Output Operator
                            compressed: false
                            table:
                                input format: org.apache.hadoop.mapred.TextInputFormat
                                output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                                serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                                name: default.dest_j1
            Local Work:
              Map Reduce Local Work
        Map 3 
            Map Operator Tree:
                TableScan
                  alias: x1
                  Filter Operator
                    predicate: (key > 100) (type: boolean)
                    Select Operator
                      expressions: key (type: string), value (type: string)
                      outputColumnNames: _col0, _col1
                      Map Join Operator
                        condition map:
                             Inner Join 0 to 1
                        condition expressions:
                          0 {_col1}
                          1 {key} {value}
                        keys:
                          0 _col0 (type: string)
                          1 key (type: string)
                        outputColumnNames: _col1, _col2, _col3
                        input vertices:
                          1 Map 4
                        Select Operator
                          expressions: _col2 (type: string), _col3 (type: string), _col1 (type: string)
                          outputColumnNames: _col0, _col1, _col2
                          File Output Operator
                            compressed: false
                            table:
                                input format: org.apache.hadoop.mapred.TextInputFormat
                                output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                                serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                                name: default.dest_j1
            Local Work:
              Map Reduce Local Work
        Union 2 
            Vertex: Union 2

  Stage: Stage-2
    Dependency Collection

  Stage: Stage-0
    Move Operator
      tables:
          replace: true
          table:
              input format: org.apache.hadoop.mapred.TextInputFormat
              output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
              serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
              name: default.dest_j1

  Stage: Stage-3
    Stats-Aggr Operator

  Stage: Stage-5
    Spark
      DagName: chao_20141118150101_a47a2d7b-e750-4764-be66-5ba95ebbe433:5
      Vertices:
        Map 4 
            Map Operator Tree:
                TableScan
                  alias: x
                  Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: key is not null (type: boolean)
                    Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
                    Spark HashTable Sink Operator
                      condition expressions:
                        0 {_col1}
                        1 {value}
                      keys:
                        0 _col0 (type: string)
                        1 key (type: string)
                    Reduce Output Operator
                      key expressions: key (type: string)
                      sort order: +
                      Map-reduce partition columns: key (type: string)
                      Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
                      value expressions: value (type: string)
            Local Work:
              Map Reduce Local Work

Time taken: 0.127 seconds, Fetched: 156 row(s)
{noformat}

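Note that Stage-4 and Stage-5 are identical apart from the DagName suffix.
Also, in Stage-4 a Reduce Output (RS) operator sits in parallel with the
Spark HashTable Sink (HTS) operator under the same Filter Operator, which
is strange.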
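
The duplication can also be checked mechanically. Below is a minimal
sketch, assuming the EXPLAIN output has been saved as plain text; the
helper names split_stages and duplicate_stages are hypothetical and not
part of Hive. It compares the body of each stage after dropping the
DagName line, which always differs between stages.

```python
import re


def split_stages(plan_text):
    """Map each 'Stage-N' name to its body lines under STAGE PLANS."""
    stages = {}
    current = None
    in_plans = False
    for line in plan_text.splitlines():
        stripped = line.strip()
        if stripped == "STAGE PLANS:":
            in_plans = True
            continue
        if not in_plans:
            continue
        # The CLI appends a timing line after the plan; stop there.
        if stripped.startswith("Time taken:"):
            break
        match = re.match(r"\s*Stage: (Stage-\d+)\s*$", line)
        if match:
            current = match.group(1)
            stages[current] = []
        elif current is not None and stripped:
            # DagName carries a per-stage suffix, so ignore it.
            if stripped.startswith("DagName:"):
                continue
            stages[current].append(stripped)
    return stages


def duplicate_stages(plan_text):
    """Return pairs of stage names whose bodies are identical."""
    stages = split_stages(plan_text)
    names = sorted(stages)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if stages[a] == stages[b]
    ]
```

Calling duplicate_stages on the text between the {noformat} markers
returns a list of stage-name pairs; an empty list means no two stages
share the same body.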



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
