[ https://issues.apache.org/jira/browse/HIVE-13693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jesus Camacho Rodriguez updated HIVE-13693:
-------------------------------------------
    Status: Patch Available  (was: In Progress)

> Multi-insert query drops Filter before file output when there is a.val <> b.val
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-13693
>                 URL: https://issues.apache.org/jira/browse/HIVE-13693
>             Project: Hive
>          Issue Type: Bug
>          Components: Logical Optimizer
>    Affects Versions: 2.0.0, 1.3.0, 2.1.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: HIVE-13693.01.patch, HIVE-13693.01.patch, HIVE-13693.patch
>
>
> To reproduce:
> {noformat}
> CREATE TABLE T_A ( id STRING, val STRING );
> CREATE TABLE T_B ( id STRING, val STRING );
> CREATE TABLE join_result_1 ( ida STRING, vala STRING, idb STRING, valb STRING );
> CREATE TABLE join_result_3 ( ida STRING, vala STRING, idb STRING, valb STRING );
> INSERT INTO TABLE T_A
> VALUES ('Id_1', 'val_101'), ('Id_2', 'val_102'), ('Id_3', 'val_103');
> INSERT INTO TABLE T_B
> VALUES ('Id_1', 'val_103'), ('Id_2', 'val_104');
> explain
> FROM T_A a LEFT JOIN T_B b ON a.id = b.id
> INSERT OVERWRITE TABLE join_result_1
> SELECT a.*, b.*
> WHERE b.id = 'Id_1' AND b.val = 'val_103'
> INSERT OVERWRITE TABLE join_result_3
> SELECT a.*, b.*
> WHERE b.val = 'val_104' AND b.id = 'Id_2' AND a.val <> b.val;
> {noformat}
> The (wrong) plan is the following:
> {noformat}
> STAGE DEPENDENCIES:
>   Stage-2 is a root stage
>   Stage-3 depends on stages: Stage-2
>   Stage-0 depends on stages: Stage-3
>   Stage-4 depends on stages: Stage-0
>   Stage-1 depends on stages: Stage-3
>   Stage-5 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-2
>     Tez
>       DagId: haha_20160504140944_174465c9-5d1a-42f9-9665-fae02eeb2767:2
>       Edges:
>         Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 3 (SIMPLE_EDGE)
>       DagName:
>       Vertices:
>         Map 1
>           Map Operator Tree:
>             TableScan
>               alias: a
>               Statistics: Num rows: 3 Data size: 36 Basic stats: COMPLETE Column stats: NONE
>               Reduce Output Operator
>                 key expressions: id (type: string)
>                 sort order: +
>                 Map-reduce partition columns: id (type: string)
>                 Statistics: Num rows: 3 Data size: 36 Basic stats: COMPLETE Column stats: NONE
>                 value expressions: val (type: string)
>         Map 3
>           Map Operator Tree:
>             TableScan
>               alias: b
>               Statistics: Num rows: 2 Data size: 24 Basic stats: COMPLETE Column stats: NONE
>               Reduce Output Operator
>                 key expressions: id (type: string)
>                 sort order: +
>                 Map-reduce partition columns: id (type: string)
>                 Statistics: Num rows: 2 Data size: 24 Basic stats: COMPLETE Column stats: NONE
>                 value expressions: val (type: string)
>         Reducer 2
>           Reduce Operator Tree:
>             Merge Join Operator
>               condition map:
>                 Left Outer Join0 to 1
>               keys:
>                 0 id (type: string)
>                 1 id (type: string)
>               outputColumnNames: _col0, _col1, _col6
>               Statistics: Num rows: 3 Data size: 39 Basic stats: COMPLETE Column stats: NONE
>               Select Operator
>                 expressions: _col0 (type: string), _col1 (type: string), 'Id_1' (type: string), 'val_103' (type: string)
>                 outputColumnNames: _col0, _col1, _col2, _col3
>                 Statistics: Num rows: 3 Data size: 39 Basic stats: COMPLETE Column stats: NONE
>                 File Output Operator
>                   compressed: false
>                   Statistics: Num rows: 3 Data size: 39 Basic stats: COMPLETE Column stats: NONE
>                   table:
>                     input format: org.apache.hadoop.mapred.TextInputFormat
>                     output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                     serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                     name: bugtest2.join_result_1
>               Filter Operator
>                 predicate: (_col1 <> _col6) (type: boolean)
>                 Statistics: Num rows: 3 Data size: 39 Basic stats: COMPLETE Column stats: NONE
>                 Select Operator
>                   expressions: _col0 (type: string), _col1 (type: string), 'Id_2' (type: string), 'val_104' (type: string)
>                   outputColumnNames: _col0, _col1, _col2, _col3
>                   Statistics: Num rows: 3 Data size: 39 Basic stats: COMPLETE Column stats: NONE
>                   File Output Operator
>                     compressed: false
>                     Statistics: Num rows: 3 Data size: 39 Basic stats: COMPLETE Column stats: NONE
>                     table:
>                       input format: org.apache.hadoop.mapred.TextInputFormat
>                       output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                       serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                       name: bugtest2.join_result_3
>   Stage: Stage-3
>     Dependency Collection
>   Stage: Stage-0
>     Move Operator
>       tables:
>         replace: true
>         table:
>           input format: org.apache.hadoop.mapred.TextInputFormat
>           output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>           serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>           name: bugtest2.join_result_1
>   Stage: Stage-4
>     Stats-Aggr Operator
>   Stage: Stage-1
>     Move Operator
>       tables:
>         replace: true
>         table:
>           input format: org.apache.hadoop.mapred.TextInputFormat
>           output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>           serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>           name: bugtest2.join_result_3
>   Stage: Stage-5
>     Stats-Aggr Operator
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
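
To make the effect of the plan in this report concrete, here is a rough Python sketch that models both insert branches over the sample rows from the reproduction script. It is not Hive code and does not run the optimizer; it only mimics the operator tree as printed (no Filter Operator before the first File Output Operator, only `_col1 <> _col6` before the second, constants substituted for `b.id` and `b.val`). The helper names `left_join` and `ne` are hypothetical, and `None` stands in for SQL NULL.

```python
# Sample rows from the reproduction script.
T_A = [("Id_1", "val_101"), ("Id_2", "val_102"), ("Id_3", "val_103")]
T_B = [("Id_1", "val_103"), ("Id_2", "val_104")]


def left_join(left, right):
    """LEFT JOIN on id; unmatched left rows get (None, None) for the right side."""
    out = []
    for a_id, a_val in left:
        matches = [(b_id, b_val) for b_id, b_val in right if b_id == a_id]
        for b_id, b_val in matches or [(None, None)]:
            out.append((a_id, a_val, b_id, b_val))
    return out


def ne(x, y):
    """SQL-style <>: a comparison involving NULL is not true, so the row is dropped."""
    return x is not None and y is not None and x != y


joined = left_join(T_A, T_B)

# What the query asks for: each branch applies its full WHERE clause.
expected_1 = [r for r in joined if r[2] == "Id_1" and r[3] == "val_103"]
expected_3 = [r for r in joined
              if r[3] == "val_104" and r[2] == "Id_2" and ne(r[1], r[3])]

# What the reported plan describes: branch 1 has no filter at all, branch 3
# keeps only a.val <> b.val, and both selects emit constants for b.id, b.val.
planned_1 = [(r[0], r[1], "Id_1", "val_103") for r in joined]
planned_3 = [(r[0], r[1], "Id_2", "val_104") for r in joined if ne(r[1], r[3])]

print("join_result_1 expected:", expected_1)
print("join_result_1 planned: ", planned_1)
print("join_result_3 expected:", expected_3)
print("join_result_3 planned: ", planned_3)
```

Under this model each target table should receive one row, while the plan as printed would write three rows to `join_result_1` and two to `join_result_3`, with the `b` columns overwritten by the folded constants.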