Gopal V created HIVE-16976:
------------------------------

             Summary: DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
                 Key: HIVE-16976
                 URL: https://issues.apache.org/jira/browse/HIVE-16976
             Project: Hive
          Issue Type: Improvement
          Components: Tez
    Affects Versions: 2.1.1, 3.0.0
            Reporter: Gopal V

Tez DPP does not kick in for scenarios where a user wants to run a comparison clause instead of a JOIN/IN clause.

{code}
explain select count(1) from store_sales where ss_sold_date_sk > (select max(d_Date_sk) from date_dim where d_year = 2017);

Warning: Map Join MAPJOIN[21][bigTable=?] in task 'Map 1' is a cross product
OK
Plan optimized by CBO.

Vertex dependency in root stage
Map 1 <- Reducer 4 (BROADCAST_EDGE)
Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
Reducer 4 <- Map 3 (CUSTOM_SIMPLE_EDGE)

Stage-0
  Fetch Operator
    limit:-1
    Stage-1
      Reducer 2 vectorized, llap
      File Output Operator [FS_36]
        Group By Operator [GBY_35] (rows=1 width=8)
          Output:["_col0"],aggregations:["count(VALUE._col0)"]
        <-Map 1 [CUSTOM_SIMPLE_EDGE] vectorized, llap
          PARTITION_ONLY_SHUFFLE [RS_34]
            Group By Operator [GBY_33] (rows=1 width=8)
              Output:["_col0"],aggregations:["count(1)"]
              Select Operator [SEL_32] (rows=9600142089 width=16)
                Filter Operator [FIL_31] (rows=9600142089 width=16)
                  predicate:(_col0 > _col1)
                  Map Join Operator [MAPJOIN_30] (rows=28800426268 width=16)
                    Conds:(Inner),Output:["_col0","_col1"]
                  <-Reducer 4 [BROADCAST_EDGE] vectorized, llap
                    BROADCAST [RS_28]
                      Group By Operator [GBY_27] (rows=1 width=8)
                        Output:["_col0"],aggregations:["max(VALUE._col0)"]
                      <-Map 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap
                        PARTITION_ONLY_SHUFFLE [RS_26]
                          Group By Operator [GBY_25] (rows=1 width=8)
                            Output:["_col0"],aggregations:["max(d_date_sk)"]
                            Select Operator [SEL_24] (rows=652 width=12)
                              Output:["d_date_sk"]
                              Filter Operator [FIL_23] (rows=652 width=12)
                                predicate:(d_year = 2017)
                                TableScan [TS_2] (rows=73049 width=12)
                                  tpcds_bin_partitioned_newschema_orc_10000@date_dim,date_dim,Tbl:COMPLETE,Col:COMPLETE,Output:["d_date_sk","d_year"]
                  <-Select Operator [SEL_29] (rows=28800426268 width=8)
                      Output:["_col0"]
                      TableScan [TS_0] (rows=28800426268 width=172)
                        tpcds_bin_partitioned_newschema_orc_10000@store_sales,store_sales,Tbl:COMPLETE,Col:COMPLETE
{code}

The SyntheticJoinPredicate is only injected for equi joins, not for < or > scalar
subqueries.
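
The pruning this issue asks for can be illustrated with a small sketch. This is not Hive code: the function names, the operator table, and the sample key values are hypothetical, and it assumes store_sales is partitioned on ss_sold_date_sk with partition keys that can be compared as plain integers. The idea is that once the scalar subquery has produced its single value, the same value could be used to drop partitions that cannot satisfy the comparison before they are scanned.

```python
import operator

# Hypothetical mapping from SQL comparison operators to Python comparisons.
COMPARATORS = {
    "=": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def prune_partitions(partition_keys, op, bound):
    """Keep only the partition keys that can satisfy `key <op> bound`.

    `bound` stands in for the single value produced by the scalar
    subquery (for example max(d_date_sk) for d_year = 2017).
    """
    compare = COMPARATORS[op]
    return [key for key in partition_keys if compare(key, bound)]


def prune_between(partition_keys, low, high):
    """Keep only the partition keys inside the inclusive range [low, high]."""
    return [key for key in partition_keys if low <= key <= high]


if __name__ == "__main__":
    # Made-up partition keys and bound, for illustration only.
    keys = [100, 200, 300, 400, 500]
    print(prune_partitions(keys, ">", 300))
    print(prune_between(keys, 200, 400))
```

With an equality operator this reduces to the existing equi-join case; the comparison and BETWEEN variants are the ones the plan above does not get, which is why TS_0 reads every row of store_sales.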