[ https://issues.apache.org/jira/browse/HIVE-8747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mostafa Mokhtar updated HIVE-8747: ---------------------------------- Description: ship_mode table has 0 rows. Query {code} select count(*) from web_sales ,date_dim ,ship_mode where web_sales.ws_sold_date_sk = date_dim.d_date_sk and web_sales.ws_ship_mode_sk = ship_mode.sm_ship_mode_sk and d_year = 2002 and sm_carrier in ('DIAMOND','AIRBORNE') {code} Explain {code} STAGE PLANS: Stage: Stage-1 Tez Edges: Map 1 <- Map 4 (BROADCAST_EDGE) Map 4 <- Map 3 (BROADCAST_EDGE) Reducer 2 <- Map 1 (SIMPLE_EDGE) DagName: mmokhtar_20141105180404_59e6fb65-529f-4eaa-9446-7f34d12bffac:30 Vertices: Map 1 Map Operator Tree: TableScan alias: ship_mode filterExpr: ((sm_carrier) IN ('DIAMOND', 'AIRBORNE') and sm_ship_mode_sk is not null) (type: boolean) Statistics: Num rows: 0 Data size: 45 Basic stats: PARTIAL Column stats: COMPLETE Filter Operator predicate: ((sm_carrier) IN ('DIAMOND', 'AIRBORNE') and sm_ship_mode_sk is not null) (type: boolean) Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: sm_ship_mode_sk (type: int) outputColumnNames: _col0 Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: COMPLETE Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 1 keys: 0 _col1 (type: int) 1 _col0 (type: int) input vertices: 0 Map 4 Statistics: Num rows: 9223372036854775807 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE Select Operator Statistics: Num rows: 9223372036854775807 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator sort order: Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col0 (type: bigint) Execution mode: vectorized Map 3 Map Operator Tree: TableScan alias: date_dim filterExpr: ((d_year = 2002) and d_date_sk is not null) (type: boolean) Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ((d_year = 2002) and d_date_sk is not null) (type: boolean) Statistics: Num rows: 652 Data size: 5216 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: d_date_sk (type: int) outputColumnNames: _col0 Statistics: Num rows: 652 Data size: 2608 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 652 Data size: 2608 Basic stats: COMPLETE Column stats: COMPLETE Execution mode: vectorized Map 4 Map Operator Tree: TableScan alias: web_sales filterExpr: (ws_sold_date_sk is not null and ws_ship_mode_sk is not null) (type: boolean) Statistics: Num rows: 143966864 Data size: 19577477788 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (ws_sold_date_sk is not null and ws_ship_mode_sk is not null) (type: boolean) Statistics: Num rows: 143948856 Data size: 1151518824 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ws_sold_date_sk (type: int), ws_ship_mode_sk (type: int) outputColumnNames: _col0, _col1 Statistics: Num rows: 143948856 Data size: 1151518824 Basic stats: COMPLETE Column stats: COMPLETE Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {_col1} 1 keys: 0 _col0 (type: int) 1 _col0 (type: int) outputColumnNames: _col1 input vertices: 1 Map 3 Statistics: Num rows: 1284818 Data size: 5139272 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col1 (type: int) sort order: + Map-reduce partition columns: _col1 (type: int) Statistics: Num rows: 1284818 Data size: 5139272 Basic stats: COMPLETE Column stats: COMPLETE Execution mode: vectorized Reducer 2 Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0) mode: mergepartial outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: _col0 (type: bigint) outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE File Output Operator compressed: false Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Execution mode: vectorized Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink {code} was: Query {code} select count(*) from web_sales ,date_dim ,ship_mode where web_sales.ws_sold_date_sk = date_dim.d_date_sk and web_sales.ws_ship_mode_sk = ship_mode.sm_ship_mode_sk and d_year = 2002 and sm_carrier in ('DIAMOND','AIRBORNE') {code} Explain {code} STAGE PLANS: Stage: Stage-1 Tez Edges: Map 1 <- Map 4 (BROADCAST_EDGE) Map 4 <- Map 3 (BROADCAST_EDGE) Reducer 2 <- Map 1 (SIMPLE_EDGE) DagName: mmokhtar_20141105180404_59e6fb65-529f-4eaa-9446-7f34d12bffac:30 Vertices: Map 1 Map Operator Tree: TableScan alias: ship_mode filterExpr: ((sm_carrier) IN ('DIAMOND', 'AIRBORNE') and sm_ship_mode_sk is not null) (type: boolean) Statistics: Num rows: 0 Data size: 45 Basic stats: PARTIAL Column stats: COMPLETE Filter Operator predicate: ((sm_carrier) IN ('DIAMOND', 'AIRBORNE') and sm_ship_mode_sk is not null) (type: boolean) Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: sm_ship_mode_sk (type: int) outputColumnNames: _col0 Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: COMPLETE Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 1 keys: 0 _col1 (type: int) 1 _col0 (type: int) input vertices: 0 Map 4 Statistics: Num rows: 9223372036854775807 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE Select Operator Statistics: Num rows: 9223372036854775807 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator sort order: Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col0 (type: bigint) Execution mode: vectorized Map 3 Map Operator Tree: TableScan alias: date_dim filterExpr: ((d_year = 2002) and d_date_sk is not null) (type: boolean) Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ((d_year = 2002) and d_date_sk is not null) (type: boolean) Statistics: Num rows: 652 Data size: 5216 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: d_date_sk (type: int) outputColumnNames: _col0 Statistics: Num rows: 652 Data size: 2608 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 652 Data size: 2608 Basic stats: COMPLETE Column stats: COMPLETE Execution mode: vectorized Map 4 Map Operator Tree: TableScan alias: web_sales filterExpr: (ws_sold_date_sk is not null and ws_ship_mode_sk is not null) (type: boolean) Statistics: Num rows: 143966864 Data size: 19577477788 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (ws_sold_date_sk is not null and ws_ship_mode_sk is not null) (type: boolean) Statistics: Num rows: 143948856 Data size: 1151518824 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: ws_sold_date_sk (type: int), ws_ship_mode_sk (type: int) outputColumnNames: _col0, _col1 Statistics: Num rows: 143948856 Data size: 1151518824 Basic stats: COMPLETE Column stats: COMPLETE Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {_col1} 1 keys: 0 _col0 (type: int) 1 _col0 (type: int) outputColumnNames: _col1 input vertices: 1 Map 3 Statistics: Num rows: 1284818 Data size: 5139272 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col1 (type: int) sort order: + Map-reduce partition columns: _col1 (type: int) Statistics: Num rows: 1284818 Data size: 5139272 Basic stats: COMPLETE Column stats: COMPLETE Execution mode: vectorized Reducer 2 Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0) mode: mergepartial outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: _col0 (type: bigint) outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE File Output Operator compressed: false Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Execution mode: vectorized Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink {code} > Estimate number of rows for table with 0 rows overflows resulting in an > in-efficient plan > ------------------------------------------------------------------------------------------ > > Key: HIVE-8747 > URL: https://issues.apache.org/jira/browse/HIVE-8747 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer > Affects Versions: 0.14.0 > Reporter: Mostafa Mokhtar > Assignee: Prasanth J > Priority: Critical > Fix For: 0.14.0 > > > ship_mode table has 0 rows. > Query > {code} > select count(*) > from > web_sales > ,date_dim > ,ship_mode > where > web_sales.ws_sold_date_sk = date_dim.d_date_sk > and web_sales.ws_ship_mode_sk = ship_mode.sm_ship_mode_sk > and d_year = 2002 > and sm_carrier in ('DIAMOND','AIRBORNE') > {code} > Explain > {code} > STAGE PLANS: > Stage: Stage-1 > Tez > Edges: > Map 1 <- Map 4 (BROADCAST_EDGE) > Map 4 <- Map 3 (BROADCAST_EDGE) > Reducer 2 <- Map 1 (SIMPLE_EDGE) > DagName: mmokhtar_20141105180404_59e6fb65-529f-4eaa-9446-7f34d12bffac:30 > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: ship_mode > filterExpr: ((sm_carrier) IN ('DIAMOND', 'AIRBORNE') and > sm_ship_mode_sk is not null) (type: boolean) > Statistics: Num rows: 0 Data size: 45 Basic stats: PARTIAL > Column stats: COMPLETE > Filter Operator > predicate: ((sm_carrier) IN ('DIAMOND', 'AIRBORNE') and > sm_ship_mode_sk is not null) (type: boolean) > Statistics: Num rows: 9223372036854775807 Data size: > 9223372036854775807 Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: sm_ship_mode_sk (type: int) > outputColumnNames: _col0 > Statistics: Num rows: 9223372036854775807 Data size: > 9223372036854775807 Basic stats: COMPLETE Column stats: COMPLETE > Map Join Operator > condition map: > Inner Join 0 to 1 > condition expressions: > 0 > 1 > keys: > 0 _col1 (type: int) > 1 _col0 (type: int) > input vertices: > 0 Map 4 > Statistics: Num rows: 9223372036854775807 Data size: > 0 Basic stats: PARTIAL Column stats: COMPLETE > Select Operator > Statistics: Num rows: 9223372036854775807 Data > size: 0 Basic stats: PARTIAL Column stats: COMPLETE > Group By Operator > aggregations: count() > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE > Reduce Output Operator > sort order: > Statistics: Num rows: 1 Data size: 8 Basic > stats: COMPLETE Column stats: COMPLETE > value expressions: _col0 (type: bigint) > Execution mode: vectorized > Map 3 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2002) and d_date_sk is not null) > (type: boolean) > Statistics: Num rows: 73049 Data size: 81741831 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2002) and d_date_sk is not null) > (type: boolean) > Statistics: Num rows: 652 Data size: 5216 Basic stats: > COMPLETE Column stats: COMPLETE > Select Operator > expressions: d_date_sk (type: int) > outputColumnNames: _col0 > Statistics: Num rows: 652 Data size: 2608 Basic stats: > COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: int) > sort order: + > Map-reduce partition columns: _col0 (type: int) > Statistics: Num rows: 652 Data size: 2608 Basic > stats: COMPLETE Column stats: COMPLETE > Execution mode: vectorized > Map 4 > Map Operator Tree: > TableScan > alias: web_sales > filterExpr: (ws_sold_date_sk is not null and > ws_ship_mode_sk is not null) (type: boolean) > Statistics: Num rows: 143966864 Data size: 19577477788 > Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (ws_sold_date_sk is not null and > ws_ship_mode_sk is not null) (type: boolean) > Statistics: Num rows: 143948856 Data size: 1151518824 > Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: ws_sold_date_sk (type: int), > ws_ship_mode_sk (type: int) > outputColumnNames: _col0, _col1 > Statistics: Num rows: 143948856 Data size: 1151518824 > Basic stats: COMPLETE Column stats: COMPLETE > Map Join Operator > condition map: > Inner Join 0 to 1 > condition expressions: > 0 {_col1} > 1 > keys: > 0 _col0 (type: int) > 1 _col0 (type: int) > outputColumnNames: _col1 > input vertices: > 1 Map 3 > Statistics: Num rows: 1284818 Data size: 5139272 > Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col1 (type: int) > sort order: + > Map-reduce partition columns: _col1 (type: int) > Statistics: Num rows: 1284818 Data size: 5139272 > Basic stats: COMPLETE Column stats: COMPLETE > Execution mode: vectorized > Reducer 2 > Reduce Operator Tree: > Group By Operator > aggregations: count(VALUE._col0) > mode: mergepartial > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE > Column stats: COMPLETE > Select Operator > expressions: _col0 (type: bigint) > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE > Column stats: COMPLETE > File Output Operator > compressed: false > Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE > table: > input format: org.apache.hadoop.mapred.TextInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > serde: > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > Execution mode: vectorized > Stage: Stage-0 > Fetch Operator > limit: -1 > Processor Tree: > ListSink > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)