[
https://issues.apache.org/jira/browse/HIVE-8260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148573#comment-14148573
]
Gunther Hagleitner commented on HIVE-8260:
------------------------------------------
[~jpullokkaran] have you seen this before?
> CBO : Query has date_dim d1, date_dim d2 and date_dim d3 but the explain
> has d1, d1 and d1
> ------------------------------------------------------------------------------------------------
>
> Key: HIVE-8260
> URL: https://issues.apache.org/jira/browse/HIVE-8260
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 0.14.0
> Reporter: Mostafa Mokhtar
> Assignee: Laljo John Pullokkaran
> Fix For: 0.14.0
>
>
> TPC-DS Q64 joins date_dim three times under the aliases d1, d2 and d3, but
> the explain output reports the alias d1 for all three table scans.
> This is a simplified version of query 64 that demonstrates the same issue:
> {code}
> SELECT count(*)
> FROM store_sales
> JOIN store_returns ON store_sales.ss_item_sk = store_returns.sr_item_sk
>   AND store_sales.ss_ticket_number = store_returns.sr_ticket_number
> JOIN customer ON store_sales.ss_customer_sk = customer.c_customer_sk
> JOIN date_dim d1 ON store_sales.ss_sold_date_sk = d1.d_date_sk
> JOIN date_dim d2 ON customer.c_first_sales_date_sk = d2.d_date_sk
> JOIN date_dim d3 ON customer.c_first_shipto_date_sk = d3.d_date_sk
> JOIN store ON store_sales.ss_store_sk = store.s_store_sk
> JOIN customer_demographics cd1 ON store_sales.ss_cdemo_sk = cd1.cd_demo_sk
> JOIN customer_demographics cd2 ON customer.c_current_cdemo_sk = cd2.cd_demo_sk
> JOIN promotion ON store_sales.ss_promo_sk = promotion.p_promo_sk
> JOIN household_demographics hd1 ON store_sales.ss_hdemo_sk = hd1.hd_demo_sk
> JOIN household_demographics hd2 ON customer.c_current_hdemo_sk = hd2.hd_demo_sk
> JOIN customer_address ad1 ON store_sales.ss_addr_sk = ad1.ca_address_sk
> JOIN customer_address ad2 ON customer.c_current_addr_sk = ad2.ca_address_sk
> JOIN income_band ib1 ON hd1.hd_income_band_sk = ib1.ib_income_band_sk
> JOIN income_band ib2 ON hd2.hd_income_band_sk = ib2.ib_income_band_sk
> JOIN item ON store_sales.ss_item_sk = item.i_item_sk
> {code}
> The generated plan:
> {code}
> STAGE PLANS:
> Stage: Stage-1
> Tez
> Edges:
> Map 13 <- Map 10 (BROADCAST_EDGE), Map 11 (BROADCAST_EDGE), Map 12
> (BROADCAST_EDGE), Map 15 (BROADCAST_EDGE), Map 16 (BROADCAST_EDGE), Map 18
> (BROADCAST_EDGE), Map 2 (BROADCAST_EDGE), Map 3 (BROADCAST_EDGE), Map 8
> (BROADCAST_EDGE)
> Map 16 <- Map 7 (BROADCAST_EDGE)
> Map 18 <- Map 1 (BROADCAST_EDGE), Map 17 (BROADCAST_EDGE), Map 4
> (BROADCAST_EDGE), Map 5 (BROADCAST_EDGE), Map 9 (BROADCAST_EDGE)
> Map 5 <- Map 6 (BROADCAST_EDGE)
> Reducer 14 <- Map 13 (SIMPLE_EDGE)
> DagName: mmokhtar_20140925180101_9c3b1d6b-61d3-44bc-a881-2beaf2ab143f:2
> Vertices:
> Map 1
> Map Operator Tree:
> TableScan
> alias: cd1
> filterExpr: cd_demo_sk is not null (type: boolean)
> Statistics: Num rows: 1920800 Data size: 718379200 Basic
> stats: COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: cd_demo_sk is not null (type: boolean)
> Statistics: Num rows: 1920800 Data size: 7683200 Basic
> stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: cd_demo_sk (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 1920800 Data size: 7683200 Basic
> stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 1920800 Data size: 7683200
> Basic stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized
> Map 10
> Map Operator Tree:
> TableScan
> alias: item
> filterExpr: i_item_sk is not null (type: boolean)
> Statistics: Num rows: 48000 Data size: 68732712 Basic
> stats: COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: i_item_sk is not null (type: boolean)
> Statistics: Num rows: 48000 Data size: 192000 Basic
> stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: i_item_sk (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 48000 Data size: 192000 Basic
> stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 48000 Data size: 192000 Basic
> stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized
> Map 11
> Map Operator Tree:
> TableScan
> alias: promotion
> filterExpr: p_promo_sk is not null (type: boolean)
> Statistics: Num rows: 450 Data size: 530848 Basic stats:
> COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: p_promo_sk is not null (type: boolean)
> Statistics: Num rows: 450 Data size: 1800 Basic stats:
> COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: p_promo_sk (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 450 Data size: 1800 Basic stats:
> COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 450 Data size: 1800 Basic
> stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized
> Map 12
> Map Operator Tree:
> TableScan
> alias: cd1
> filterExpr: cd_demo_sk is not null (type: boolean)
> Statistics: Num rows: 1920800 Data size: 718379200 Basic
> stats: COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: cd_demo_sk is not null (type: boolean)
> Statistics: Num rows: 1920800 Data size: 7683200 Basic
> stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: cd_demo_sk (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 1920800 Data size: 7683200 Basic
> stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 1920800 Data size: 7683200
> Basic stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized
> Map 13
> Map Operator Tree:
> TableScan
> alias: store_sales
> filterExpr: ((((((((ss_hdemo_sk is not null and ss_item_sk
> is not null) and ss_cdemo_sk is not null) and ss_sold_date_sk is not null)
> and ss_addr_sk is not null) and ss_store_sk is not null) and ss_promo_sk is
> not null) and ss_customer_sk is not null) and ss_ticket_number is not null)
> (type: boolean)
> Statistics: Num rows: 550076554 Data size: 24008004411
> Basic stats: COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: ((((((((ss_hdemo_sk is not null and ss_item_sk
> is not null) and ss_cdemo_sk is not null) and ss_sold_date_sk is not null)
> and ss_addr_sk is not null) and ss_store_sk is not null) and ss_promo_sk is
> not null) and ss_customer_sk is not null) and ss_ticket_number is not null)
> (type: boolean)
> Statistics: Num rows: 476766966 Data size: 16894069044
> Basic stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: ss_sold_date_sk (type: int), ss_item_sk
> (type: int), ss_customer_sk (type: int), ss_cdemo_sk (type: int), ss_hdemo_sk
> (type: int), ss_addr_sk (type: int), ss_store_sk (type: int), ss_promo_sk
> (type: int), ss_ticket_number (type: int)
> outputColumnNames: _col0, _col1, _col2, _col3, _col4,
> _col5, _col6, _col7, _col8
> Statistics: Num rows: 476766966 Data size: 16894069044
> Basic stats: COMPLETE Column stats: COMPLETE
> Map Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0 {_col0} {_col1} {_col2} {_col3} {_col5} {_col6}
> {_col7} {_col8}
> 1
> keys:
> 0 _col4 (type: int)
> 1 _col0 (type: int)
> outputColumnNames: _col0, _col1, _col2, _col3, _col5,
> _col6, _col7, _col8
> input vertices:
> 1 Map 16
> Statistics: Num rows: 410166225 Data size:
> 13125319200 Basic stats: COMPLETE Column stats: COMPLETE
> Map Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0 {_col0} {_col1} {_col2} {_col3} {_col5} {_col6}
> {_col7} {_col8}
> 1
> keys:
> 0 _col1 (type: int)
> 1 _col0 (type: int)
> outputColumnNames: _col0, _col1, _col2, _col3,
> _col5, _col6, _col7, _col8
> input vertices:
> 1 Map 10
> Statistics: Num rows: 314695482 Data size:
> 10070255424 Basic stats: COMPLETE Column stats: COMPLETE
> Map Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0 {_col0} {_col1} {_col2} {_col5} {_col6}
> {_col7} {_col8}
> 1
> keys:
> 0 _col3 (type: int)
> 1 _col0 (type: int)
> outputColumnNames: _col0, _col1, _col2, _col5,
> _col6, _col7, _col8
> input vertices:
> 1 Map 12
> Statistics: Num rows: 329259309 Data size:
> 9219260652 Basic stats: COMPLETE Column stats: COMPLETE
> Map Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0 {_col1} {_col2} {_col5} {_col6} {_col7}
> {_col8}
> 1
> keys:
> 0 _col0 (type: int)
> 1 _col0 (type: int)
> outputColumnNames: _col1, _col2, _col5, _col6,
> _col7, _col8
> input vertices:
> 1 Map 3
> Statistics: Num rows: 368151338 Data size:
> 8835632112 Basic stats: COMPLETE Column stats: COMPLETE
> Map Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0 {_col1} {_col2} {_col6} {_col7} {_col8}
> 1
> keys:
> 0 _col5 (type: int)
> 1 _col0 (type: int)
> outputColumnNames: _col1, _col2, _col6,
> _col7, _col8
> input vertices:
> 1 Map 2
> Statistics: Num rows: 416100702 Data size:
> 8322014040 Basic stats: COMPLETE Column stats: COMPLETE
> Map Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0 {_col1} {_col2} {_col7} {_col8}
> 1
> keys:
> 0 _col6 (type: int)
> 1 _col0 (type: int)
> outputColumnNames: _col1, _col2, _col7,
> _col8
> input vertices:
> 1 Map 15
> Statistics: Num rows: 512868307 Data size:
> 8205892912 Basic stats: COMPLETE Column stats: COMPLETE
> Map Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0 {_col1} {_col2} {_col8}
> 1
> keys:
> 0 _col7 (type: int)
> 1 _col0 (type: int)
> outputColumnNames: _col1, _col2, _col8
> input vertices:
> 1 Map 11
> Statistics: Num rows: 1030315795 Data
> size: 12363789540 Basic stats: COMPLETE Column stats: COMPLETE
> Map Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0 {_col1} {_col8}
> 1
> keys:
> 0 _col2 (type: int)
> 1 _col1 (type: int)
> outputColumnNames: _col1, _col8
> input vertices:
> 1 Map 18
> Statistics: Num rows: 2999114775 Data
> size: 23992918200 Basic stats: COMPLETE Column stats: COMPLETE
> Map Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0
> 1
> keys:
> 0 _col1 (type: int), _col8 (type:
> int)
> 1 _col0 (type: int), _col1 (type:
> int)
> input vertices:
> 1 Map 8
> Statistics: Num rows: 60227570 Data
> size: 0 Basic stats: PARTIAL Column stats: COMPLETE
> Select Operator
> Statistics: Num rows: 60227570 Data
> size: 0 Basic stats: PARTIAL Column stats: COMPLETE
> Group By Operator
> aggregations: count()
> mode: hash
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data
> size: 8 Basic stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> sort order:
> Statistics: Num rows: 1 Data
> size: 8 Basic stats: COMPLETE Column stats: COMPLETE
> value expressions: _col0 (type:
> bigint)
> Execution mode: vectorized
> Map 15
> Map Operator Tree:
> TableScan
> alias: store
> filterExpr: s_store_sk is not null (type: boolean)
> Statistics: Num rows: 212 Data size: 405680 Basic stats:
> COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: s_store_sk is not null (type: boolean)
> Statistics: Num rows: 212 Data size: 848 Basic stats:
> COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: s_store_sk (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 212 Data size: 848 Basic stats:
> COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 212 Data size: 848 Basic stats:
> COMPLETE Column stats: COMPLETE
> Execution mode: vectorized
> Map 16
> Map Operator Tree:
> TableScan
> alias: hd1
> filterExpr: (hd_income_band_sk is not null and hd_demo_sk
> is not null) (type: boolean)
> Statistics: Num rows: 7200 Data size: 799 Basic stats:
> COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: (hd_income_band_sk is not null and hd_demo_sk
> is not null) (type: boolean)
> Statistics: Num rows: 7200 Data size: 57600 Basic stats:
> COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: hd_demo_sk (type: int), hd_income_band_sk
> (type: int)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 7200 Data size: 57600 Basic
> stats: COMPLETE Column stats: COMPLETE
> Map Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0 {_col0}
> 1
> keys:
> 0 _col1 (type: int)
> 1 _col0 (type: int)
> outputColumnNames: _col0
> input vertices:
> 1 Map 7
> Statistics: Num rows: 8000 Data size: 32000 Basic
> stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: _col0 (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 8000 Data size: 32000 Basic
> stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 8000 Data size: 32000 Basic
> stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized
> Map 17
> Map Operator Tree:
> TableScan
> alias: d1
> filterExpr: d_date_sk is not null (type: boolean)
> Statistics: Num rows: 73049 Data size: 81741831 Basic
> stats: COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: d_date_sk is not null (type: boolean)
> Statistics: Num rows: 73049 Data size: 292196 Basic
> stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: d_date_sk (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 73049 Data size: 292196 Basic
> stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 73049 Data size: 292196 Basic
> stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized
> Map 18
> Map Operator Tree:
> TableScan
> alias: customer
> filterExpr: (((((c_current_hdemo_sk is not null and
> c_current_cdemo_sk is not null) and c_first_sales_date_sk is not null) and
> c_first_shipto_date_sk is not null) and c_current_addr_sk is not null) and
> c_customer_sk is not null) (type: boolean)
> Statistics: Num rows: 1600000 Data size: 1376033128 Basic
> stats: COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: (((((c_current_hdemo_sk is not null and
> c_current_cdemo_sk is not null) and c_first_sales_date_sk is not null) and
> c_first_shipto_date_sk is not null) and c_current_addr_sk is not null) and
> c_customer_sk is not null) (type: boolean)
> Statistics: Num rows: 1387731 Data size: 32529348 Basic
> stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: c_customer_sk (type: int),
> c_current_cdemo_sk (type: int), c_current_hdemo_sk (type: int),
> c_current_addr_sk (type: int), c_first_shipto_date_sk (type: int),
> c_first_sales_date_sk (type: int)
> outputColumnNames: _col0, _col1, _col2, _col3, _col4,
> _col5
> Statistics: Num rows: 1387731 Data size: 32529348 Basic
> stats: COMPLETE Column stats: COMPLETE
> Map Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0 {_col0} {_col1} {_col3} {_col4} {_col5}
> 1
> keys:
> 0 _col2 (type: int)
> 1 _col0 (type: int)
> outputColumnNames: _col0, _col1, _col3, _col4, _col5
> input vertices:
> 1 Map 5
> Statistics: Num rows: 1193875 Data size: 23877500
> Basic stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: _col0 (type: int), _col1 (type: int),
> _col3 (type: int), _col4 (type: int), _col5 (type: int)
> outputColumnNames: _col0, _col1, _col3, _col4, _col5
> Statistics: Num rows: 1193875 Data size: 23877500
> Basic stats: COMPLETE Column stats: COMPLETE
> Map Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0
> 1 {_col0} {_col3} {_col4} {_col5}
> keys:
> 0 _col0 (type: int)
> 1 _col1 (type: int)
> outputColumnNames: _col1, _col4, _col5, _col6
> input vertices:
> 0 Map 1
> Statistics: Num rows: 2529344 Data size: 40469504
> Basic stats: COMPLETE Column stats: COMPLETE
> Map Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0 {_col1} {_col4} {_col5}
> 1
> keys:
> 0 _col6 (type: int)
> 1 _col0 (type: int)
> outputColumnNames: _col1, _col4, _col5
> input vertices:
> 1 Map 17
> Statistics: Num rows: 2828109 Data size:
> 33937308 Basic stats: COMPLETE Column stats: COMPLETE
> Map Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0 {_col1} {_col4}
> 1
> keys:
> 0 _col5 (type: int)
> 1 _col0 (type: int)
> outputColumnNames: _col1, _col4
> input vertices:
> 1 Map 4
> Statistics: Num rows: 3162164 Data size:
> 25297312 Basic stats: COMPLETE Column stats: COMPLETE
> Map Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0 {_col1}
> 1
> keys:
> 0 _col4 (type: int)
> 1 _col0 (type: int)
> outputColumnNames: _col1
> input vertices:
> 1 Map 9
> Statistics: Num rows: 3574015 Data size:
> 14296060 Basic stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: _col1 (type: int)
> outputColumnNames: _col1
> Statistics: Num rows: 3574015 Data size:
> 14296060 Basic stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col1 (type: int)
> sort order: +
> Map-reduce partition columns: _col1
> (type: int)
> Statistics: Num rows: 3574015 Data
> size: 14296060 Basic stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized
> Map 2
> Map Operator Tree:
> TableScan
> alias: ad1
> filterExpr: ca_address_sk is not null (type: boolean)
> Statistics: Num rows: 800000 Data size: 811903688 Basic
> stats: COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: ca_address_sk is not null (type: boolean)
> Statistics: Num rows: 800000 Data size: 3200000 Basic
> stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: ca_address_sk (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 800000 Data size: 3200000 Basic
> stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 800000 Data size: 3200000 Basic
> stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized
> Map 3
> Map Operator Tree:
> TableScan
> alias: d1
> filterExpr: d_date_sk is not null (type: boolean)
> Statistics: Num rows: 73049 Data size: 81741831 Basic
> stats: COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: d_date_sk is not null (type: boolean)
> Statistics: Num rows: 73049 Data size: 292196 Basic
> stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: d_date_sk (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 73049 Data size: 292196 Basic
> stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 73049 Data size: 292196 Basic
> stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized
> Map 4
> Map Operator Tree:
> TableScan
> alias: d1
> filterExpr: d_date_sk is not null (type: boolean)
> Statistics: Num rows: 73049 Data size: 81741831 Basic
> stats: COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: d_date_sk is not null (type: boolean)
> Statistics: Num rows: 73049 Data size: 292196 Basic
> stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: d_date_sk (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 73049 Data size: 292196 Basic
> stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 73049 Data size: 292196 Basic
> stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized
> Map 5
> Map Operator Tree:
> TableScan
> alias: hd1
> filterExpr: (hd_income_band_sk is not null and hd_demo_sk
> is not null) (type: boolean)
> Statistics: Num rows: 7200 Data size: 799 Basic stats:
> COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: (hd_income_band_sk is not null and hd_demo_sk
> is not null) (type: boolean)
> Statistics: Num rows: 7200 Data size: 57600 Basic stats:
> COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: hd_demo_sk (type: int), hd_income_band_sk
> (type: int)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 7200 Data size: 57600 Basic
> stats: COMPLETE Column stats: COMPLETE
> Map Join Operator
> condition map:
> Inner Join 0 to 1
> condition expressions:
> 0 {_col0}
> 1
> keys:
> 0 _col1 (type: int)
> 1 _col0 (type: int)
> outputColumnNames: _col0
> input vertices:
> 1 Map 6
> Statistics: Num rows: 8000 Data size: 32000 Basic
> stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: _col0 (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 8000 Data size: 32000 Basic
> stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 8000 Data size: 32000 Basic
> stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized
> Map 6
> Map Operator Tree:
> TableScan
> alias: ib1
> filterExpr: ib_income_band_sk is not null (type: boolean)
> Statistics: Num rows: 20 Data size: 240 Basic stats:
> COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: ib_income_band_sk is not null (type: boolean)
> Statistics: Num rows: 20 Data size: 80 Basic stats:
> COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: ib_income_band_sk (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 20 Data size: 80 Basic stats:
> COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 20 Data size: 80 Basic stats:
> COMPLETE Column stats: COMPLETE
> Execution mode: vectorized
> Map 7
> Map Operator Tree:
> TableScan
> alias: ib1
> filterExpr: ib_income_band_sk is not null (type: boolean)
> Statistics: Num rows: 20 Data size: 240 Basic stats:
> COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: ib_income_band_sk is not null (type: boolean)
> Statistics: Num rows: 20 Data size: 80 Basic stats:
> COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: ib_income_band_sk (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 20 Data size: 80 Basic stats:
> COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 20 Data size: 80 Basic stats:
> COMPLETE Column stats: COMPLETE
> Execution mode: vectorized
> Map 8
> Map Operator Tree:
> TableScan
> alias: store_returns
> filterExpr: (sr_item_sk is not null and sr_ticket_number is
> not null) (type: boolean)
> Statistics: Num rows: 55578005 Data size: 4377627636 Basic
> stats: COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: (sr_item_sk is not null and sr_ticket_number
> is not null) (type: boolean)
> Statistics: Num rows: 55578005 Data size: 444624040 Basic
> stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: sr_item_sk (type: int), sr_ticket_number
> (type: int)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 55578005 Data size: 444624040
> Basic stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int), _col1 (type: int)
> sort order: ++
> Map-reduce partition columns: _col0 (type: int),
> _col1 (type: int)
> Statistics: Num rows: 55578005 Data size: 444624040
> Basic stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized
> Map 9
> Map Operator Tree:
> TableScan
> alias: ad1
> filterExpr: ca_address_sk is not null (type: boolean)
> Statistics: Num rows: 800000 Data size: 811903688 Basic
> stats: COMPLETE Column stats: COMPLETE
> Filter Operator
> predicate: ca_address_sk is not null (type: boolean)
> Statistics: Num rows: 800000 Data size: 3200000 Basic
> stats: COMPLETE Column stats: COMPLETE
> Select Operator
> expressions: ca_address_sk (type: int)
> outputColumnNames: _col0
> Statistics: Num rows: 800000 Data size: 3200000 Basic
> stats: COMPLETE Column stats: COMPLETE
> Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 800000 Data size: 3200000 Basic
> stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized
> Reducer 14
> Reduce Operator Tree:
> Group By Operator
> aggregations: count(VALUE._col0)
> mode: mergepartial
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE
> Column stats: COMPLETE
> Select Operator
> expressions: _col0 (type: bigint)
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE
> Column stats: COMPLETE
> File Output Operator
> compressed: false
> Statistics: Num rows: 1 Data size: 8 Basic stats:
> COMPLETE Column stats: COMPLETE
> table:
> input format: org.apache.hadoop.mapred.TextInputFormat
> output format:
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> serde:
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> Execution mode: vectorized
> Stage: Stage-0
> Fetch Operator
> limit: -1
> Processor Tree:
> ListSink
> {code}
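In the plan quoted above, the alias d1 is reported by three TableScan operators (Map 3, Map 4 and Map 17), and cd1, hd1, ad1 and ib1 each appear twice, while d2, d3, cd2, hd2, ad2 and ib2 never appear. A quick way to spot this symptom in a long plan is to count the `alias:` lines. The following is only a minimal sketch, not part of Hive: the helper names `alias_counts` and `repeated_aliases` are hypothetical, and it assumes the plan text uses the `alias: <name>` line format shown above (optionally behind a `> ` mail-quote marker).

```python
import re
from collections import Counter

# Matches the "alias: <name>" line that EXPLAIN prints under each TableScan.
# A leading "> " mail-quote marker, if present, is tolerated (assumption).
ALIAS_RE = re.compile(r"^\s*(?:>\s*)?alias:\s*(\S+)\s*$")


def alias_counts(explain_text):
    """Count how many TableScan operators report each alias."""
    counts = Counter()
    for line in explain_text.splitlines():
        match = ALIAS_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def repeated_aliases(explain_text):
    """Return the aliases reported by more than one TableScan, sorted by name."""
    return sorted(a for a, n in alias_counts(explain_text).items() if n > 1)


# Tiny hand-written sample in the same shape as the plan above.
sample = """\
TableScan
alias: d1
TableScan
alias: d1
TableScan
alias: d1
TableScan
alias: store
"""

if __name__ == "__main__":
    print(repeated_aliases(sample))
```

A repeated alias is only a hint: a query that legitimately scans one alias once will never trigger it, but a plan may also repeat an alias for reasons unrelated to this issue, so the result should be compared against the aliases in the query text.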