This is an automated email from the ASF dual-hosted git repository.
ngsg pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git
The following commit(s) were added to refs/heads/master by this push:
new 73cc78dfdc4 HIVE-29161: Correct the row count computation affected by
Dynamic SemiJoin Reduction (#6041)
73cc78dfdc4 is described below
commit 73cc78dfdc4721efc4135c304c62d5b49bc412c4
Author: Seonggon Namgung <[email protected]>
AuthorDate: Tue Aug 26 12:20:23 2025 +0900
HIVE-29161: Correct the row count computation affected by Dynamic SemiJoin
Reduction (#6041)
---
.../apache/hadoop/hive/ql/parse/TezCompiler.java | 2 +-
.../llap/dynamic_semijoin_reduction_multicol.q.out | 59 +++-
.../clientpositive/perf/tpcds30tb/tez/query1.q.out | 230 ++++++++------
.../perf/tpcds30tb/tez/query1b.q.out | 300 ++++++++++--------
.../perf/tpcds30tb/tez/query24.q.out | 67 +++-
.../perf/tpcds30tb/tez/query64.q.out | 348 ++++++++++++---------
.../perf/tpcds30tb/tez/query80.q.out | 33 +-
7 files changed, 645 insertions(+), 394 deletions(-)
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java b/ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java
index 86088a8fcdc..2a947c5e0ee 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java
@@ -1972,7 +1972,7 @@ private void removeSemijoinOptimizationByBenefit(OptimizeTezProcContext procCtx)
     for (SemijoinOperatorInfo roi : reductionFactorMap.values()) {
       // This semijoin will be kept
       // We are going to adjust the filter statistics
-      long newNumRows = (long) (1.0 - roi.reductionFactor) * roi.filterStats.getNumRows();
+      long newNumRows = (long) ((1.0 - roi.reductionFactor) * roi.filterStats.getNumRows());
       if (LOG.isDebugEnabled()) {
         LOG.debug("Old stats for {}: {}", roi.filterOperator, roi.filterStats);
         LOG.debug("Number of rows reduction: {}/{}", newNumRows, roi.filterStats.getNumRows());
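Note on the newNumRows change above: in Java a cast binds more tightly than `*`, so the removed expression truncates `(1.0 - roi.reductionFactor)` to a long before multiplying. For any reduction factor strictly between 0 and 1, that truncation gives 0, and the adjusted row count becomes 0 regardless of the filter's original row count. The sketch below is a stand-alone illustration, not Hive code: the class `RowCountCast`, its method names, and the reduction factor 0.5 are assumptions made for the example, while 6005 is the Map 1 row count that appears in the dynamic_semijoin_reduction_multicol.q.out plan.

```java
// Hypothetical stand-alone illustration; not part of TezCompiler.
final class RowCountCast {

    // Shape of the removed line: the cast applies to (1.0 - reductionFactor)
    // alone, so the factor is truncated to a long before the multiplication.
    static long beforeFix(double reductionFactor, long numRows) {
        return (long) (1.0 - reductionFactor) * numRows;
    }

    // Shape of the added line: the product is computed in double
    // arithmetic and only the final result is truncated to a long.
    static long afterFix(double reductionFactor, long numRows) {
        return (long) ((1.0 - reductionFactor) * numRows);
    }

    public static void main(String[] args) {
        long numRows = 6005L;          // row count taken from the test plan
        double reductionFactor = 0.5;  // assumed value, for illustration only

        System.out.println(beforeFix(reductionFactor, numRows)); // prints 0
        System.out.println(afterFix(reductionFactor, numRows));  // prints 3002
    }
}
```

With the old expression the only reduction factor that leaves a non-zero estimate is exactly 0.0, which is presumably why the corrected row counts lead to the different semijoin decisions and plan shapes seen in the regenerated .q.out files.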
diff --git
a/ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_multicol.q.out
b/ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_multicol.q.out
index c53fb23f10e..3cc24491fe7 100644
---
a/ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_multicol.q.out
+++
b/ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_multicol.q.out
@@ -194,20 +194,21 @@ STAGE PLANS:
Tez
#### A masked pattern was here ####
Edges:
- Map 1 <- Reducer 5 (BROADCAST_EDGE)
+ Map 1 <- Reducer 5 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE)
Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 4 (SIMPLE_EDGE)
Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
Reducer 5 <- Map 4 (CUSTOM_SIMPLE_EDGE)
+ Reducer 6 <- Map 4 (CUSTOM_SIMPLE_EDGE)
#### A masked pattern was here ####
Vertices:
Map 1
Map Operator Tree:
TableScan
alias: li
- filterExpr: (l_partkey is not null and l_suppkey is not null
and l_partkey BETWEEN DynamicValue(RS_7_ps_ps_partkey_min) AND
DynamicValue(RS_7_ps_ps_partkey_max) and in_bloom_filter(l_partkey,
DynamicValue(RS_7_ps_ps_partkey_bloom_filter))) (type: boolean)
+ filterExpr: (l_partkey is not null and l_suppkey is not null
and l_partkey BETWEEN DynamicValue(RS_7_ps_ps_partkey_min) AND
DynamicValue(RS_7_ps_ps_partkey_max) and l_suppkey BETWEEN
DynamicValue(RS_7_ps_ps_suppkey_min) AND DynamicValue(RS_7_ps_ps_suppkey_max)
and in_bloom_filter(l_partkey, DynamicValue(RS_7_ps_ps_partkey_bloom_filter))
and in_bloom_filter(l_suppkey, DynamicValue(RS_7_ps_ps_suppkey_bloom_filter)))
(type: boolean)
Statistics: Num rows: 6005 Data size: 72060 Basic stats:
COMPLETE Column stats: COMPLETE
Filter Operator
- predicate: (l_partkey is not null and l_suppkey is not
null and l_partkey BETWEEN DynamicValue(RS_7_ps_ps_partkey_min) AND
DynamicValue(RS_7_ps_ps_partkey_max) and in_bloom_filter(l_partkey,
DynamicValue(RS_7_ps_ps_partkey_bloom_filter))) (type: boolean)
+ predicate: (l_partkey is not null and l_suppkey is not
null and l_partkey BETWEEN DynamicValue(RS_7_ps_ps_partkey_min) AND
DynamicValue(RS_7_ps_ps_partkey_max) and l_suppkey BETWEEN
DynamicValue(RS_7_ps_ps_suppkey_min) AND DynamicValue(RS_7_ps_ps_suppkey_max)
and in_bloom_filter(l_partkey, DynamicValue(RS_7_ps_ps_partkey_bloom_filter))
and in_bloom_filter(l_suppkey, DynamicValue(RS_7_ps_ps_suppkey_bloom_filter)))
(type: boolean)
Statistics: Num rows: 6005 Data size: 72060 Basic stats:
COMPLETE Column stats: COMPLETE
Select Operator
expressions: l_orderkey (type: int), l_partkey (type:
int), l_suppkey (type: int)
@@ -257,6 +258,21 @@ STAGE PLANS:
sort order:
Statistics: Num rows: 1 Data size: 152 Basic
stats: COMPLETE Column stats: COMPLETE
value expressions: _col0 (type: int), _col1 (type:
int), _col2 (type: binary)
+ Select Operator
+ expressions: _col1 (type: int)
+ outputColumnNames: _col1
+ Statistics: Num rows: 1 Data size: 4 Basic stats:
COMPLETE Column stats: COMPLETE
+ Group By Operator
+ aggregations: min(_col1), max(_col1),
bloom_filter(_col1, expectedEntries=1000000)
+ minReductionHashAggr: 0.4
+ mode: hash
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 1 Data size: 152 Basic stats:
COMPLETE Column stats: COMPLETE
+ Reduce Output Operator
+ null sort order:
+ sort order:
+ Statistics: Num rows: 1 Data size: 152 Basic
stats: COMPLETE Column stats: COMPLETE
+ value expressions: _col0 (type: int), _col1 (type:
int), _col2 (type: binary)
Execution mode: vectorized, llap
LLAP IO: all inputs
Reducer 2
@@ -306,6 +322,19 @@ STAGE PLANS:
sort order:
Statistics: Num rows: 1 Data size: 152 Basic stats: COMPLETE
Column stats: COMPLETE
value expressions: _col0 (type: int), _col1 (type: int),
_col2 (type: binary)
+ Reducer 6
+ Execution mode: vectorized, llap
+ Reduce Operator Tree:
+ Group By Operator
+ aggregations: min(VALUE._col0), max(VALUE._col1),
bloom_filter(VALUE._col2, 1, expectedEntries=1000000)
+ mode: final
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 1 Data size: 152 Basic stats: COMPLETE
Column stats: COMPLETE
+ Reduce Output Operator
+ null sort order:
+ sort order:
+ Statistics: Num rows: 1 Data size: 152 Basic stats: COMPLETE
Column stats: COMPLETE
+ value expressions: _col0 (type: int), _col1 (type: int),
_col2 (type: binary)
Stage: Stage-0
Fetch Operator
@@ -329,27 +358,33 @@ Stage-1 HIVE COUNTERS:
RECORDS_IN_Map_1: 6005
RECORDS_IN_Map_4: 800
RECORDS_OUT_0: 14
- RECORDS_OUT_INTERMEDIATE_Map_1: 50
- RECORDS_OUT_INTERMEDIATE_Map_4: 3
+ RECORDS_OUT_INTERMEDIATE_Map_1: 14
+ RECORDS_OUT_INTERMEDIATE_Map_4: 4
RECORDS_OUT_INTERMEDIATE_Reducer_2: 14
RECORDS_OUT_INTERMEDIATE_Reducer_3: 0
RECORDS_OUT_INTERMEDIATE_Reducer_5: 1
+ RECORDS_OUT_INTERMEDIATE_Reducer_6: 1
RECORDS_OUT_OPERATOR_FIL_38: 2
- RECORDS_OUT_OPERATOR_FIL_46: 50
- RECORDS_OUT_OPERATOR_FS_50: 14
- RECORDS_OUT_OPERATOR_GBY_42: 1
+ RECORDS_OUT_OPERATOR_FIL_51: 14
+ RECORDS_OUT_OPERATOR_FS_55: 14
+ RECORDS_OUT_OPERATOR_GBY_43: 1
RECORDS_OUT_OPERATOR_GBY_44: 1
+ RECORDS_OUT_OPERATOR_GBY_47: 1
+ RECORDS_OUT_OPERATOR_GBY_49: 1
RECORDS_OUT_OPERATOR_MAP_0: 0
RECORDS_OUT_OPERATOR_MERGEJOIN_37: 14
RECORDS_OUT_OPERATOR_RS_10: 14
RECORDS_OUT_OPERATOR_RS_40: 2
- RECORDS_OUT_OPERATOR_RS_43: 1
RECORDS_OUT_OPERATOR_RS_45: 1
- RECORDS_OUT_OPERATOR_RS_48: 50
+ RECORDS_OUT_OPERATOR_RS_46: 1
+ RECORDS_OUT_OPERATOR_RS_48: 1
+ RECORDS_OUT_OPERATOR_RS_50: 1
+ RECORDS_OUT_OPERATOR_RS_53: 14
RECORDS_OUT_OPERATOR_SEL_39: 2
RECORDS_OUT_OPERATOR_SEL_41: 2
- RECORDS_OUT_OPERATOR_SEL_47: 50
- RECORDS_OUT_OPERATOR_SEL_49: 14
+ RECORDS_OUT_OPERATOR_SEL_42: 2
+ RECORDS_OUT_OPERATOR_SEL_52: 14
+ RECORDS_OUT_OPERATOR_SEL_54: 14
RECORDS_OUT_OPERATOR_SEL_9: 14
RECORDS_OUT_OPERATOR_TS_0: 6005
RECORDS_OUT_OPERATOR_TS_3: 800
diff --git a/ql/src/test/results/clientpositive/perf/tpcds30tb/tez/query1.q.out
b/ql/src/test/results/clientpositive/perf/tpcds30tb/tez/query1.q.out
index 8ef6ee587a1..509b613345f 100644
--- a/ql/src/test/results/clientpositive/perf/tpcds30tb/tez/query1.q.out
+++ b/ql/src/test/results/clientpositive/perf/tpcds30tb/tez/query1.q.out
@@ -7,24 +7,25 @@ STAGE PLANS:
Tez
#### A masked pattern was here ####
Edges:
- Map 1 <- Map 10 (BROADCAST_EDGE), Reducer 11 (BROADCAST_EDGE), Reducer
8 (BROADCAST_EDGE)
- Reducer 11 <- Map 10 (SIMPLE_EDGE)
- Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 7 (BROADCAST_EDGE)
- Reducer 3 <- Map 9 (CUSTOM_SIMPLE_EDGE), Reducer 2
(CUSTOM_SIMPLE_EDGE), Reducer 6 (BROADCAST_EDGE)
+ Map 1 <- Map 12 (BROADCAST_EDGE), Reducer 11 (BROADCAST_EDGE), Reducer
6 (BROADCAST_EDGE)
+ Map 8 <- Map 12 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE)
+ Reducer 10 <- Reducer 9 (SIMPLE_EDGE)
+ Reducer 11 <- Reducer 10 (CUSTOM_SIMPLE_EDGE)
+ Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 5 (BROADCAST_EDGE)
+ Reducer 3 <- Map 7 (CUSTOM_SIMPLE_EDGE), Reducer 10 (BROADCAST_EDGE),
Reducer 2 (CUSTOM_SIMPLE_EDGE)
Reducer 4 <- Reducer 3 (SIMPLE_EDGE)
- Reducer 5 <- Map 1 (SIMPLE_EDGE)
- Reducer 6 <- Reducer 5 (SIMPLE_EDGE)
- Reducer 8 <- Map 7 (CUSTOM_SIMPLE_EDGE)
+ Reducer 6 <- Map 5 (CUSTOM_SIMPLE_EDGE)
+ Reducer 9 <- Map 8 (SIMPLE_EDGE)
#### A masked pattern was here ####
Vertices:
Map 1
Map Operator Tree:
TableScan
alias: store_returns
- filterExpr: (((sr_store_sk is not null and sr_customer_sk is
not null) or sr_store_sk is not null) and sr_store_sk BETWEEN
DynamicValue(RS_41_store_s_store_sk_min) AND
DynamicValue(RS_41_store_s_store_sk_max) and in_bloom_filter(sr_store_sk,
DynamicValue(RS_41_store_s_store_sk_bloom_filter))) (type: boolean)
+ filterExpr: (sr_store_sk is not null and sr_customer_sk is
not null and sr_store_sk BETWEEN DynamicValue(RS_41_store_s_store_sk_min) AND
DynamicValue(RS_41_store_s_store_sk_max) and sr_store_sk BETWEEN
DynamicValue(RS_47_store_returns_sr_store_sk_min) AND
DynamicValue(RS_47_store_returns_sr_store_sk_max) and
in_bloom_filter(sr_store_sk, DynamicValue(RS_41_store_s_store_sk_bloom_filter))
and in_bloom_filter(sr_store_sk,
DynamicValue(RS_47_store_returns_sr_store_sk_bloom_ [...]
Statistics: Num rows: 8332595709 Data size: 1113890910776
Basic stats: COMPLETE Column stats: COMPLETE
Filter Operator
- predicate: (sr_store_sk is not null and sr_customer_sk is
not null and sr_store_sk BETWEEN DynamicValue(RS_41_store_s_store_sk_min) AND
DynamicValue(RS_41_store_s_store_sk_max) and in_bloom_filter(sr_store_sk,
DynamicValue(RS_41_store_s_store_sk_bloom_filter))) (type: boolean)
+ predicate: (sr_store_sk is not null and sr_customer_sk is
not null and sr_store_sk BETWEEN DynamicValue(RS_41_store_s_store_sk_min) AND
DynamicValue(RS_41_store_s_store_sk_max) and sr_store_sk BETWEEN
DynamicValue(RS_47_store_returns_sr_store_sk_min) AND
DynamicValue(RS_47_store_returns_sr_store_sk_max) and
in_bloom_filter(sr_store_sk, DynamicValue(RS_41_store_s_store_sk_bloom_filter))
and in_bloom_filter(sr_store_sk,
DynamicValue(RS_47_store_returns_sr_store_sk_bloom [...]
Statistics: Num rows: 8033148295 Data size: 1073861157208
Basic stats: COMPLETE Column stats: COMPLETE
Select Operator
expressions: sr_customer_sk (type: bigint), sr_store_sk
(type: bigint), sr_fee (type: decimal(7,2)), sr_returned_date_sk (type: bigint)
@@ -38,7 +39,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col0, _col1, _col2
input vertices:
- 1 Map 10
+ 1 Map 12
Statistics: Num rows: 1472589806 Data size:
169844484256 Basic stats: COMPLETE Column stats: COMPLETE
Group By Operator
aggregations: sum(_col2)
@@ -54,40 +55,9 @@ STAGE PLANS:
Map-reduce partition columns: _col0 (type:
bigint), _col1 (type: bigint)
Statistics: Num rows: 1472589806 Data size:
186160875424 Basic stats: COMPLETE Column stats: COMPLETE
value expressions: _col2 (type: decimal(17,2))
- Filter Operator
- predicate: (sr_store_sk is not null and sr_store_sk
BETWEEN DynamicValue(RS_41_store_s_store_sk_min) AND
DynamicValue(RS_41_store_s_store_sk_max) and in_bloom_filter(sr_store_sk,
DynamicValue(RS_41_store_s_store_sk_bloom_filter))) (type: boolean)
- Statistics: Num rows: 8180935974 Data size: 1093617228248
Basic stats: COMPLETE Column stats: COMPLETE
- Select Operator
- expressions: sr_customer_sk (type: bigint), sr_store_sk
(type: bigint), sr_fee (type: decimal(7,2)), sr_returned_date_sk (type: bigint)
- outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 8180935974 Data size:
1093617228248 Basic stats: COMPLETE Column stats: COMPLETE
- Map Join Operator
- condition map:
- Inner Join 0 to 1
- keys:
- 0 _col3 (type: bigint)
- 1 _col0 (type: bigint)
- outputColumnNames: _col0, _col1, _col2
- input vertices:
- 1 Reducer 11
- Statistics: Num rows: 1499681380 Data size:
172969152424 Basic stats: COMPLETE Column stats: COMPLETE
- Group By Operator
- aggregations: sum(_col2)
- keys: _col1 (type: bigint), _col0 (type: bigint)
- minReductionHashAggr: 0.87820673
- mode: hash
- outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 1499681380 Data size:
189585719944 Basic stats: COMPLETE Column stats: COMPLETE
- Reduce Output Operator
- key expressions: _col0 (type: bigint), _col1
(type: bigint)
- null sort order: zz
- sort order: ++
- Map-reduce partition columns: _col0 (type:
bigint), _col1 (type: bigint)
- Statistics: Num rows: 1499681380 Data size:
189585719944 Basic stats: COMPLETE Column stats: COMPLETE
- value expressions: _col2 (type: decimal(17,2))
Execution mode: vectorized, llap
LLAP IO: may be used (ACID table)
- Map 10
+ Map 12
Map Operator Tree:
TableScan
alias: date_dim
@@ -106,6 +76,22 @@ STAGE PLANS:
sort order: +
Map-reduce partition columns: _col0 (type: bigint)
Statistics: Num rows: 367 Data size: 2936 Basic stats:
COMPLETE Column stats: COMPLETE
+ Select Operator
+ expressions: _col0 (type: bigint)
+ outputColumnNames: _col0
+ Statistics: Num rows: 367 Data size: 2936 Basic stats:
COMPLETE Column stats: COMPLETE
+ Group By Operator
+ keys: _col0 (type: bigint)
+ minReductionHashAggr: 0.4
+ mode: hash
+ outputColumnNames: _col0
+ Statistics: Num rows: 367 Data size: 2936 Basic
stats: COMPLETE Column stats: COMPLETE
+ Dynamic Partitioning Event Operator
+ Target column: sr_returned_date_sk (bigint)
+ Target Input: store_returns
+ Partition key expr: sr_returned_date_sk
+ Statistics: Num rows: 367 Data size: 2936 Basic
stats: COMPLETE Column stats: COMPLETE
+ Target Vertex: Map 8
Reduce Output Operator
key expressions: _col0 (type: bigint)
null sort order: z
@@ -130,7 +116,7 @@ STAGE PLANS:
Target Vertex: Map 1
Execution mode: vectorized, llap
LLAP IO: may be used (ACID table)
- Map 7
+ Map 5
Map Operator Tree:
TableScan
alias: store
@@ -166,7 +152,7 @@ STAGE PLANS:
value expressions: _col0 (type: bigint), _col1
(type: bigint), _col2 (type: binary)
Execution mode: vectorized, llap
LLAP IO: may be used (ACID table)
- Map 9
+ Map 7
Map Operator Tree:
TableScan
alias: customer
@@ -184,18 +170,96 @@ STAGE PLANS:
value expressions: _col1 (type: char(16))
Execution mode: vectorized, llap
LLAP IO: may be used (ACID table)
+ Map 8
+ Map Operator Tree:
+ TableScan
+ alias: store_returns
+ filterExpr: (sr_store_sk is not null and sr_store_sk BETWEEN
DynamicValue(RS_41_store_s_store_sk_min) AND
DynamicValue(RS_41_store_s_store_sk_max) and in_bloom_filter(sr_store_sk,
DynamicValue(RS_41_store_s_store_sk_bloom_filter))) (type: boolean)
+ Statistics: Num rows: 8332595709 Data size: 1113890910776
Basic stats: COMPLETE Column stats: COMPLETE
+ Filter Operator
+ predicate: (sr_store_sk is not null and sr_store_sk
BETWEEN DynamicValue(RS_41_store_s_store_sk_min) AND
DynamicValue(RS_41_store_s_store_sk_max) and in_bloom_filter(sr_store_sk,
DynamicValue(RS_41_store_s_store_sk_bloom_filter))) (type: boolean)
+ Statistics: Num rows: 8180935974 Data size: 1093617228248
Basic stats: COMPLETE Column stats: COMPLETE
+ Select Operator
+ expressions: sr_customer_sk (type: bigint), sr_store_sk
(type: bigint), sr_fee (type: decimal(7,2)), sr_returned_date_sk (type: bigint)
+ outputColumnNames: _col0, _col1, _col2, _col3
+ Statistics: Num rows: 8180935974 Data size:
1093617228248 Basic stats: COMPLETE Column stats: COMPLETE
+ Map Join Operator
+ condition map:
+ Inner Join 0 to 1
+ keys:
+ 0 _col3 (type: bigint)
+ 1 _col0 (type: bigint)
+ outputColumnNames: _col0, _col1, _col2
+ input vertices:
+ 1 Map 12
+ Statistics: Num rows: 1499681380 Data size:
172969152424 Basic stats: COMPLETE Column stats: COMPLETE
+ Group By Operator
+ aggregations: sum(_col2)
+ keys: _col1 (type: bigint), _col0 (type: bigint)
+ minReductionHashAggr: 0.87820673
+ mode: hash
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 1499681380 Data size:
189585719944 Basic stats: COMPLETE Column stats: COMPLETE
+ Reduce Output Operator
+ key expressions: _col0 (type: bigint), _col1
(type: bigint)
+ null sort order: zz
+ sort order: ++
+ Map-reduce partition columns: _col0 (type:
bigint), _col1 (type: bigint)
+ Statistics: Num rows: 1499681380 Data size:
189585719944 Basic stats: COMPLETE Column stats: COMPLETE
+ value expressions: _col2 (type: decimal(17,2))
+ Execution mode: vectorized, llap
+ LLAP IO: may be used (ACID table)
+ Reducer 10
+ Execution mode: vectorized, llap
+ Reduce Operator Tree:
+ Group By Operator
+ aggregations: sum(VALUE._col0), count(VALUE._col1)
+ keys: KEY._col0 (type: bigint)
+ mode: mergepartial
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 161 Data size: 20488 Basic stats:
COMPLETE Column stats: COMPLETE
+ Filter Operator
+ predicate: CAST( (_col1 / _col2) AS decimal(21,6)) is not
null (type: boolean)
+ Statistics: Num rows: 161 Data size: 20488 Basic stats:
COMPLETE Column stats: COMPLETE
+ Select Operator
+ expressions: (CAST( (_col1 / _col2) AS decimal(21,6)) *
1.2) (type: decimal(24,7)), _col0 (type: bigint)
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 161 Data size: 19200 Basic stats:
COMPLETE Column stats: COMPLETE
+ Reduce Output Operator
+ key expressions: _col1 (type: bigint)
+ null sort order: z
+ sort order: +
+ Map-reduce partition columns: _col1 (type: bigint)
+ Statistics: Num rows: 161 Data size: 19200 Basic stats:
COMPLETE Column stats: COMPLETE
+ value expressions: _col0 (type: decimal(24,7))
+ Select Operator
+ expressions: _col1 (type: bigint)
+ outputColumnNames: _col1
+ Statistics: Num rows: 161 Data size: 1168 Basic stats:
COMPLETE Column stats: COMPLETE
+ Group By Operator
+ aggregations: min(_col1), max(_col1),
bloom_filter(_col1, expectedEntries=1000000)
+ minReductionHashAggr: 0.99
+ mode: hash
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 1 Data size: 160 Basic stats:
COMPLETE Column stats: COMPLETE
+ Reduce Output Operator
+ null sort order:
+ sort order:
+ Statistics: Num rows: 1 Data size: 160 Basic stats:
COMPLETE Column stats: COMPLETE
+ value expressions: _col0 (type: bigint), _col1
(type: bigint), _col2 (type: binary)
Reducer 11
Execution mode: vectorized, llap
Reduce Operator Tree:
- Select Operator
- expressions: KEY.reducesinkkey0 (type: bigint)
- outputColumnNames: _col0
+ Group By Operator
+ aggregations: min(VALUE._col0), max(VALUE._col1),
bloom_filter(VALUE._col2, 1, expectedEntries=1000000)
+ mode: final
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
Reduce Output Operator
- key expressions: _col0 (type: bigint)
- null sort order: z
- sort order: +
- Map-reduce partition columns: _col0 (type: bigint)
- Statistics: Num rows: 367 Data size: 2936 Basic stats:
COMPLETE Column stats: COMPLETE
+ null sort order:
+ sort order:
+ Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
+ value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: binary)
Reducer 2
Execution mode: vectorized, llap
Reduce Operator Tree:
@@ -220,7 +284,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col0, _col1, _col2
input vertices:
- 1 Map 7
+ 1 Map 5
Statistics: Num rows: 33743267 Data size: 3779245920
Basic stats: COMPLETE Column stats: COMPLETE
Reduce Output Operator
key expressions: _col0 (type: bigint)
@@ -240,7 +304,7 @@ STAGE PLANS:
1 KEY.reducesinkkey0 (type: bigint)
outputColumnNames: _col1, _col2, _col5
input vertices:
- 1 Map 9
+ 1 Map 7
Statistics: Num rows: 33743267 Data size: 7153572612 Basic
stats: COMPLETE Column stats: COMPLETE
DynamicPartitionHashJoin: true
Map Join Operator
@@ -251,7 +315,7 @@ STAGE PLANS:
1 _col1 (type: bigint)
outputColumnNames: _col2, _col5, _col6
input vertices:
- 1 Reducer 6
+ 1 Reducer 10
Statistics: Num rows: 33954162 Data size: 11001148488 Basic
stats: COMPLETE Column stats: COMPLETE
Filter Operator
predicate: (_col2 > _col6) (type: boolean)
@@ -288,7 +352,25 @@ STAGE PLANS:
input format:
org.apache.hadoop.mapred.SequenceFileInputFormat
output format:
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde:
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- Reducer 5
+ Reducer 6
+ Execution mode: vectorized, llap
+ Reduce Operator Tree:
+ Group By Operator
+ aggregations: min(VALUE._col0), max(VALUE._col1),
bloom_filter(VALUE._col2, 1, expectedEntries=1000000)
+ mode: final
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
+ Reduce Output Operator
+ null sort order:
+ sort order:
+ Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
+ value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: binary)
+ Reduce Output Operator
+ null sort order:
+ sort order:
+ Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
+ value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: binary)
+ Reducer 9
Execution mode: vectorized, llap
Reduce Operator Tree:
Group By Operator
@@ -315,42 +397,6 @@ STAGE PLANS:
Map-reduce partition columns: _col0 (type: bigint)
Statistics: Num rows: 119301 Data size: 15175776 Basic
stats: COMPLETE Column stats: COMPLETE
value expressions: _col1 (type: decimal(27,2)), _col2
(type: bigint)
- Reducer 6
- Execution mode: vectorized, llap
- Reduce Operator Tree:
- Group By Operator
- aggregations: sum(VALUE._col0), count(VALUE._col1)
- keys: KEY._col0 (type: bigint)
- mode: mergepartial
- outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 161 Data size: 20488 Basic stats:
COMPLETE Column stats: COMPLETE
- Filter Operator
- predicate: CAST( (_col1 / _col2) AS decimal(21,6)) is not
null (type: boolean)
- Statistics: Num rows: 161 Data size: 20488 Basic stats:
COMPLETE Column stats: COMPLETE
- Select Operator
- expressions: (CAST( (_col1 / _col2) AS decimal(21,6)) *
1.2) (type: decimal(24,7)), _col0 (type: bigint)
- outputColumnNames: _col0, _col1
- Statistics: Num rows: 161 Data size: 19200 Basic stats:
COMPLETE Column stats: COMPLETE
- Reduce Output Operator
- key expressions: _col1 (type: bigint)
- null sort order: z
- sort order: +
- Map-reduce partition columns: _col1 (type: bigint)
- Statistics: Num rows: 161 Data size: 19200 Basic stats:
COMPLETE Column stats: COMPLETE
- value expressions: _col0 (type: decimal(24,7))
- Reducer 8
- Execution mode: vectorized, llap
- Reduce Operator Tree:
- Group By Operator
- aggregations: min(VALUE._col0), max(VALUE._col1),
bloom_filter(VALUE._col2, 1, expectedEntries=1000000)
- mode: final
- outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
- Reduce Output Operator
- null sort order:
- sort order:
- Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
- value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: binary)
Stage: Stage-0
Fetch Operator
diff --git
a/ql/src/test/results/clientpositive/perf/tpcds30tb/tez/query1b.q.out
b/ql/src/test/results/clientpositive/perf/tpcds30tb/tez/query1b.q.out
index ceb1f018abe..93081e70ebc 100644
--- a/ql/src/test/results/clientpositive/perf/tpcds30tb/tez/query1b.q.out
+++ b/ql/src/test/results/clientpositive/perf/tpcds30tb/tez/query1b.q.out
@@ -7,26 +7,27 @@ STAGE PLANS:
Tez
#### A masked pattern was here ####
Edges:
- Map 1 <- Map 11 (BROADCAST_EDGE), Reducer 12 (BROADCAST_EDGE), Reducer
9 (BROADCAST_EDGE)
- Map 10 <- Reducer 5 (BROADCAST_EDGE)
- Reducer 12 <- Map 11 (SIMPLE_EDGE)
- Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 8 (BROADCAST_EDGE)
- Reducer 3 <- Map 10 (CUSTOM_SIMPLE_EDGE), Reducer 2
(CUSTOM_SIMPLE_EDGE), Reducer 7 (BROADCAST_EDGE)
+ Map 1 <- Map 13 (BROADCAST_EDGE), Reducer 12 (BROADCAST_EDGE), Reducer
7 (BROADCAST_EDGE)
+ Map 8 <- Reducer 5 (BROADCAST_EDGE)
+ Map 9 <- Map 13 (BROADCAST_EDGE), Reducer 7 (BROADCAST_EDGE)
+ Reducer 10 <- Map 9 (SIMPLE_EDGE)
+ Reducer 11 <- Reducer 10 (SIMPLE_EDGE)
+ Reducer 12 <- Reducer 11 (CUSTOM_SIMPLE_EDGE)
+ Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 6 (BROADCAST_EDGE)
+ Reducer 3 <- Map 8 (CUSTOM_SIMPLE_EDGE), Reducer 11 (BROADCAST_EDGE),
Reducer 2 (CUSTOM_SIMPLE_EDGE)
Reducer 4 <- Reducer 3 (SIMPLE_EDGE)
Reducer 5 <- Reducer 2 (CUSTOM_SIMPLE_EDGE)
- Reducer 6 <- Map 1 (SIMPLE_EDGE)
- Reducer 7 <- Reducer 6 (SIMPLE_EDGE)
- Reducer 9 <- Map 8 (CUSTOM_SIMPLE_EDGE)
+ Reducer 7 <- Map 6 (CUSTOM_SIMPLE_EDGE)
#### A masked pattern was here ####
Vertices:
Map 1
Map Operator Tree:
TableScan
alias: store_returns
- filterExpr: (((sr_store_sk is not null and sr_customer_sk is
not null) or sr_store_sk is not null) and sr_store_sk BETWEEN
DynamicValue(RS_41_store_s_store_sk_min) AND
DynamicValue(RS_41_store_s_store_sk_max) and in_bloom_filter(sr_store_sk,
DynamicValue(RS_41_store_s_store_sk_bloom_filter))) (type: boolean)
+ filterExpr: (sr_store_sk is not null and sr_customer_sk is
not null and sr_store_sk BETWEEN DynamicValue(RS_41_store_s_store_sk_min) AND
DynamicValue(RS_41_store_s_store_sk_max) and sr_store_sk BETWEEN
DynamicValue(RS_47_store_returns_sr_store_sk_min) AND
DynamicValue(RS_47_store_returns_sr_store_sk_max) and
in_bloom_filter(sr_store_sk, DynamicValue(RS_41_store_s_store_sk_bloom_filter))
and in_bloom_filter(sr_store_sk,
DynamicValue(RS_47_store_returns_sr_store_sk_bloom_ [...]
Statistics: Num rows: 8332595709 Data size: 1113890910776
Basic stats: COMPLETE Column stats: COMPLETE
Filter Operator
- predicate: (sr_store_sk is not null and sr_customer_sk is
not null and sr_store_sk BETWEEN DynamicValue(RS_41_store_s_store_sk_min) AND
DynamicValue(RS_41_store_s_store_sk_max) and in_bloom_filter(sr_store_sk,
DynamicValue(RS_41_store_s_store_sk_bloom_filter))) (type: boolean)
+ predicate: (sr_store_sk is not null and sr_customer_sk is
not null and sr_store_sk BETWEEN DynamicValue(RS_41_store_s_store_sk_min) AND
DynamicValue(RS_41_store_s_store_sk_max) and sr_store_sk BETWEEN
DynamicValue(RS_47_store_returns_sr_store_sk_min) AND
DynamicValue(RS_47_store_returns_sr_store_sk_max) and
in_bloom_filter(sr_store_sk, DynamicValue(RS_41_store_s_store_sk_bloom_filter))
and in_bloom_filter(sr_store_sk,
DynamicValue(RS_47_store_returns_sr_store_sk_bloom [...]
Statistics: Num rows: 8033148295 Data size: 1073861157208
Basic stats: COMPLETE Column stats: COMPLETE
Select Operator
expressions: sr_customer_sk (type: bigint), sr_store_sk
(type: bigint), sr_fee (type: decimal(7,2)), sr_returned_date_sk (type: bigint)
@@ -40,7 +41,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col0, _col1, _col2
input vertices:
- 1 Map 11
+ 1 Map 13
Statistics: Num rows: 1472589806 Data size:
169844484256 Basic stats: COMPLETE Column stats: COMPLETE
Group By Operator
aggregations: sum(_col2)
@@ -56,62 +57,9 @@ STAGE PLANS:
Map-reduce partition columns: _col0 (type:
bigint), _col1 (type: bigint)
Statistics: Num rows: 1472589806 Data size:
186160875424 Basic stats: COMPLETE Column stats: COMPLETE
value expressions: _col2 (type: decimal(17,2))
- Filter Operator
- predicate: (sr_store_sk is not null and sr_store_sk
BETWEEN DynamicValue(RS_41_store_s_store_sk_min) AND
DynamicValue(RS_41_store_s_store_sk_max) and in_bloom_filter(sr_store_sk,
DynamicValue(RS_41_store_s_store_sk_bloom_filter))) (type: boolean)
- Statistics: Num rows: 8180935974 Data size: 1093617228248
Basic stats: COMPLETE Column stats: COMPLETE
- Select Operator
- expressions: sr_customer_sk (type: bigint), sr_store_sk
(type: bigint), sr_fee (type: decimal(7,2)), sr_returned_date_sk (type: bigint)
- outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 8180935974 Data size:
1093617228248 Basic stats: COMPLETE Column stats: COMPLETE
- Map Join Operator
- condition map:
- Inner Join 0 to 1
- keys:
- 0 _col3 (type: bigint)
- 1 _col0 (type: bigint)
- outputColumnNames: _col0, _col1, _col2
- input vertices:
- 1 Reducer 12
- Statistics: Num rows: 1499681380 Data size:
172969152424 Basic stats: COMPLETE Column stats: COMPLETE
- Group By Operator
- aggregations: sum(_col2)
- keys: _col1 (type: bigint), _col0 (type: bigint)
- minReductionHashAggr: 0.87820673
- mode: hash
- outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 1499681380 Data size:
189585719944 Basic stats: COMPLETE Column stats: COMPLETE
- Reduce Output Operator
- key expressions: _col0 (type: bigint), _col1
(type: bigint)
- null sort order: zz
- sort order: ++
- Map-reduce partition columns: _col0 (type:
bigint), _col1 (type: bigint)
- Statistics: Num rows: 1499681380 Data size:
189585719944 Basic stats: COMPLETE Column stats: COMPLETE
- value expressions: _col2 (type: decimal(17,2))
Execution mode: vectorized, llap
LLAP IO: may be used (ACID table)
- Map 10
- Map Operator Tree:
- TableScan
- alias: customer
- filterExpr: (c_customer_sk BETWEEN
DynamicValue(RS_43_store_returns_sr_customer_sk_min) AND
DynamicValue(RS_43_store_returns_sr_customer_sk_max) and
in_bloom_filter(c_customer_sk,
DynamicValue(RS_43_store_returns_sr_customer_sk_bloom_filter))) (type: boolean)
- Statistics: Num rows: 80000000 Data size: 8640000000 Basic
stats: COMPLETE Column stats: COMPLETE
- Filter Operator
- predicate: (c_customer_sk BETWEEN
DynamicValue(RS_43_store_returns_sr_customer_sk_min) AND
DynamicValue(RS_43_store_returns_sr_customer_sk_max) and
in_bloom_filter(c_customer_sk,
DynamicValue(RS_43_store_returns_sr_customer_sk_bloom_filter))) (type: boolean)
- Statistics: Num rows: 80000000 Data size: 8640000000 Basic
stats: COMPLETE Column stats: COMPLETE
- Select Operator
- expressions: c_customer_sk (type: bigint), c_customer_id
(type: char(16))
- outputColumnNames: _col0, _col1
- Statistics: Num rows: 80000000 Data size: 8640000000
Basic stats: COMPLETE Column stats: COMPLETE
- Reduce Output Operator
- key expressions: _col0 (type: bigint)
- null sort order: z
- sort order: +
- Map-reduce partition columns: _col0 (type: bigint)
- Statistics: Num rows: 80000000 Data size: 8640000000
Basic stats: COMPLETE Column stats: COMPLETE
- value expressions: _col1 (type: char(16))
- Execution mode: vectorized, llap
- LLAP IO: may be used (ACID table)
- Map 11
+ Map 13
Map Operator Tree:
TableScan
alias: date_dim
@@ -130,6 +78,22 @@ STAGE PLANS:
sort order: +
Map-reduce partition columns: _col0 (type: bigint)
Statistics: Num rows: 367 Data size: 2936 Basic stats:
COMPLETE Column stats: COMPLETE
+ Select Operator
+ expressions: _col0 (type: bigint)
+ outputColumnNames: _col0
+ Statistics: Num rows: 367 Data size: 2936 Basic stats:
COMPLETE Column stats: COMPLETE
+ Group By Operator
+ keys: _col0 (type: bigint)
+ minReductionHashAggr: 0.4
+ mode: hash
+ outputColumnNames: _col0
+ Statistics: Num rows: 367 Data size: 2936 Basic
stats: COMPLETE Column stats: COMPLETE
+ Dynamic Partitioning Event Operator
+ Target column: sr_returned_date_sk (bigint)
+ Target Input: store_returns
+ Partition key expr: sr_returned_date_sk
+ Statistics: Num rows: 367 Data size: 2936 Basic
stats: COMPLETE Column stats: COMPLETE
+ Target Vertex: Map 9
Reduce Output Operator
key expressions: _col0 (type: bigint)
null sort order: z
@@ -154,7 +118,7 @@ STAGE PLANS:
Target Vertex: Map 1
Execution mode: vectorized, llap
LLAP IO: may be used (ACID table)
- Map 8
+ Map 6
Map Operator Tree:
TableScan
alias: store
@@ -190,18 +154,145 @@ STAGE PLANS:
value expressions: _col0 (type: bigint), _col1
(type: bigint), _col2 (type: binary)
Execution mode: vectorized, llap
LLAP IO: may be used (ACID table)
+ Map 8
+ Map Operator Tree:
+ TableScan
+ alias: customer
+ filterExpr: (c_customer_sk BETWEEN
DynamicValue(RS_43_store_returns_sr_customer_sk_min) AND
DynamicValue(RS_43_store_returns_sr_customer_sk_max) and
in_bloom_filter(c_customer_sk,
DynamicValue(RS_43_store_returns_sr_customer_sk_bloom_filter))) (type: boolean)
+ Statistics: Num rows: 80000000 Data size: 8640000000 Basic
stats: COMPLETE Column stats: COMPLETE
+ Filter Operator
+ predicate: (c_customer_sk BETWEEN
DynamicValue(RS_43_store_returns_sr_customer_sk_min) AND
DynamicValue(RS_43_store_returns_sr_customer_sk_max) and
in_bloom_filter(c_customer_sk,
DynamicValue(RS_43_store_returns_sr_customer_sk_bloom_filter))) (type: boolean)
+ Statistics: Num rows: 80000000 Data size: 8640000000 Basic
stats: COMPLETE Column stats: COMPLETE
+ Select Operator
+ expressions: c_customer_sk (type: bigint), c_customer_id
(type: char(16))
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 80000000 Data size: 8640000000
Basic stats: COMPLETE Column stats: COMPLETE
+ Reduce Output Operator
+ key expressions: _col0 (type: bigint)
+ null sort order: z
+ sort order: +
+ Map-reduce partition columns: _col0 (type: bigint)
+ Statistics: Num rows: 80000000 Data size: 8640000000
Basic stats: COMPLETE Column stats: COMPLETE
+ value expressions: _col1 (type: char(16))
+ Execution mode: vectorized, llap
+ LLAP IO: may be used (ACID table)
+ Map 9
+ Map Operator Tree:
+ TableScan
+ alias: store_returns
+ filterExpr: (sr_store_sk is not null and sr_store_sk BETWEEN
DynamicValue(RS_41_store_s_store_sk_min) AND
DynamicValue(RS_41_store_s_store_sk_max) and in_bloom_filter(sr_store_sk,
DynamicValue(RS_41_store_s_store_sk_bloom_filter))) (type: boolean)
+ Statistics: Num rows: 8332595709 Data size: 1113890910776
Basic stats: COMPLETE Column stats: COMPLETE
+ Filter Operator
+ predicate: (sr_store_sk is not null and sr_store_sk
BETWEEN DynamicValue(RS_41_store_s_store_sk_min) AND
DynamicValue(RS_41_store_s_store_sk_max) and in_bloom_filter(sr_store_sk,
DynamicValue(RS_41_store_s_store_sk_bloom_filter))) (type: boolean)
+ Statistics: Num rows: 8180935974 Data size: 1093617228248
Basic stats: COMPLETE Column stats: COMPLETE
+ Select Operator
+ expressions: sr_customer_sk (type: bigint), sr_store_sk
(type: bigint), sr_fee (type: decimal(7,2)), sr_returned_date_sk (type: bigint)
+ outputColumnNames: _col0, _col1, _col2, _col3
+ Statistics: Num rows: 8180935974 Data size:
1093617228248 Basic stats: COMPLETE Column stats: COMPLETE
+ Map Join Operator
+ condition map:
+ Inner Join 0 to 1
+ keys:
+ 0 _col3 (type: bigint)
+ 1 _col0 (type: bigint)
+ outputColumnNames: _col0, _col1, _col2
+ input vertices:
+ 1 Map 13
+ Statistics: Num rows: 1499681380 Data size:
172969152424 Basic stats: COMPLETE Column stats: COMPLETE
+ Group By Operator
+ aggregations: sum(_col2)
+ keys: _col1 (type: bigint), _col0 (type: bigint)
+ minReductionHashAggr: 0.87820673
+ mode: hash
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 1499681380 Data size:
189585719944 Basic stats: COMPLETE Column stats: COMPLETE
+ Reduce Output Operator
+ key expressions: _col0 (type: bigint), _col1
(type: bigint)
+ null sort order: zz
+ sort order: ++
+ Map-reduce partition columns: _col0 (type:
bigint), _col1 (type: bigint)
+ Statistics: Num rows: 1499681380 Data size:
189585719944 Basic stats: COMPLETE Column stats: COMPLETE
+ value expressions: _col2 (type: decimal(17,2))
+ Execution mode: vectorized, llap
+ LLAP IO: may be used (ACID table)
+ Reducer 10
+ Execution mode: vectorized, llap
+ Reduce Operator Tree:
+ Group By Operator
+ aggregations: sum(VALUE._col0)
+ keys: KEY._col0 (type: bigint), KEY._col1 (type: bigint)
+ mode: mergepartial
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 1499681380 Data size: 189585719944 Basic
stats: COMPLETE Column stats: COMPLETE
+ Select Operator
+ expressions: _col0 (type: bigint), _col2 (type:
decimal(17,2))
+ outputColumnNames: _col1, _col2
+ Statistics: Num rows: 1499681380 Data size: 189585719944
Basic stats: COMPLETE Column stats: COMPLETE
+ Group By Operator
+ aggregations: sum(_col2), count(_col2)
+ keys: _col1 (type: bigint)
+ minReductionHashAggr: 0.99
+ mode: hash
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 119301 Data size: 15175776 Basic
stats: COMPLETE Column stats: COMPLETE
+ Reduce Output Operator
+ key expressions: _col0 (type: bigint)
+ null sort order: z
+ sort order: +
+ Map-reduce partition columns: _col0 (type: bigint)
+ Statistics: Num rows: 119301 Data size: 15175776 Basic
stats: COMPLETE Column stats: COMPLETE
+ value expressions: _col1 (type: decimal(27,2)), _col2
(type: bigint)
+ Reducer 11
+ Execution mode: vectorized, llap
+ Reduce Operator Tree:
+ Group By Operator
+ aggregations: sum(VALUE._col0), count(VALUE._col1)
+ keys: KEY._col0 (type: bigint)
+ mode: mergepartial
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 161 Data size: 20488 Basic stats:
COMPLETE Column stats: COMPLETE
+ Filter Operator
+ predicate: CAST( (_col1 / _col2) AS decimal(21,6)) is not
null (type: boolean)
+ Statistics: Num rows: 161 Data size: 20488 Basic stats:
COMPLETE Column stats: COMPLETE
+ Select Operator
+ expressions: (CAST( (_col1 / _col2) AS decimal(21,6)) *
1.2) (type: decimal(24,7)), _col0 (type: bigint)
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 161 Data size: 19200 Basic stats:
COMPLETE Column stats: COMPLETE
+ Reduce Output Operator
+ key expressions: _col1 (type: bigint)
+ null sort order: z
+ sort order: +
+ Map-reduce partition columns: _col1 (type: bigint)
+ Statistics: Num rows: 161 Data size: 19200 Basic stats:
COMPLETE Column stats: COMPLETE
+ value expressions: _col0 (type: decimal(24,7))
+ Select Operator
+ expressions: _col1 (type: bigint)
+ outputColumnNames: _col1
+ Statistics: Num rows: 161 Data size: 1168 Basic stats:
COMPLETE Column stats: COMPLETE
+ Group By Operator
+ aggregations: min(_col1), max(_col1),
bloom_filter(_col1, expectedEntries=160)
+ minReductionHashAggr: 0.99
+ mode: hash
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 1 Data size: 160 Basic stats:
COMPLETE Column stats: COMPLETE
+ Reduce Output Operator
+ null sort order:
+ sort order:
+ Statistics: Num rows: 1 Data size: 160 Basic stats:
COMPLETE Column stats: COMPLETE
+ value expressions: _col0 (type: bigint), _col1
(type: bigint), _col2 (type: binary)
Reducer 12
Execution mode: vectorized, llap
Reduce Operator Tree:
- Select Operator
- expressions: KEY.reducesinkkey0 (type: bigint)
- outputColumnNames: _col0
+ Group By Operator
+ aggregations: min(VALUE._col0), max(VALUE._col1),
bloom_filter(VALUE._col2, 1, expectedEntries=160)
+ mode: final
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
Reduce Output Operator
- key expressions: _col0 (type: bigint)
- null sort order: z
- sort order: +
- Map-reduce partition columns: _col0 (type: bigint)
- Statistics: Num rows: 367 Data size: 2936 Basic stats:
COMPLETE Column stats: COMPLETE
+ null sort order:
+ sort order:
+ Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
+ value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: binary)
Reducer 2
Execution mode: vectorized, llap
Reduce Operator Tree:
@@ -226,7 +317,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col0, _col1, _col2
input vertices:
- 1 Map 8
+ 1 Map 6
Statistics: Num rows: 33743267 Data size: 3779245920
Basic stats: COMPLETE Column stats: COMPLETE
Reduce Output Operator
key expressions: _col0 (type: bigint)
@@ -261,7 +352,7 @@ STAGE PLANS:
1 KEY.reducesinkkey0 (type: bigint)
outputColumnNames: _col1, _col2, _col5
input vertices:
- 1 Map 10
+ 1 Map 8
Statistics: Num rows: 33743267 Data size: 7153572612 Basic
stats: COMPLETE Column stats: COMPLETE
DynamicPartitionHashJoin: true
Map Join Operator
@@ -272,7 +363,7 @@ STAGE PLANS:
1 _col1 (type: bigint)
outputColumnNames: _col2, _col5, _col6
input vertices:
- 1 Reducer 7
+ 1 Reducer 11
Statistics: Num rows: 33954162 Data size: 11001148488 Basic
stats: COMPLETE Column stats: COMPLETE
Filter Operator
predicate: (_col2 > _col6) (type: boolean)
@@ -322,57 +413,7 @@ STAGE PLANS:
sort order:
Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: binary)
- Reducer 6
- Execution mode: vectorized, llap
- Reduce Operator Tree:
- Group By Operator
- aggregations: sum(VALUE._col0)
- keys: KEY._col0 (type: bigint), KEY._col1 (type: bigint)
- mode: mergepartial
- outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 1499681380 Data size: 189585719944 Basic
stats: COMPLETE Column stats: COMPLETE
- Select Operator
- expressions: _col0 (type: bigint), _col2 (type:
decimal(17,2))
- outputColumnNames: _col1, _col2
- Statistics: Num rows: 1499681380 Data size: 189585719944
Basic stats: COMPLETE Column stats: COMPLETE
- Group By Operator
- aggregations: sum(_col2), count(_col2)
- keys: _col1 (type: bigint)
- minReductionHashAggr: 0.99
- mode: hash
- outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 119301 Data size: 15175776 Basic
stats: COMPLETE Column stats: COMPLETE
- Reduce Output Operator
- key expressions: _col0 (type: bigint)
- null sort order: z
- sort order: +
- Map-reduce partition columns: _col0 (type: bigint)
- Statistics: Num rows: 119301 Data size: 15175776 Basic
stats: COMPLETE Column stats: COMPLETE
- value expressions: _col1 (type: decimal(27,2)), _col2
(type: bigint)
Reducer 7
- Execution mode: vectorized, llap
- Reduce Operator Tree:
- Group By Operator
- aggregations: sum(VALUE._col0), count(VALUE._col1)
- keys: KEY._col0 (type: bigint)
- mode: mergepartial
- outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 161 Data size: 20488 Basic stats:
COMPLETE Column stats: COMPLETE
- Filter Operator
- predicate: CAST( (_col1 / _col2) AS decimal(21,6)) is not
null (type: boolean)
- Statistics: Num rows: 161 Data size: 20488 Basic stats:
COMPLETE Column stats: COMPLETE
- Select Operator
- expressions: (CAST( (_col1 / _col2) AS decimal(21,6)) *
1.2) (type: decimal(24,7)), _col0 (type: bigint)
- outputColumnNames: _col0, _col1
- Statistics: Num rows: 161 Data size: 19200 Basic stats:
COMPLETE Column stats: COMPLETE
- Reduce Output Operator
- key expressions: _col1 (type: bigint)
- null sort order: z
- sort order: +
- Map-reduce partition columns: _col1 (type: bigint)
- Statistics: Num rows: 161 Data size: 19200 Basic stats:
COMPLETE Column stats: COMPLETE
- value expressions: _col0 (type: decimal(24,7))
- Reducer 9
Execution mode: vectorized, llap
Reduce Operator Tree:
Group By Operator
@@ -385,6 +426,11 @@ STAGE PLANS:
sort order:
Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: binary)
+ Reduce Output Operator
+ null sort order:
+ sort order:
+ Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
+ value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: binary)
Stage: Stage-0
Fetch Operator
diff --git
a/ql/src/test/results/clientpositive/perf/tpcds30tb/tez/query24.q.out
b/ql/src/test/results/clientpositive/perf/tpcds30tb/tez/query24.q.out
index dd67455587e..68170ac209a 100644
--- a/ql/src/test/results/clientpositive/perf/tpcds30tb/tez/query24.q.out
+++ b/ql/src/test/results/clientpositive/perf/tpcds30tb/tez/query24.q.out
@@ -8,16 +8,17 @@ STAGE PLANS:
Tez
#### A masked pattern was here ####
Edges:
- Map 1 <- Reducer 12 (BROADCAST_EDGE)
+ Map 1 <- Reducer 12 (BROADCAST_EDGE), Reducer 13 (BROADCAST_EDGE),
Reducer 15 (BROADCAST_EDGE)
Map 11 <- Map 9 (BROADCAST_EDGE)
- Map 8 <- Reducer 14 (BROADCAST_EDGE)
+ Map 8 <- Reducer 15 (BROADCAST_EDGE)
Map 9 <- Map 10 (BROADCAST_EDGE)
Reducer 12 <- Map 11 (CUSTOM_SIMPLE_EDGE)
- Reducer 14 <- Map 13 (CUSTOM_SIMPLE_EDGE)
- Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE), Map 11 (BROADCAST_EDGE), Map
13 (BROADCAST_EDGE), Map 8 (CUSTOM_SIMPLE_EDGE)
+ Reducer 13 <- Map 11 (CUSTOM_SIMPLE_EDGE)
+ Reducer 15 <- Map 14 (CUSTOM_SIMPLE_EDGE)
+ Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE), Map 11 (BROADCAST_EDGE), Map
14 (BROADCAST_EDGE), Map 8 (CUSTOM_SIMPLE_EDGE)
Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
Reducer 4 <- Reducer 3 (SIMPLE_EDGE)
- Reducer 5 <- Map 1 (CUSTOM_SIMPLE_EDGE), Map 11 (BROADCAST_EDGE), Map
13 (BROADCAST_EDGE), Map 15 (CUSTOM_SIMPLE_EDGE)
+ Reducer 5 <- Map 1 (CUSTOM_SIMPLE_EDGE), Map 11 (BROADCAST_EDGE), Map
14 (BROADCAST_EDGE), Map 16 (CUSTOM_SIMPLE_EDGE)
Reducer 6 <- Reducer 5 (SIMPLE_EDGE)
Reducer 7 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (CUSTOM_SIMPLE_EDGE)
#### A masked pattern was here ####
@@ -26,10 +27,10 @@ STAGE PLANS:
Map Operator Tree:
TableScan
alias: store_sales
- filterExpr: (ss_store_sk is not null and ss_customer_sk is
not null and ss_store_sk BETWEEN DynamicValue(RS[300]_col0) AND
DynamicValue(RS[300]_col1) and ss_customer_sk BETWEEN
DynamicValue(RS[300]_col2) AND DynamicValue(RS[300]_col3) and
in_bloom_filter(hash(ss_store_sk,ss_customer_sk), DynamicValue(RS[300]_col4)))
(type: boolean)
+ filterExpr: (ss_store_sk is not null and ss_customer_sk is
not null and ((ss_item_sk BETWEEN DynamicValue(RS_32_item_i_item_sk_min) AND
DynamicValue(RS_32_item_i_item_sk_max) and in_bloom_filter(ss_item_sk,
DynamicValue(RS_32_item_i_item_sk_bloom_filter)) and ss_store_sk BETWEEN
DynamicValue(RS[300]_col0) AND DynamicValue(RS[300]_col1) and ss_customer_sk
BETWEEN DynamicValue(RS[300]_col2) AND DynamicValue(RS[300]_col3) and
in_bloom_filter(hash(ss_store_sk,ss_customer_sk [...]
Statistics: Num rows: 86404891377 Data size: 11944483020904
Basic stats: COMPLETE Column stats: COMPLETE
Filter Operator
- predicate: (ss_store_sk is not null and ss_customer_sk is
not null and ss_store_sk BETWEEN DynamicValue(RS[300]_col0) AND
DynamicValue(RS[300]_col1) and ss_customer_sk BETWEEN
DynamicValue(RS[300]_col2) AND DynamicValue(RS[300]_col3) and
in_bloom_filter(hash(ss_store_sk,ss_customer_sk), DynamicValue(RS[300]_col4)))
(type: boolean)
+ predicate: (ss_store_sk is not null and ss_customer_sk is
not null and ss_store_sk BETWEEN DynamicValue(RS[300]_col0) AND
DynamicValue(RS[300]_col1) and ss_customer_sk BETWEEN
DynamicValue(RS[300]_col2) AND DynamicValue(RS[300]_col3) and ss_item_sk
BETWEEN DynamicValue(RS_32_item_i_item_sk_min) AND
DynamicValue(RS_32_item_i_item_sk_max) and
in_bloom_filter(hash(ss_store_sk,ss_customer_sk), DynamicValue(RS[300]_col4))
and in_bloom_filter(ss_item_sk, DynamicValue(RS_32_ [...]
Statistics: Num rows: 78797296641 Data size:
10892820496840 Basic stats: COMPLETE Column stats: COMPLETE
Select Operator
expressions: ss_item_sk (type: bigint), ss_customer_sk
(type: bigint), ss_store_sk (type: bigint), ss_ticket_number (type: bigint),
ss_sales_price (type: decimal(7,2))
@@ -42,6 +43,13 @@ STAGE PLANS:
Map-reduce partition columns: _col0 (type: bigint),
_col3 (type: bigint)
Statistics: Num rows: 78797296641 Data size:
10892820496840 Basic stats: COMPLETE Column stats: COMPLETE
value expressions: _col1 (type: bigint), _col2 (type:
bigint), _col4 (type: decimal(7,2))
+ Filter Operator
+ predicate: (ss_store_sk is not null and ss_customer_sk is
not null and ss_store_sk BETWEEN DynamicValue(RS[315]_col0) AND
DynamicValue(RS[315]_col1) and ss_customer_sk BETWEEN
DynamicValue(RS[315]_col2) AND DynamicValue(RS[315]_col3) and
in_bloom_filter(hash(ss_store_sk,ss_customer_sk), DynamicValue(RS[315]_col4)))
(type: boolean)
+ Statistics: Num rows: 78797296641 Data size:
10892820496840 Basic stats: COMPLETE Column stats: COMPLETE
+ Select Operator
+ expressions: ss_item_sk (type: bigint), ss_customer_sk
(type: bigint), ss_store_sk (type: bigint), ss_ticket_number (type: bigint),
ss_sales_price (type: decimal(7,2))
+ outputColumnNames: _col0, _col1, _col2, _col3, _col4
+ Statistics: Num rows: 78797296641 Data size:
10892820496840 Basic stats: COMPLETE Column stats: COMPLETE
Reduce Output Operator
key expressions: _col0 (type: bigint), _col3 (type:
bigint)
null sort order: zz
@@ -132,9 +140,24 @@ STAGE PLANS:
Map-reduce partition columns: _col9 (type:
bigint), _col0 (type: bigint)
Statistics: Num rows: 7981221 Data size:
3639436776 Basic stats: COMPLETE Column stats: COMPLETE
value expressions: _col2 (type: char(20)), _col3
(type: char(30)), _col6 (type: char(2)), _col10 (type: varchar(50)), _col11
(type: char(2))
+ Select Operator
+ expressions: _col9 (type: bigint), _col0 (type:
bigint), hash(_col9,_col0) (type: int)
+ outputColumnNames: _col0, _col1, _col3
+ Statistics: Num rows: 7981221 Data size:
159624420 Basic stats: COMPLETE Column stats: COMPLETE
+ Group By Operator
+ aggregations: min(_col0), max(_col0),
min(_col1), max(_col1), bloom_filter(_col3, expectedEntries=7981221)
+ minReductionHashAggr: 0.99
+ mode: hash
+ outputColumnNames: _col0, _col1, _col2, _col3,
_col4
+ Statistics: Num rows: 1 Data size: 176 Basic
stats: COMPLETE Column stats: COMPLETE
+ Reduce Output Operator
+ null sort order:
+ sort order:
+ Statistics: Num rows: 1 Data size: 176 Basic
stats: COMPLETE Column stats: COMPLETE
+ value expressions: _col0 (type: bigint),
_col1 (type: bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type:
binary)
Execution mode: vectorized, llap
LLAP IO: may be used (ACID table)
- Map 13
+ Map 14
Map Operator Tree:
TableScan
alias: item
@@ -181,7 +204,7 @@ STAGE PLANS:
value expressions: _col1 (type: decimal(7,2)), _col2
(type: char(20)), _col3 (type: char(20)), _col4 (type: char(10)), _col5 (type:
int)
Execution mode: vectorized, llap
LLAP IO: may be used (ACID table)
- Map 15
+ Map 16
Map Operator Tree:
TableScan
alias: store_returns
@@ -265,7 +288,20 @@ STAGE PLANS:
sort order:
Statistics: Num rows: 1 Data size: 176 Basic stats: COMPLETE
Column stats: COMPLETE
value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: binary)
- Reducer 14
+ Reducer 13
+ Execution mode: vectorized, llap
+ Reduce Operator Tree:
+ Group By Operator
+ aggregations: min(VALUE._col0), max(VALUE._col1),
min(VALUE._col2), max(VALUE._col3), bloom_filter(VALUE._col4, 1,
expectedEntries=7981221)
+ mode: final
+ outputColumnNames: _col0, _col1, _col2, _col3, _col4
+ Statistics: Num rows: 1 Data size: 176 Basic stats: COMPLETE
Column stats: COMPLETE
+ Reduce Output Operator
+ null sort order:
+ sort order:
+ Statistics: Num rows: 1 Data size: 176 Basic stats: COMPLETE
Column stats: COMPLETE
+ value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: bigint), _col3 (type: bigint), _col4 (type: binary)
+ Reducer 15
Execution mode: vectorized, llap
Reduce Operator Tree:
Group By Operator
@@ -278,6 +314,11 @@ STAGE PLANS:
sort order:
Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: binary)
+ Reduce Output Operator
+ null sort order:
+ sort order:
+ Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
+ value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: binary)
Reducer 2
Execution mode: vectorized, llap
Reduce Operator Tree:
@@ -310,7 +351,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col4, _col9, _col10, _col13, _col17,
_col18, _col21, _col22, _col23, _col24
input vertices:
- 1 Map 13
+ 1 Map 14
Statistics: Num rows: 101092197 Data size: 73999487040
Basic stats: COMPLETE Column stats: COMPLETE
Group By Operator
aggregations: sum(_col4)
@@ -385,7 +426,7 @@ STAGE PLANS:
1 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1
(type: bigint)
outputColumnNames: _col0, _col1, _col2, _col4
input vertices:
- 1 Map 15
+ 1 Map 16
Statistics: Num rows: 94492919160 Data size: 12397046786296
Basic stats: COMPLETE Column stats: COMPLETE
DynamicPartitionHashJoin: true
Map Join Operator
@@ -406,7 +447,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col4, _col9, _col10, _col13, _col17,
_col18, _col21, _col22, _col23, _col24, _col25
input vertices:
- 1 Map 13
+ 1 Map 14
Statistics: Num rows: 9604070077 Data size: 8563387868485
Basic stats: COMPLETE Column stats: COMPLETE
Group By Operator
aggregations: sum(_col4)
diff --git
a/ql/src/test/results/clientpositive/perf/tpcds30tb/tez/query64.q.out
b/ql/src/test/results/clientpositive/perf/tpcds30tb/tez/query64.q.out
index aca46ad2c1e..d38476f52ca 100644
--- a/ql/src/test/results/clientpositive/perf/tpcds30tb/tez/query64.q.out
+++ b/ql/src/test/results/clientpositive/perf/tpcds30tb/tez/query64.q.out
@@ -7,43 +7,45 @@ STAGE PLANS:
Tez
#### A masked pattern was here ####
Edges:
- Map 1 <- Map 17 (BROADCAST_EDGE), Map 29 (BROADCAST_EDGE)
- Map 11 <- Reducer 30 (BROADCAST_EDGE), Reducer 31 (BROADCAST_EDGE)
- Map 23 <- Map 17 (BROADCAST_EDGE), Map 29 (BROADCAST_EDGE), Reducer 5
(BROADCAST_EDGE)
- Map 33 <- Reducer 30 (BROADCAST_EDGE), Reducer 31 (BROADCAST_EDGE)
- Map 6 <- Reducer 30 (BROADCAST_EDGE), Reducer 31 (BROADCAST_EDGE)
- Reducer 10 <- Reducer 9 (SIMPLE_EDGE)
- Reducer 15 <- Map 14 (SIMPLE_EDGE)
- Reducer 16 <- Map 14 (SIMPLE_EDGE)
- Reducer 18 <- Map 17 (SIMPLE_EDGE)
- Reducer 19 <- Map 17 (SIMPLE_EDGE)
- Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE), Map 12 (BROADCAST_EDGE), Map
32 (CUSTOM_SIMPLE_EDGE), Reducer 8 (BROADCAST_EDGE)
- Reducer 21 <- Map 20 (SIMPLE_EDGE)
- Reducer 22 <- Map 20 (SIMPLE_EDGE)
- Reducer 24 <- Map 12 (BROADCAST_EDGE), Map 23 (CUSTOM_SIMPLE_EDGE),
Map 32 (CUSTOM_SIMPLE_EDGE), Reducer 10 (BROADCAST_EDGE)
- Reducer 25 <- Map 12 (BROADCAST_EDGE), Map 13 (BROADCAST_EDGE), Map 14
(BROADCAST_EDGE), Map 17 (BROADCAST_EDGE), Map 20 (BROADCAST_EDGE), Map 33
(CUSTOM_SIMPLE_EDGE), Reducer 16 (BROADCAST_EDGE), Reducer 19 (BROADCAST_EDGE),
Reducer 22 (BROADCAST_EDGE), Reducer 24 (CUSTOM_SIMPLE_EDGE)
- Reducer 26 <- Reducer 25 (SIMPLE_EDGE)
- Reducer 27 <- Reducer 26 (CUSTOM_SIMPLE_EDGE), Reducer 4
(CUSTOM_SIMPLE_EDGE)
+ Map 1 <- Map 19 (BROADCAST_EDGE), Map 31 (BROADCAST_EDGE), Reducer 9
(BROADCAST_EDGE)
+ Map 13 <- Reducer 32 (BROADCAST_EDGE), Reducer 33 (BROADCAST_EDGE)
+ Map 25 <- Map 19 (BROADCAST_EDGE), Map 31 (BROADCAST_EDGE), Reducer 12
(BROADCAST_EDGE), Reducer 5 (BROADCAST_EDGE)
+ Map 35 <- Reducer 12 (BROADCAST_EDGE), Reducer 32 (BROADCAST_EDGE),
Reducer 33 (BROADCAST_EDGE), Reducer 9 (BROADCAST_EDGE)
+ Map 6 <- Reducer 32 (BROADCAST_EDGE), Reducer 33 (BROADCAST_EDGE)
+ Reducer 10 <- Map 13 (CUSTOM_SIMPLE_EDGE), Map 6 (CUSTOM_SIMPLE_EDGE)
+ Reducer 11 <- Reducer 10 (SIMPLE_EDGE)
+ Reducer 12 <- Reducer 11 (CUSTOM_SIMPLE_EDGE)
+ Reducer 17 <- Map 16 (SIMPLE_EDGE)
+ Reducer 18 <- Map 16 (SIMPLE_EDGE)
+ Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE), Map 14 (BROADCAST_EDGE), Map
34 (CUSTOM_SIMPLE_EDGE), Reducer 8 (BROADCAST_EDGE)
+ Reducer 20 <- Map 19 (SIMPLE_EDGE)
+ Reducer 21 <- Map 19 (SIMPLE_EDGE)
+ Reducer 23 <- Map 22 (SIMPLE_EDGE)
+ Reducer 24 <- Map 22 (SIMPLE_EDGE)
+ Reducer 26 <- Map 14 (BROADCAST_EDGE), Map 25 (CUSTOM_SIMPLE_EDGE),
Map 34 (CUSTOM_SIMPLE_EDGE), Reducer 11 (BROADCAST_EDGE)
+ Reducer 27 <- Map 14 (BROADCAST_EDGE), Map 15 (BROADCAST_EDGE), Map 16
(BROADCAST_EDGE), Map 19 (BROADCAST_EDGE), Map 22 (BROADCAST_EDGE), Map 35
(CUSTOM_SIMPLE_EDGE), Reducer 18 (BROADCAST_EDGE), Reducer 21 (BROADCAST_EDGE),
Reducer 24 (BROADCAST_EDGE), Reducer 26 (CUSTOM_SIMPLE_EDGE)
Reducer 28 <- Reducer 27 (SIMPLE_EDGE)
- Reducer 3 <- Map 12 (BROADCAST_EDGE), Map 13 (BROADCAST_EDGE), Map 14
(BROADCAST_EDGE), Map 17 (BROADCAST_EDGE), Map 20 (BROADCAST_EDGE), Map 33
(CUSTOM_SIMPLE_EDGE), Reducer 15 (BROADCAST_EDGE), Reducer 18 (BROADCAST_EDGE),
Reducer 2 (CUSTOM_SIMPLE_EDGE), Reducer 21 (BROADCAST_EDGE)
- Reducer 30 <- Map 29 (CUSTOM_SIMPLE_EDGE)
- Reducer 31 <- Map 29 (CUSTOM_SIMPLE_EDGE)
+ Reducer 29 <- Reducer 28 (CUSTOM_SIMPLE_EDGE), Reducer 4
(CUSTOM_SIMPLE_EDGE)
+ Reducer 3 <- Map 14 (BROADCAST_EDGE), Map 15 (BROADCAST_EDGE), Map 16
(BROADCAST_EDGE), Map 19 (BROADCAST_EDGE), Map 22 (BROADCAST_EDGE), Map 35
(CUSTOM_SIMPLE_EDGE), Reducer 17 (BROADCAST_EDGE), Reducer 2
(CUSTOM_SIMPLE_EDGE), Reducer 20 (BROADCAST_EDGE), Reducer 23 (BROADCAST_EDGE)
+ Reducer 30 <- Reducer 29 (SIMPLE_EDGE)
+ Reducer 32 <- Map 31 (CUSTOM_SIMPLE_EDGE)
+ Reducer 33 <- Map 31 (CUSTOM_SIMPLE_EDGE)
Reducer 4 <- Reducer 3 (SIMPLE_EDGE)
Reducer 5 <- Reducer 4 (CUSTOM_SIMPLE_EDGE)
- Reducer 7 <- Map 11 (CUSTOM_SIMPLE_EDGE), Map 6 (CUSTOM_SIMPLE_EDGE)
+ Reducer 7 <- Map 13 (CUSTOM_SIMPLE_EDGE), Map 6 (CUSTOM_SIMPLE_EDGE)
Reducer 8 <- Reducer 7 (SIMPLE_EDGE)
- Reducer 9 <- Map 11 (CUSTOM_SIMPLE_EDGE), Map 6 (CUSTOM_SIMPLE_EDGE)
+ Reducer 9 <- Reducer 8 (CUSTOM_SIMPLE_EDGE)
#### A masked pattern was here ####
Vertices:
Map 1
Map Operator Tree:
TableScan
alias: store_sales
- filterExpr: (ss_cdemo_sk is not null and ss_addr_sk is not
null and ss_hdemo_sk is not null and ss_customer_sk is not null and ss_store_sk
is not null and ss_promo_sk is not null) (type: boolean)
+ filterExpr: (ss_cdemo_sk is not null and ss_addr_sk is not
null and ss_hdemo_sk is not null and ss_customer_sk is not null and ss_store_sk
is not null and ss_promo_sk is not null and ss_item_sk BETWEEN
DynamicValue(RS_58_catalog_sales_cs_item_sk_min) AND
DynamicValue(RS_58_catalog_sales_cs_item_sk_max) and
in_bloom_filter(ss_item_sk,
DynamicValue(RS_58_catalog_sales_cs_item_sk_bloom_filter))) (type: boolean)
probeDecodeDetails: cacheKey:HASH_MAP_MAPJOIN_993_container,
bigKeyColName:ss_item_sk, smallTablePos:1, keyRatio:1.8543009129597497E-9
Statistics: Num rows: 82510879939 Data size: 32917667058984
Basic stats: COMPLETE Column stats: COMPLETE
Filter Operator
- predicate: (ss_cdemo_sk is not null and ss_addr_sk is not
null and ss_hdemo_sk is not null and ss_customer_sk is not null and ss_store_sk
is not null and ss_promo_sk is not null) (type: boolean)
+ predicate: (ss_cdemo_sk is not null and ss_addr_sk is not
null and ss_hdemo_sk is not null and ss_customer_sk is not null and ss_store_sk
is not null and ss_promo_sk is not null and ss_item_sk BETWEEN
DynamicValue(RS_58_catalog_sales_cs_item_sk_min) AND
DynamicValue(RS_58_catalog_sales_cs_item_sk_max) and
in_bloom_filter(ss_item_sk,
DynamicValue(RS_58_catalog_sales_cs_item_sk_bloom_filter))) (type: boolean)
Statistics: Num rows: 71511093715 Data size:
28529308809584 Basic stats: COMPLETE Column stats: COMPLETE
Select Operator
expressions: ss_item_sk (type: bigint), ss_customer_sk
(type: bigint), ss_cdemo_sk (type: bigint), ss_hdemo_sk (type: bigint),
ss_addr_sk (type: bigint), ss_store_sk (type: bigint), ss_ticket_number (type:
bigint), ss_wholesale_cost (type: decimal(7,2)), ss_list_price (type:
decimal(7,2)), ss_coupon_amt (type: decimal(7,2)), ss_sold_date_sk (type:
bigint)
@@ -57,7 +59,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col0, _col1, _col2, _col3, _col4,
_col5, _col6, _col7, _col8, _col9, _col10, _col11
input vertices:
- 1 Map 29
+ 1 Map 31
Statistics: Num rows: 1300511220 Data size:
41616359416 Basic stats: COMPLETE Column stats: COMPLETE
Map Join Operator
condition map:
@@ -67,7 +69,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col0, _col1, _col2, _col3,
_col4, _col5, _col6, _col7, _col8, _col9, _col11
input vertices:
- 1 Map 17
+ 1 Map 19
Statistics: Num rows: 261380636 Data size:
6273135640 Basic stats: COMPLETE Column stats: COMPLETE
Reduce Output Operator
key expressions: _col1 (type: bigint)
@@ -78,7 +80,7 @@ STAGE PLANS:
value expressions: _col0 (type: bigint), _col2
(type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col5 (type:
bigint), _col6 (type: bigint), _col7 (type: decimal(7,2)), _col8 (type:
decimal(7,2)), _col9 (type: decimal(7,2)), _col11 (type: bigint)
Execution mode: vectorized, llap
LLAP IO: may be used (ACID table)
- Map 11
+ Map 13
Map Operator Tree:
TableScan
alias: catalog_returns
@@ -114,7 +116,7 @@ STAGE PLANS:
value expressions: _col2 (type: decimal(9,2))
Execution mode: vectorized, llap
LLAP IO: may be used (ACID table)
- Map 12
+ Map 14
Map Operator Tree:
TableScan
alias: ad1
@@ -153,7 +155,7 @@ STAGE PLANS:
value expressions: _col1 (type: char(10)), _col2 (type:
varchar(60)), _col3 (type: varchar(60)), _col4 (type: char(10))
Execution mode: vectorized, llap
LLAP IO: may be used (ACID table)
- Map 13
+ Map 15
Map Operator Tree:
TableScan
alias: store
@@ -182,7 +184,7 @@ STAGE PLANS:
value expressions: _col1 (type: varchar(50)), _col2
(type: char(10))
Execution mode: vectorized, llap
LLAP IO: may be used (ACID table)
- Map 14
+ Map 16
Map Operator Tree:
TableScan
alias: hd1
@@ -221,7 +223,7 @@ STAGE PLANS:
Statistics: Num rows: 7200 Data size: 57600 Basic
stats: COMPLETE Column stats: COMPLETE
Execution mode: vectorized, llap
LLAP IO: may be used (ACID table)
- Map 17
+ Map 19
Map Operator Tree:
TableScan
alias: d2
@@ -286,7 +288,7 @@ STAGE PLANS:
Target Input: store_sales
Partition key expr: ss_sold_date_sk
Statistics: Num rows: 367 Data size: 2936 Basic
stats: COMPLETE Column stats: COMPLETE
- Target Vertex: Map 23
+ Target Vertex: Map 25
Filter Operator
predicate: (d_year = 2001) (type: boolean)
Statistics: Num rows: 367 Data size: 4404 Basic stats:
COMPLETE Column stats: COMPLETE
@@ -318,7 +320,7 @@ STAGE PLANS:
Target Vertex: Map 1
Execution mode: vectorized, llap
LLAP IO: may be used (ACID table)
- Map 20
+ Map 22
Map Operator Tree:
TableScan
alias: cd1
@@ -357,15 +359,15 @@ STAGE PLANS:
value expressions: _col1 (type: char(1))
Execution mode: vectorized, llap
LLAP IO: may be used (ACID table)
- Map 23
+ Map 25
Map Operator Tree:
TableScan
alias: store_sales
- filterExpr: (ss_cdemo_sk is not null and ss_addr_sk is not
null and ss_hdemo_sk is not null and ss_customer_sk is not null and ss_store_sk
is not null and ss_promo_sk is not null and ss_item_sk BETWEEN
DynamicValue(RS_196_item_i_item_sk_min) AND
DynamicValue(RS_196_item_i_item_sk_max) and in_bloom_filter(ss_item_sk,
DynamicValue(RS_196_item_i_item_sk_bloom_filter))) (type: boolean)
+ filterExpr: (ss_cdemo_sk is not null and ss_addr_sk is not
null and ss_hdemo_sk is not null and ss_customer_sk is not null and ss_store_sk
is not null and ss_promo_sk is not null and ss_item_sk BETWEEN
DynamicValue(RS_156_catalog_sales_cs_item_sk_min) AND
DynamicValue(RS_156_catalog_sales_cs_item_sk_max) and ss_item_sk BETWEEN
DynamicValue(RS_196_item_i_item_sk_min) AND
DynamicValue(RS_196_item_i_item_sk_max) and in_bloom_filter(ss_item_sk,
DynamicValue(RS_156_catalog_s [...]
probeDecodeDetails:
cacheKey:HASH_MAP_MAPJOIN_1008_container, bigKeyColName:ss_item_sk,
smallTablePos:1, keyRatio:1.8543009129597497E-9
Statistics: Num rows: 82510879939 Data size: 32917667058984
Basic stats: COMPLETE Column stats: COMPLETE
Filter Operator
- predicate: (ss_cdemo_sk is not null and ss_addr_sk is not
null and ss_hdemo_sk is not null and ss_customer_sk is not null and ss_store_sk
is not null and ss_promo_sk is not null and ss_item_sk BETWEEN
DynamicValue(RS_196_item_i_item_sk_min) AND
DynamicValue(RS_196_item_i_item_sk_max) and in_bloom_filter(ss_item_sk,
DynamicValue(RS_196_item_i_item_sk_bloom_filter))) (type: boolean)
+ predicate: (ss_cdemo_sk is not null and ss_addr_sk is not
null and ss_hdemo_sk is not null and ss_customer_sk is not null and ss_store_sk
is not null and ss_promo_sk is not null and ss_item_sk BETWEEN
DynamicValue(RS_196_item_i_item_sk_min) AND
DynamicValue(RS_196_item_i_item_sk_max) and ss_item_sk BETWEEN
DynamicValue(RS_156_catalog_sales_cs_item_sk_min) AND
DynamicValue(RS_156_catalog_sales_cs_item_sk_max) and
in_bloom_filter(ss_item_sk, DynamicValue(RS_196_item_i_i [...]
Statistics: Num rows: 71511093715 Data size:
28529308809584 Basic stats: COMPLETE Column stats: COMPLETE
Select Operator
expressions: ss_item_sk (type: bigint), ss_customer_sk
(type: bigint), ss_cdemo_sk (type: bigint), ss_hdemo_sk (type: bigint),
ss_addr_sk (type: bigint), ss_store_sk (type: bigint), ss_ticket_number (type:
bigint), ss_wholesale_cost (type: decimal(7,2)), ss_list_price (type:
decimal(7,2)), ss_coupon_amt (type: decimal(7,2)), ss_sold_date_sk (type:
bigint)
@@ -379,7 +381,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col0, _col1, _col2, _col3, _col4,
_col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12
input vertices:
- 1 Map 29
+ 1 Map 31
Statistics: Num rows: 1300511220 Data size:
180771059956 Basic stats: COMPLETE Column stats: COMPLETE
Map Join Operator
condition map:
@@ -389,7 +391,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col0, _col1, _col2, _col3,
_col4, _col5, _col6, _col7, _col8, _col9, _col11, _col12
input vertices:
- 1 Map 17
+ 1 Map 19
Statistics: Num rows: 261380636 Data size:
34240863692 Basic stats: COMPLETE Column stats: COMPLETE
Reduce Output Operator
key expressions: _col1 (type: bigint)
@@ -400,7 +402,7 @@ STAGE PLANS:
value expressions: _col0 (type: bigint), _col2
(type: bigint), _col3 (type: bigint), _col4 (type: bigint), _col5 (type:
bigint), _col6 (type: bigint), _col7 (type: decimal(7,2)), _col8 (type:
decimal(7,2)), _col9 (type: decimal(7,2)), _col11 (type: bigint), _col12 (type:
char(50))
Execution mode: vectorized, llap
LLAP IO: may be used (ACID table)
- Map 29
+ Map 31
Map Operator Tree:
TableScan
alias: item
@@ -462,7 +464,7 @@ STAGE PLANS:
value expressions: _col0 (type: bigint), _col1
(type: bigint), _col2 (type: binary)
Execution mode: vectorized, llap
LLAP IO: may be used (ACID table)
- Map 32
+ Map 34
Map Operator Tree:
TableScan
alias: customer
@@ -491,14 +493,14 @@ STAGE PLANS:
value expressions: _col1 (type: bigint), _col2 (type:
bigint), _col3 (type: bigint), _col4 (type: bigint), _col5 (type: bigint)
Execution mode: vectorized, llap
LLAP IO: may be used (ACID table)
- Map 33
+ Map 35
Map Operator Tree:
TableScan
alias: store_returns
- filterExpr: ((sr_item_sk BETWEEN
DynamicValue(RS_147_item_i_item_sk_min) AND
DynamicValue(RS_147_item_i_item_sk_max) and in_bloom_filter(sr_item_sk,
DynamicValue(RS_147_item_i_item_sk_bloom_filter))) or (sr_item_sk BETWEEN
DynamicValue(RS_49_item_i_item_sk_min) AND
DynamicValue(RS_49_item_i_item_sk_max) and in_bloom_filter(sr_item_sk,
DynamicValue(RS_49_item_i_item_sk_bloom_filter)))) (type: boolean)
+ filterExpr: ((sr_item_sk BETWEEN
DynamicValue(RS_156_catalog_sales_cs_item_sk_min) AND
DynamicValue(RS_156_catalog_sales_cs_item_sk_max) and sr_item_sk BETWEEN
DynamicValue(RS_147_item_i_item_sk_min) AND
DynamicValue(RS_147_item_i_item_sk_max) and in_bloom_filter(sr_item_sk,
DynamicValue(RS_156_catalog_sales_cs_item_sk_bloom_filter)) and
in_bloom_filter(sr_item_sk, DynamicValue(RS_147_item_i_item_sk_bloom_filter)))
or (sr_item_sk BETWEEN DynamicValue(RS_58_catalog_sales [...]
Statistics: Num rows: 8634166995 Data size: 138146671920
Basic stats: COMPLETE Column stats: COMPLETE
Filter Operator
- predicate: (sr_item_sk BETWEEN
DynamicValue(RS_147_item_i_item_sk_min) AND
DynamicValue(RS_147_item_i_item_sk_max) and in_bloom_filter(sr_item_sk,
DynamicValue(RS_147_item_i_item_sk_bloom_filter))) (type: boolean)
+ predicate: (sr_item_sk BETWEEN
DynamicValue(RS_147_item_i_item_sk_min) AND
DynamicValue(RS_147_item_i_item_sk_max) and sr_item_sk BETWEEN
DynamicValue(RS_156_catalog_sales_cs_item_sk_min) AND
DynamicValue(RS_156_catalog_sales_cs_item_sk_max) and
in_bloom_filter(sr_item_sk, DynamicValue(RS_147_item_i_item_sk_bloom_filter))
and in_bloom_filter(sr_item_sk,
DynamicValue(RS_156_catalog_sales_cs_item_sk_bloom_filter))) (type: boolean)
Statistics: Num rows: 8634166995 Data size: 138146671920
Basic stats: COMPLETE Column stats: COMPLETE
Select Operator
expressions: sr_item_sk (type: bigint), sr_ticket_number
(type: bigint)
@@ -511,7 +513,7 @@ STAGE PLANS:
Map-reduce partition columns: _col0 (type: bigint),
_col1 (type: bigint)
Statistics: Num rows: 8634166995 Data size:
138146671920 Basic stats: COMPLETE Column stats: COMPLETE
Filter Operator
- predicate: (sr_item_sk BETWEEN
DynamicValue(RS_49_item_i_item_sk_min) AND
DynamicValue(RS_49_item_i_item_sk_max) and in_bloom_filter(sr_item_sk,
DynamicValue(RS_49_item_i_item_sk_bloom_filter))) (type: boolean)
+ predicate: (sr_item_sk BETWEEN
DynamicValue(RS_49_item_i_item_sk_min) AND
DynamicValue(RS_49_item_i_item_sk_max) and sr_item_sk BETWEEN
DynamicValue(RS_58_catalog_sales_cs_item_sk_min) AND
DynamicValue(RS_58_catalog_sales_cs_item_sk_max) and
in_bloom_filter(sr_item_sk, DynamicValue(RS_49_item_i_item_sk_bloom_filter))
and in_bloom_filter(sr_item_sk,
DynamicValue(RS_58_catalog_sales_cs_item_sk_bloom_filter))) (type: boolean)
Statistics: Num rows: 8634166995 Data size: 138146671920
Basic stats: COMPLETE Column stats: COMPLETE
Select Operator
expressions: sr_item_sk (type: bigint), sr_ticket_number
(type: bigint)
@@ -562,6 +564,34 @@ STAGE PLANS:
Execution mode: vectorized, llap
LLAP IO: may be used (ACID table)
Reducer 10
+ Execution mode: vectorized, llap
+ Reduce Operator Tree:
+ Map Join Operator
+ condition map:
+ Inner Join 0 to 1
+ keys:
+ 0 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1
(type: bigint)
+ 1 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1
(type: bigint)
+ outputColumnNames: _col0, _col2, _col5
+ input vertices:
+ 1 Map 13
+ Statistics: Num rows: 41876960211 Data size: 9691486353656
Basic stats: COMPLETE Column stats: COMPLETE
+ DynamicPartitionHashJoin: true
+ Group By Operator
+ aggregations: sum(_col2), sum(_col5)
+ keys: _col0 (type: bigint)
+ minReductionHashAggr: 0.99
+ mode: hash
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 16946565830 Data size: 3931603272560
Basic stats: COMPLETE Column stats: COMPLETE
+ Reduce Output Operator
+ key expressions: _col0 (type: bigint)
+ null sort order: z
+ sort order: +
+ Map-reduce partition columns: _col0 (type: bigint)
+ Statistics: Num rows: 16946565830 Data size: 3931603272560
Basic stats: COMPLETE Column stats: COMPLETE
+ value expressions: _col1 (type: decimal(17,2)), _col2
(type: decimal(19,2))
+ Reducer 11
Execution mode: vectorized, llap
Reduce Operator Tree:
Group By Operator
@@ -583,19 +613,40 @@ STAGE PLANS:
sort order: +
Map-reduce partition columns: _col0 (type: bigint)
Statistics: Num rows: 149211 Data size: 1193688 Basic
stats: COMPLETE Column stats: COMPLETE
- Reducer 15
+ Select Operator
+ expressions: _col0 (type: bigint)
+ outputColumnNames: _col0
+ Statistics: Num rows: 149211 Data size: 1193688 Basic
stats: COMPLETE Column stats: COMPLETE
+ Group By Operator
+ aggregations: min(_col0), max(_col0),
bloom_filter(_col0, expectedEntries=1000000)
+ minReductionHashAggr: 0.99
+ mode: hash
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 1 Data size: 160 Basic stats:
COMPLETE Column stats: COMPLETE
+ Reduce Output Operator
+ null sort order:
+ sort order:
+ Statistics: Num rows: 1 Data size: 160 Basic stats:
COMPLETE Column stats: COMPLETE
+ value expressions: _col0 (type: bigint), _col1
(type: bigint), _col2 (type: binary)
+ Reducer 12
Execution mode: vectorized, llap
Reduce Operator Tree:
- Select Operator
- expressions: KEY.reducesinkkey0 (type: bigint)
- outputColumnNames: _col0
+ Group By Operator
+ aggregations: min(VALUE._col0), max(VALUE._col1),
bloom_filter(VALUE._col2, 1, expectedEntries=1000000)
+ mode: final
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
Reduce Output Operator
- key expressions: _col0 (type: bigint)
- null sort order: z
- sort order: +
- Map-reduce partition columns: _col0 (type: bigint)
- Statistics: Num rows: 7200 Data size: 57600 Basic stats:
COMPLETE Column stats: COMPLETE
- Reducer 16
+ null sort order:
+ sort order:
+ Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
+ value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: binary)
+ Reduce Output Operator
+ null sort order:
+ sort order:
+ Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
+ value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: binary)
+ Reducer 17
Execution mode: vectorized, llap
Reduce Operator Tree:
Select Operator
@@ -611,28 +662,14 @@ STAGE PLANS:
Execution mode: vectorized, llap
Reduce Operator Tree:
Select Operator
- expressions: KEY.reducesinkkey0 (type: bigint), VALUE._col0
(type: int)
- outputColumnNames: _col0, _col1
- Reduce Output Operator
- key expressions: _col0 (type: bigint)
- null sort order: z
- sort order: +
- Map-reduce partition columns: _col0 (type: bigint)
- Statistics: Num rows: 73049 Data size: 876588 Basic stats:
COMPLETE Column stats: COMPLETE
- value expressions: _col1 (type: int)
- Reducer 19
- Execution mode: vectorized, llap
- Reduce Operator Tree:
- Select Operator
- expressions: KEY.reducesinkkey0 (type: bigint), VALUE._col0
(type: int)
- outputColumnNames: _col0, _col1
+ expressions: KEY.reducesinkkey0 (type: bigint)
+ outputColumnNames: _col0
Reduce Output Operator
key expressions: _col0 (type: bigint)
null sort order: z
sort order: +
Map-reduce partition columns: _col0 (type: bigint)
- Statistics: Num rows: 73049 Data size: 876588 Basic stats:
COMPLETE Column stats: COMPLETE
- value expressions: _col1 (type: int)
+ Statistics: Num rows: 7200 Data size: 57600 Basic stats:
COMPLETE Column stats: COMPLETE
Reducer 2
Execution mode: vectorized, llap
Reduce Operator Tree:
@@ -644,7 +681,7 @@ STAGE PLANS:
1 KEY.reducesinkkey0 (type: bigint)
outputColumnNames: _col0, _col2, _col3, _col4, _col5, _col6,
_col7, _col8, _col9, _col11, _col15, _col16, _col17, _col18, _col19
input vertices:
- 1 Map 32
+ 1 Map 34
Statistics: Num rows: 226670367 Data size: 14429217296 Basic
stats: COMPLETE Column stats: COMPLETE
DynamicPartitionHashJoin: true
Map Join Operator
@@ -665,7 +702,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col0, _col2, _col3, _col5, _col6,
_col7, _col8, _col9, _col11, _col15, _col16, _col17, _col18, _col19, _col22,
_col23, _col24, _col25
input vertices:
- 1 Map 12
+ 1 Map 14
Statistics: Num rows: 226670367 Data size: 96257219775
Basic stats: COMPLETE Column stats: COMPLETE
Reduce Output Operator
key expressions: _col0 (type: bigint), _col6 (type:
bigint)
@@ -674,7 +711,33 @@ STAGE PLANS:
Map-reduce partition columns: _col0 (type: bigint),
_col6 (type: bigint)
Statistics: Num rows: 226670367 Data size: 96257219775
Basic stats: COMPLETE Column stats: COMPLETE
value expressions: _col2 (type: bigint), _col3 (type:
bigint), _col5 (type: bigint), _col7 (type: decimal(7,2)), _col8 (type:
decimal(7,2)), _col9 (type: decimal(7,2)), _col11 (type: bigint), _col15 (type:
bigint), _col16 (type: bigint), _col17 (type: bigint), _col18 (type: bigint),
_col19 (type: bigint), _col22 (type: char(10)), _col23 (type: varchar(60)),
_col24 (type: varchar(60)), _col25 (type: char(10))
+ Reducer 20
+ Execution mode: vectorized, llap
+ Reduce Operator Tree:
+ Select Operator
+ expressions: KEY.reducesinkkey0 (type: bigint), VALUE._col0
(type: int)
+ outputColumnNames: _col0, _col1
+ Reduce Output Operator
+ key expressions: _col0 (type: bigint)
+ null sort order: z
+ sort order: +
+ Map-reduce partition columns: _col0 (type: bigint)
+ Statistics: Num rows: 73049 Data size: 876588 Basic stats:
COMPLETE Column stats: COMPLETE
+ value expressions: _col1 (type: int)
Reducer 21
+ Execution mode: vectorized, llap
+ Reduce Operator Tree:
+ Select Operator
+ expressions: KEY.reducesinkkey0 (type: bigint), VALUE._col0
(type: int)
+ outputColumnNames: _col0, _col1
+ Reduce Output Operator
+ key expressions: _col0 (type: bigint)
+ null sort order: z
+ sort order: +
+ Map-reduce partition columns: _col0 (type: bigint)
+ Statistics: Num rows: 73049 Data size: 876588 Basic stats:
COMPLETE Column stats: COMPLETE
+ value expressions: _col1 (type: int)
+ Reducer 23
Execution mode: vectorized, llap
Reduce Operator Tree:
Select Operator
@@ -687,7 +750,7 @@ STAGE PLANS:
Map-reduce partition columns: _col0 (type: bigint)
Statistics: Num rows: 1920800 Data size: 178634400 Basic
stats: COMPLETE Column stats: COMPLETE
value expressions: _col1 (type: char(1))
- Reducer 22
+ Reducer 24
Execution mode: vectorized, llap
Reduce Operator Tree:
Select Operator
@@ -700,7 +763,7 @@ STAGE PLANS:
Map-reduce partition columns: _col0 (type: bigint)
Statistics: Num rows: 1920800 Data size: 178634400 Basic
stats: COMPLETE Column stats: COMPLETE
value expressions: _col1 (type: char(1))
- Reducer 24
+ Reducer 26
Execution mode: vectorized, llap
Reduce Operator Tree:
Map Join Operator
@@ -711,7 +774,7 @@ STAGE PLANS:
1 KEY.reducesinkkey0 (type: bigint)
outputColumnNames: _col0, _col2, _col3, _col4, _col5, _col6,
_col7, _col8, _col9, _col11, _col12, _col15, _col16, _col17, _col18, _col19
input vertices:
- 1 Map 32
+ 1 Map 34
Statistics: Num rows: 226670367 Data size: 38682946565 Basic
stats: COMPLETE Column stats: COMPLETE
DynamicPartitionHashJoin: true
Map Join Operator
@@ -722,7 +785,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col0, _col2, _col3, _col4, _col5, _col6,
_col7, _col8, _col9, _col11, _col12, _col15, _col16, _col17, _col18, _col19
input vertices:
- 1 Reducer 10
+ 1 Reducer 11
Statistics: Num rows: 226670367 Data size: 38682946565 Basic
stats: COMPLETE Column stats: COMPLETE
Map Join Operator
condition map:
@@ -732,7 +795,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col0, _col2, _col3, _col5, _col6,
_col7, _col8, _col9, _col11, _col12, _col15, _col16, _col17, _col18, _col19,
_col22, _col23, _col24, _col25
input vertices:
- 1 Map 12
+ 1 Map 14
Statistics: Num rows: 226670367 Data size: 120510949044
Basic stats: COMPLETE Column stats: COMPLETE
Reduce Output Operator
key expressions: _col0 (type: bigint), _col6 (type:
bigint)
@@ -741,7 +804,7 @@ STAGE PLANS:
Map-reduce partition columns: _col0 (type: bigint),
_col6 (type: bigint)
Statistics: Num rows: 226670367 Data size: 120510949044
Basic stats: COMPLETE Column stats: COMPLETE
value expressions: _col2 (type: bigint), _col3 (type:
bigint), _col5 (type: bigint), _col7 (type: decimal(7,2)), _col8 (type:
decimal(7,2)), _col9 (type: decimal(7,2)), _col11 (type: bigint), _col12 (type:
char(50)), _col15 (type: bigint), _col16 (type: bigint), _col17 (type: bigint),
_col18 (type: bigint), _col19 (type: bigint), _col22 (type: char(10)), _col23
(type: varchar(60)), _col24 (type: varchar(60)), _col25 (type: char(10))
- Reducer 25
+ Reducer 27
Execution mode: vectorized, llap
Reduce Operator Tree:
Map Join Operator
@@ -752,7 +815,7 @@ STAGE PLANS:
1 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1
(type: bigint)
outputColumnNames: _col2, _col3, _col5, _col7, _col8, _col9,
_col11, _col12, _col15, _col16, _col17, _col18, _col19, _col22, _col23, _col24,
_col25
input vertices:
- 1 Map 33
+ 1 Map 35
Statistics: Num rows: 382653083 Data size: 253525082388 Basic
stats: COMPLETE Column stats: COMPLETE
DynamicPartitionHashJoin: true
Map Join Operator
@@ -763,7 +826,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col2, _col3, _col7, _col8, _col9,
_col11, _col12, _col15, _col16, _col17, _col18, _col19, _col22, _col23, _col24,
_col25, _col29, _col30
input vertices:
- 1 Map 13
+ 1 Map 15
Statistics: Num rows: 382653083 Data size: 320006816343
Basic stats: COMPLETE Column stats: COMPLETE
Map Join Operator
condition map:
@@ -773,7 +836,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col2, _col7, _col8, _col9, _col11,
_col12, _col15, _col16, _col17, _col18, _col19, _col22, _col23, _col24, _col25,
_col29, _col30
input vertices:
- 1 Reducer 16
+ 1 Reducer 18
Statistics: Num rows: 382653083 Data size: 318758954607
Basic stats: COMPLETE Column stats: COMPLETE
Map Join Operator
condition map:
@@ -783,7 +846,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col2, _col7, _col8, _col9, _col11,
_col12, _col15, _col17, _col18, _col19, _col22, _col23, _col24, _col25, _col29,
_col30
input vertices:
- 1 Map 14
+ 1 Map 16
Statistics: Num rows: 382653083 Data size: 315717148415
Basic stats: COMPLETE Column stats: COMPLETE
Map Join Operator
condition map:
@@ -793,7 +856,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col2, _col7, _col8, _col9, _col11,
_col12, _col15, _col17, _col18, _col22, _col23, _col24, _col25, _col29, _col30,
_col34
input vertices:
- 1 Reducer 19
+ 1 Reducer 21
Statistics: Num rows: 382653083 Data size:
314205958075 Basic stats: COMPLETE Column stats: COMPLETE
Map Join Operator
condition map:
@@ -803,7 +866,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col2, _col7, _col8, _col9,
_col11, _col12, _col15, _col17, _col22, _col23, _col24, _col25, _col29, _col30,
_col34, _col36
input vertices:
- 1 Map 17
+ 1 Map 19
Statistics: Num rows: 382653083 Data size:
312694776079 Basic stats: COMPLETE Column stats: COMPLETE
Map Join Operator
condition map:
@@ -813,7 +876,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col7, _col8, _col9, _col11,
_col12, _col15, _col17, _col22, _col23, _col24, _col25, _col29, _col30, _col34,
_col36, _col38
input vertices:
- 1 Reducer 22
+ 1 Reducer 24
Statistics: Num rows: 382653083 Data size:
343972426398 Basic stats: COMPLETE Column stats: COMPLETE
Map Join Operator
condition map:
@@ -823,7 +886,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col7, _col8, _col9, _col11,
_col12, _col17, _col22, _col23, _col24, _col25, _col29, _col30, _col34, _col36,
_col38, _col40
input vertices:
- 1 Map 20
+ 1 Map 22
Statistics: Num rows: 382653083 Data size:
373456129549 Basic stats: COMPLETE Column stats: COMPLETE
Filter Operator
predicate: (_col38 <> _col40) (type: boolean)
@@ -836,7 +899,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col7, _col8, _col9,
_col11, _col12, _col22, _col23, _col24, _col25, _col29, _col30, _col34, _col36,
_col42, _col43, _col44, _col45
input vertices:
- 1 Map 12
+ 1 Map 14
Statistics: Num rows: 382653083 Data size:
443481643738 Basic stats: COMPLETE Column stats: COMPLETE
Group By Operator
aggregations: count(), sum(_col7),
sum(_col8), sum(_col9)
@@ -852,7 +915,7 @@ STAGE PLANS:
Map-reduce partition columns: _col0
(type: varchar(50)), _col1 (type: bigint), _col2 (type: char(10)), _col3 (type:
char(10)), _col4 (type: varchar(60)), _col5 (type: varchar(60)), _col6 (type:
char(10)), _col7 (type: char(50)), _col8 (type: int), _col9 (type: int), _col10
(type: char(10)), _col11 (type: varchar(60)), _col12 (type: varchar(60)),
_col13 (type: char(10))
Statistics: Num rows: 382653083 Data
size: 522704111378 Basic stats: COMPLETE Column stats: COMPLETE
value expressions: _col14 (type:
bigint), _col15 (type: decimal(17,2)), _col16 (type: decimal(17,2)), _col17
(type: decimal(17,2))
- Reducer 26
+ Reducer 28
Execution mode: vectorized, llap
Reduce Operator Tree:
Group By Operator
@@ -879,7 +942,7 @@ STAGE PLANS:
Map-reduce partition columns: _col2 (type:
varchar(50)), _col1 (type: bigint), _col3 (type: char(10))
Statistics: Num rows: 382653083 Data size:
519642886714 Basic stats: COMPLETE Column stats: COMPLETE
value expressions: _col0 (type: char(50)), _col4
(type: char(10)), _col5 (type: varchar(60)), _col6 (type: varchar(60)), _col7
(type: char(10)), _col8 (type: char(10)), _col9 (type: varchar(60)), _col10
(type: varchar(60)), _col11 (type: char(10)), _col12 (type: bigint), _col13
(type: decimal(17,2)), _col14 (type: decimal(17,2)), _col15 (type:
decimal(17,2))
- Reducer 27
+ Reducer 29
Execution mode: vectorized, llap
Reduce Operator Tree:
Map Join Operator
@@ -906,20 +969,6 @@ STAGE PLANS:
sort order: +++
Statistics: Num rows: 33315900325234 Data size:
56437135150946396 Basic stats: COMPLETE Column stats: COMPLETE
value expressions: _col2 (type: char(10)), _col3 (type:
char(10)), _col4 (type: varchar(60)), _col5 (type: varchar(60)), _col6 (type:
char(10)), _col7 (type: char(10)), _col8 (type: varchar(60)), _col9 (type:
varchar(60)), _col10 (type: char(10)), _col11 (type: bigint), _col12 (type:
decimal(17,2)), _col13 (type: decimal(17,2)), _col14 (type: decimal(17,2)),
_col15 (type: decimal(17,2)), _col16 (type: decimal(17,2)), _col17 (type:
decimal(17,2))
- Reducer 28
- Execution mode: vectorized, llap
- Reduce Operator Tree:
- Select Operator
- expressions: KEY.reducesinkkey0 (type: char(50)),
KEY.reducesinkkey1 (type: varchar(50)), VALUE._col0 (type: char(10)),
VALUE._col1 (type: char(10)), VALUE._col2 (type: varchar(60)), VALUE._col3
(type: varchar(60)), VALUE._col4 (type: char(10)), VALUE._col5 (type:
char(10)), VALUE._col6 (type: varchar(60)), VALUE._col7 (type: varchar(60)),
VALUE._col8 (type: char(10)), 2000 (type: int), VALUE._col9 (type: bigint),
VALUE._col10 (type: decimal(17,2)), VALUE._col11 (type: de [...]
- outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5,
_col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15,
_col16, _col17, _col18, _col19, _col20
- Statistics: Num rows: 33315900325234 Data size:
56703662353548268 Basic stats: COMPLETE Column stats: COMPLETE
- File Output Operator
- compressed: false
- Statistics: Num rows: 33315900325234 Data size:
56703662353548268 Basic stats: COMPLETE Column stats: COMPLETE
- table:
- input format:
org.apache.hadoop.mapred.SequenceFileInputFormat
- output format:
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
- serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Reducer 3
Execution mode: vectorized, llap
Reduce Operator Tree:
@@ -931,7 +980,7 @@ STAGE PLANS:
1 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1
(type: bigint)
outputColumnNames: _col2, _col3, _col5, _col7, _col8, _col9,
_col11, _col15, _col16, _col17, _col18, _col19, _col22, _col23, _col24, _col25
input vertices:
- 1 Map 33
+ 1 Map 35
Statistics: Num rows: 382653083 Data size: 212581202507 Basic
stats: COMPLETE Column stats: COMPLETE
DynamicPartitionHashJoin: true
Map Join Operator
@@ -942,7 +991,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col2, _col3, _col7, _col8, _col9,
_col11, _col15, _col16, _col17, _col18, _col19, _col22, _col23, _col24, _col25,
_col29, _col30
input vertices:
- 1 Map 13
+ 1 Map 15
Statistics: Num rows: 382653083 Data size: 279062936462
Basic stats: COMPLETE Column stats: COMPLETE
Map Join Operator
condition map:
@@ -952,7 +1001,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col2, _col7, _col8, _col9, _col11,
_col15, _col16, _col17, _col18, _col19, _col22, _col23, _col24, _col25, _col29,
_col30
input vertices:
- 1 Reducer 15
+ 1 Reducer 17
Statistics: Num rows: 382653083 Data size: 277815074726
Basic stats: COMPLETE Column stats: COMPLETE
Map Join Operator
condition map:
@@ -962,7 +1011,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col2, _col7, _col8, _col9, _col11,
_col15, _col17, _col18, _col19, _col22, _col23, _col24, _col25, _col29, _col30
input vertices:
- 1 Map 14
+ 1 Map 16
Statistics: Num rows: 382653083 Data size: 274773268534
Basic stats: COMPLETE Column stats: COMPLETE
Map Join Operator
condition map:
@@ -972,7 +1021,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col2, _col7, _col8, _col9, _col11,
_col15, _col17, _col18, _col22, _col23, _col24, _col25, _col29, _col30, _col34
input vertices:
- 1 Reducer 18
+ 1 Reducer 20
Statistics: Num rows: 382653083 Data size:
273262078194 Basic stats: COMPLETE Column stats: COMPLETE
Map Join Operator
condition map:
@@ -982,7 +1031,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col2, _col7, _col8, _col9,
_col11, _col15, _col17, _col22, _col23, _col24, _col25, _col29, _col30, _col34,
_col36
input vertices:
- 1 Map 17
+ 1 Map 19
Statistics: Num rows: 382653083 Data size:
271750896198 Basic stats: COMPLETE Column stats: COMPLETE
Map Join Operator
condition map:
@@ -992,7 +1041,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col7, _col8, _col9, _col11,
_col15, _col17, _col22, _col23, _col24, _col25, _col29, _col30, _col34, _col36,
_col38
input vertices:
- 1 Reducer 21
+ 1 Reducer 23
Statistics: Num rows: 382653083 Data size:
303028546517 Basic stats: COMPLETE Column stats: COMPLETE
Map Join Operator
condition map:
@@ -1002,7 +1051,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col7, _col8, _col9, _col11,
_col17, _col22, _col23, _col24, _col25, _col29, _col30, _col34, _col36, _col38,
_col40
input vertices:
- 1 Map 20
+ 1 Map 22
Statistics: Num rows: 382653083 Data size:
332512249668 Basic stats: COMPLETE Column stats: COMPLETE
Filter Operator
predicate: (_col38 <> _col40) (type: boolean)
@@ -1015,7 +1064,7 @@ STAGE PLANS:
1 _col0 (type: bigint)
outputColumnNames: _col7, _col8, _col9,
_col11, _col22, _col23, _col24, _col25, _col29, _col30, _col34, _col36, _col42,
_col43, _col44, _col45
input vertices:
- 1 Map 12
+ 1 Map 14
Statistics: Num rows: 382653083 Data size:
402537763857 Basic stats: COMPLETE Column stats: COMPLETE
Group By Operator
aggregations: count(), sum(_col7),
sum(_col8), sum(_col9)
@@ -1032,6 +1081,20 @@ STAGE PLANS:
Statistics: Num rows: 382653083 Data
size: 481760231497 Basic stats: COMPLETE Column stats: COMPLETE
value expressions: _col13 (type:
bigint), _col14 (type: decimal(17,2)), _col15 (type: decimal(17,2)), _col16
(type: decimal(17,2))
Reducer 30
+ Execution mode: vectorized, llap
+ Reduce Operator Tree:
+ Select Operator
+ expressions: KEY.reducesinkkey0 (type: char(50)),
KEY.reducesinkkey1 (type: varchar(50)), VALUE._col0 (type: char(10)),
VALUE._col1 (type: char(10)), VALUE._col2 (type: varchar(60)), VALUE._col3
(type: varchar(60)), VALUE._col4 (type: char(10)), VALUE._col5 (type:
char(10)), VALUE._col6 (type: varchar(60)), VALUE._col7 (type: varchar(60)),
VALUE._col8 (type: char(10)), 2000 (type: int), VALUE._col9 (type: bigint),
VALUE._col10 (type: decimal(17,2)), VALUE._col11 (type: de [...]
+ outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5,
_col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15,
_col16, _col17, _col18, _col19, _col20
+ Statistics: Num rows: 33315900325234 Data size:
56703662353548268 Basic stats: COMPLETE Column stats: COMPLETE
+ File Output Operator
+ compressed: false
+ Statistics: Num rows: 33315900325234 Data size:
56703662353548268 Basic stats: COMPLETE Column stats: COMPLETE
+ table:
+ input format:
org.apache.hadoop.mapred.SequenceFileInputFormat
+ output format:
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ Reducer 32
Execution mode: vectorized, llap
Reduce Operator Tree:
Group By Operator
@@ -1054,7 +1117,7 @@ STAGE PLANS:
sort order:
Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: binary)
- Reducer 31
+ Reducer 33
Execution mode: vectorized, llap
Reduce Operator Tree:
Group By Operator
@@ -1143,7 +1206,7 @@ STAGE PLANS:
1 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1
(type: bigint)
outputColumnNames: _col0, _col2, _col5
input vertices:
- 1 Map 11
+ 1 Map 13
Statistics: Num rows: 41876960211 Data size: 9691486353656
Basic stats: COMPLETE Column stats: COMPLETE
DynamicPartitionHashJoin: true
Group By Operator
@@ -1182,34 +1245,39 @@ STAGE PLANS:
sort order: +
Map-reduce partition columns: _col0 (type: bigint)
Statistics: Num rows: 149211 Data size: 1193688 Basic
stats: COMPLETE Column stats: COMPLETE
+ Select Operator
+ expressions: _col0 (type: bigint)
+ outputColumnNames: _col0
+ Statistics: Num rows: 149211 Data size: 1193688 Basic
stats: COMPLETE Column stats: COMPLETE
+ Group By Operator
+ aggregations: min(_col0), max(_col0),
bloom_filter(_col0, expectedEntries=1000000)
+ minReductionHashAggr: 0.99
+ mode: hash
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 1 Data size: 160 Basic stats:
COMPLETE Column stats: COMPLETE
+ Reduce Output Operator
+ null sort order:
+ sort order:
+ Statistics: Num rows: 1 Data size: 160 Basic stats:
COMPLETE Column stats: COMPLETE
+ value expressions: _col0 (type: bigint), _col1
(type: bigint), _col2 (type: binary)
Reducer 9
Execution mode: vectorized, llap
Reduce Operator Tree:
- Map Join Operator
- condition map:
- Inner Join 0 to 1
- keys:
- 0 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1
(type: bigint)
- 1 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1
(type: bigint)
- outputColumnNames: _col0, _col2, _col5
- input vertices:
- 1 Map 11
- Statistics: Num rows: 41876960211 Data size: 9691486353656
Basic stats: COMPLETE Column stats: COMPLETE
- DynamicPartitionHashJoin: true
- Group By Operator
- aggregations: sum(_col2), sum(_col5)
- keys: _col0 (type: bigint)
- minReductionHashAggr: 0.99
- mode: hash
- outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 16946565830 Data size: 3931603272560
Basic stats: COMPLETE Column stats: COMPLETE
- Reduce Output Operator
- key expressions: _col0 (type: bigint)
- null sort order: z
- sort order: +
- Map-reduce partition columns: _col0 (type: bigint)
- Statistics: Num rows: 16946565830 Data size: 3931603272560
Basic stats: COMPLETE Column stats: COMPLETE
- value expressions: _col1 (type: decimal(17,2)), _col2
(type: decimal(19,2))
+ Group By Operator
+ aggregations: min(VALUE._col0), max(VALUE._col1),
bloom_filter(VALUE._col2, 1, expectedEntries=1000000)
+ mode: final
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
+ Reduce Output Operator
+ null sort order:
+ sort order:
+ Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
+ value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: binary)
+ Reduce Output Operator
+ null sort order:
+ sort order:
+ Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
+ value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: binary)
Stage: Stage-0
Fetch Operator
diff --git
a/ql/src/test/results/clientpositive/perf/tpcds30tb/tez/query80.q.out
b/ql/src/test/results/clientpositive/perf/tpcds30tb/tez/query80.q.out
index 180d3d7fa07..86ce0891d9b 100644
--- a/ql/src/test/results/clientpositive/perf/tpcds30tb/tez/query80.q.out
+++ b/ql/src/test/results/clientpositive/perf/tpcds30tb/tez/query80.q.out
@@ -7,10 +7,10 @@ STAGE PLANS:
Tez
#### A masked pattern was here ####
Edges:
- Map 1 <- Reducer 20 (BROADCAST_EDGE)
- Map 10 <- Reducer 19 (BROADCAST_EDGE)
+ Map 1 <- Reducer 16 (BROADCAST_EDGE), Reducer 20 (BROADCAST_EDGE)
+ Map 10 <- Reducer 15 (BROADCAST_EDGE), Reducer 19 (BROADCAST_EDGE)
Map 13 <- Reducer 15 (BROADCAST_EDGE)
- Map 23 <- Reducer 21 (BROADCAST_EDGE)
+ Map 23 <- Reducer 17 (BROADCAST_EDGE), Reducer 21 (BROADCAST_EDGE)
Map 26 <- Reducer 17 (BROADCAST_EDGE)
Map 7 <- Reducer 16 (BROADCAST_EDGE)
Reducer 11 <- Map 10 (CUSTOM_SIMPLE_EDGE), Map 13
(CUSTOM_SIMPLE_EDGE), Map 14 (BROADCAST_EDGE), Map 18 (BROADCAST_EDGE), Map 22
(BROADCAST_EDGE), Map 8 (BROADCAST_EDGE)
@@ -33,10 +33,10 @@ STAGE PLANS:
Map Operator Tree:
TableScan
alias: store_sales
- filterExpr: (ss_store_sk is not null and ss_promo_sk is not
null and ss_promo_sk BETWEEN DynamicValue(RS_23_promotion_p_promo_sk_min) AND
DynamicValue(RS_23_promotion_p_promo_sk_max) and in_bloom_filter(ss_promo_sk,
DynamicValue(RS_23_promotion_p_promo_sk_bloom_filter))) (type: boolean)
+ filterExpr: (ss_store_sk is not null and ss_promo_sk is not
null and ss_item_sk BETWEEN DynamicValue(RS_20_item_i_item_sk_min) AND
DynamicValue(RS_20_item_i_item_sk_max) and ss_promo_sk BETWEEN
DynamicValue(RS_23_promotion_p_promo_sk_min) AND
DynamicValue(RS_23_promotion_p_promo_sk_max) and in_bloom_filter(ss_item_sk,
DynamicValue(RS_20_item_i_item_sk_bloom_filter)) and
in_bloom_filter(ss_promo_sk,
DynamicValue(RS_23_promotion_p_promo_sk_bloom_filter))) (type: boolean)
Statistics: Num rows: 82510879939 Data size: 21315868812296
Basic stats: COMPLETE Column stats: COMPLETE
Filter Operator
- predicate: (ss_store_sk is not null and ss_promo_sk is not
null and ss_promo_sk BETWEEN DynamicValue(RS_23_promotion_p_promo_sk_min) AND
DynamicValue(RS_23_promotion_p_promo_sk_max) and in_bloom_filter(ss_promo_sk,
DynamicValue(RS_23_promotion_p_promo_sk_bloom_filter))) (type: boolean)
+ predicate: (ss_store_sk is not null and ss_promo_sk is not
null and ss_promo_sk BETWEEN DynamicValue(RS_23_promotion_p_promo_sk_min) AND
DynamicValue(RS_23_promotion_p_promo_sk_max) and ss_item_sk BETWEEN
DynamicValue(RS_20_item_i_item_sk_min) AND
DynamicValue(RS_20_item_i_item_sk_max) and in_bloom_filter(ss_promo_sk,
DynamicValue(RS_23_promotion_p_promo_sk_bloom_filter)) and
in_bloom_filter(ss_item_sk, DynamicValue(RS_20_item_i_item_sk_bloom_filter)))
(type: boolean)
Statistics: Num rows: 78675502838 Data size:
20325037116048 Basic stats: COMPLETE Column stats: COMPLETE
Select Operator
expressions: ss_item_sk (type: bigint), ss_store_sk
(type: bigint), ss_promo_sk (type: bigint), ss_ticket_number (type: bigint),
ss_ext_sales_price (type: decimal(7,2)), ss_net_profit (type: decimal(7,2)),
ss_sold_date_sk (type: bigint)
@@ -55,10 +55,10 @@ STAGE PLANS:
Map Operator Tree:
TableScan
alias: catalog_sales
- filterExpr: (cs_catalog_page_sk is not null and cs_promo_sk
is not null and cs_promo_sk BETWEEN
DynamicValue(RS_60_promotion_p_promo_sk_min) AND
DynamicValue(RS_60_promotion_p_promo_sk_max) and in_bloom_filter(cs_promo_sk,
DynamicValue(RS_60_promotion_p_promo_sk_bloom_filter))) (type: boolean)
+ filterExpr: (cs_catalog_page_sk is not null and cs_promo_sk
is not null and cs_item_sk BETWEEN DynamicValue(RS_57_item_i_item_sk_min) AND
DynamicValue(RS_57_item_i_item_sk_max) and cs_promo_sk BETWEEN
DynamicValue(RS_60_promotion_p_promo_sk_min) AND
DynamicValue(RS_60_promotion_p_promo_sk_max) and in_bloom_filter(cs_item_sk,
DynamicValue(RS_57_item_i_item_sk_bloom_filter)) and
in_bloom_filter(cs_promo_sk,
DynamicValue(RS_60_promotion_p_promo_sk_bloom_filter))) (type: boolean)
Statistics: Num rows: 43005109025 Data size: 11339575410520
Basic stats: COMPLETE Column stats: COMPLETE
Filter Operator
- predicate: (cs_catalog_page_sk is not null and cs_promo_sk
is not null and cs_promo_sk BETWEEN
DynamicValue(RS_60_promotion_p_promo_sk_min) AND
DynamicValue(RS_60_promotion_p_promo_sk_max) and in_bloom_filter(cs_promo_sk,
DynamicValue(RS_60_promotion_p_promo_sk_bloom_filter))) (type: boolean)
+ predicate: (cs_catalog_page_sk is not null and cs_promo_sk
is not null and cs_promo_sk BETWEEN
DynamicValue(RS_60_promotion_p_promo_sk_min) AND
DynamicValue(RS_60_promotion_p_promo_sk_max) and cs_item_sk BETWEEN
DynamicValue(RS_57_item_i_item_sk_min) AND
DynamicValue(RS_57_item_i_item_sk_max) and in_bloom_filter(cs_promo_sk,
DynamicValue(RS_60_promotion_p_promo_sk_bloom_filter)) and
in_bloom_filter(cs_item_sk, DynamicValue(RS_57_item_i_item_sk_bloom_filter)))
(type: boolean)
Statistics: Num rows: 42789551679 Data size:
11282737308320 Basic stats: COMPLETE Column stats: COMPLETE
Select Operator
expressions: cs_catalog_page_sk (type: bigint),
cs_item_sk (type: bigint), cs_promo_sk (type: bigint), cs_order_number (type:
bigint), cs_ext_sales_price (type: decimal(7,2)), cs_net_profit (type:
decimal(7,2)), cs_sold_date_sk (type: bigint)
@@ -273,10 +273,10 @@ STAGE PLANS:
Map Operator Tree:
TableScan
alias: web_sales
- filterExpr: (ws_web_site_sk is not null and ws_promo_sk is
not null and ws_promo_sk BETWEEN DynamicValue(RS_98_promotion_p_promo_sk_min)
AND DynamicValue(RS_98_promotion_p_promo_sk_max) and
in_bloom_filter(ws_promo_sk,
DynamicValue(RS_98_promotion_p_promo_sk_bloom_filter))) (type: boolean)
+ filterExpr: (ws_web_site_sk is not null and ws_promo_sk is
not null and ws_item_sk BETWEEN DynamicValue(RS_95_item_i_item_sk_min) AND
DynamicValue(RS_95_item_i_item_sk_max) and ws_promo_sk BETWEEN
DynamicValue(RS_98_promotion_p_promo_sk_min) AND
DynamicValue(RS_98_promotion_p_promo_sk_max) and in_bloom_filter(ws_item_sk,
DynamicValue(RS_95_item_i_item_sk_bloom_filter)) and
in_bloom_filter(ws_promo_sk,
DynamicValue(RS_98_promotion_p_promo_sk_bloom_filter))) (type: boolean)
Statistics: Num rows: 21594638446 Data size: 5700638697608
Basic stats: COMPLETE Column stats: COMPLETE
Filter Operator
- predicate: (ws_web_site_sk is not null and ws_promo_sk is
not null and ws_promo_sk BETWEEN DynamicValue(RS_98_promotion_p_promo_sk_min)
AND DynamicValue(RS_98_promotion_p_promo_sk_max) and
in_bloom_filter(ws_promo_sk,
DynamicValue(RS_98_promotion_p_promo_sk_bloom_filter))) (type: boolean)
+ predicate: (ws_web_site_sk is not null and ws_promo_sk is
not null and ws_promo_sk BETWEEN DynamicValue(RS_98_promotion_p_promo_sk_min)
AND DynamicValue(RS_98_promotion_p_promo_sk_max) and ws_item_sk BETWEEN
DynamicValue(RS_95_item_i_item_sk_min) AND
DynamicValue(RS_95_item_i_item_sk_max) and in_bloom_filter(ws_promo_sk,
DynamicValue(RS_98_promotion_p_promo_sk_bloom_filter)) and
in_bloom_filter(ws_item_sk, DynamicValue(RS_95_item_i_item_sk_bloom_filter)))
(type: boolean)
Statistics: Num rows: 21589233207 Data size: 5699211801048
Basic stats: COMPLETE Column stats: COMPLETE
Select Operator
expressions: ws_item_sk (type: bigint), ws_web_site_sk
(type: bigint), ws_promo_sk (type: bigint), ws_order_number (type: bigint),
ws_ext_sales_price (type: decimal(7,2)), ws_net_profit (type: decimal(7,2)),
ws_sold_date_sk (type: bigint)
@@ -571,6 +571,11 @@ STAGE PLANS:
sort order:
Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: binary)
+ Reduce Output Operator
+ null sort order:
+ sort order:
+ Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
+ value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: binary)
Reducer 16
Execution mode: vectorized, llap
Reduce Operator Tree:
@@ -584,6 +589,11 @@ STAGE PLANS:
sort order:
Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: binary)
+ Reduce Output Operator
+ null sort order:
+ sort order:
+ Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
+ value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: binary)
Reducer 17
Execution mode: vectorized, llap
Reduce Operator Tree:
@@ -597,6 +607,11 @@ STAGE PLANS:
sort order:
Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: binary)
+ Reduce Output Operator
+ null sort order:
+ sort order:
+ Statistics: Num rows: 1 Data size: 160 Basic stats: COMPLETE
Column stats: COMPLETE
+ value expressions: _col0 (type: bigint), _col1 (type:
bigint), _col2 (type: binary)
Reducer 19
Execution mode: vectorized, llap
Reduce Operator Tree: