zabetak commented on code in PR #5505:
URL: https://github.com/apache/hive/pull/5505#discussion_r1925192558
##########
ql/src/test/results/clientpositive/llap/implicit_cast_during_insert.q.out:
##########
@@ -40,60 +40,64 @@ STAGE PLANS:
Map Operator Tree:
TableScan
alias: src
- filterExpr: (key) IN (0, 1) (type: boolean)
+ filterExpr: (UDFToDouble(key)) IN (0.0D, 1.0D) (type: boolean)
Statistics: Num rows: 500 Data size: 89000 Basic stats: COMPLETE Column stats: COMPLETE
Filter Operator
- predicate: (key) IN (0, 1) (type: boolean)
- Statistics: Num rows: 3 Data size: 534 Basic stats: COMPLETE Column stats: COMPLETE
+ predicate: (UDFToDouble(key)) IN (0.0D, 1.0D) (type: boolean)
+ Statistics: Num rows: 250 Data size: 44500 Basic stats: COMPLETE Column stats: COMPLETE
Select Operator
expressions: value (type: string), key (type: string)
outputColumnNames: _col1, _col2
- Statistics: Num rows: 3 Data size: 534 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 250 Data size: 44500 Basic stats: COMPLETE Column stats: COMPLETE
Reduce Output Operator
key expressions: _col2 (type: string)
null sort order: z
sort order: +
Map-reduce partition columns: _col2 (type: string)
- Statistics: Num rows: 3 Data size: 534 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 250 Data size: 44500 Basic stats: COMPLETE Column stats: COMPLETE
value expressions: _col1 (type: string)
- Execution mode: llap
+ Execution mode: vectorized, llap
LLAP IO: all inputs
Reducer 2
Execution mode: vectorized, llap
Reduce Operator Tree:
Select Operator
expressions: UDFToInteger(KEY.reducesinkkey0) (type: int), VALUE._col0 (type: string), KEY.reducesinkkey0 (type: string)
outputColumnNames: _col0, _col1, _col2
- Statistics: Num rows: 3 Data size: 546 Basic stats: COMPLETE Column stats: COMPLETE
- File Output Operator
- compressed: false
- Statistics: Num rows: 3 Data size: 546 Basic stats: COMPLETE Column stats: COMPLETE
- table:
- input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
- output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
- serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
- name: default.implicit_cast_during_insert
+ Statistics: Num rows: 250 Data size: 45500 Basic stats: COMPLETE Column stats: COMPLETE
Select Operator
expressions: _col0 (type: int), _col1 (type: string), _col2 (type: string)
outputColumnNames: c1, c2, p1
- Statistics: Num rows: 3 Data size: 546 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 250 Data size: 45500 Basic stats: COMPLETE Column stats: COMPLETE
Group By Operator
aggregations: min(c1), max(c1), count(1), count(c1), compute_bit_vector_hll(c1), max(length(c2)), avg(COALESCE(length(c2),0)), count(c2), compute_bit_vector_hll(c2)
keys: p1 (type: string)
mode: complete
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9
- Statistics: Num rows: 2 Data size: 838 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 250 Data size: 104750 Basic stats: COMPLETE Column stats: COMPLETE
Select Operator
expressions: 'LONG' (type: string), UDFToLong(_col1) (type: bigint), UDFToLong(_col2) (type: bigint), (_col3 - _col4) (type: bigint), COALESCE(ndv_compute_bit_vector(_col5),0) (type: bigint), _col5 (type: binary), 'STRING' (type: string), UDFToLong(COALESCE(_col6,0)) (type: bigint), COALESCE(_col7,0) (type: double), (_col3 - _col8) (type: bigint), COALESCE(ndv_compute_bit_vector(_col9),0) (type: bigint), _col9 (type: binary), _col0 (type: string)
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12
- Statistics: Num rows: 2 Data size: 1234 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 250 Data size: 154250 Basic stats: COMPLETE Column stats: COMPLETE
File Output Operator
compressed: false
- Statistics: Num rows: 2 Data size: 1234 Basic stats: COMPLETE Column stats: COMPLETE
+ Statistics: Num rows: 250 Data size: 154250 Basic stats: COMPLETE Column stats: COMPLETE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ Select Operator
+ expressions: _col0 (type: int), _col1 (type: string), _col2 (type: string)
+ outputColumnNames: _col0, _col1, _col2
+ File Output Operator
+ compressed: false
+ Dp Sort State: PARTITION_SORTED
Review Comment:
Thanks for the explanation. The stats improvement is orthogonal to this PR
so we don't need to handle it here.