[ 
https://issues.apache.org/jira/browse/HIVE-28449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17874909#comment-17874909
 ] 

Stamatis Zampetakis commented on HIVE-28449:
--------------------------------------------

The table below summarizes the behavior changes before and after PR#5374 for a 
comparison expression of the form {{{}E1 op E2{}}}, where op is a comparison 
operator (=,<,<=,>=,>,!=), E1 is a column reference expression, E2 is a 
constant/literal holding a numeric value.
||Case||E1 type||E2 type||Before||After||Example expression||
|I|BIGINT|STRING|ERROR/WARN|OK|c_bigint = '9223372036854775807'|
|II|DECIMAL|STRING|ERROR/WARN|OK|c_decimal_19_0 = '9223372036854775807'|
|III|DOUBLE|BIGINT|ERROR/WARN|OK|c_double = 9223372036854775807|

In a nutshell, the change will remove some compilation ERROR/WARNING messages 
for the above combinations but everything else (including the query plan) will 
remain unaltered.

For cases I, and II, the ERROR/WARN message is misleading since there is no 
information/precision loss on any side of the comparison.

For case III, the ERROR/WARN message is valid since the constant will be 
converted to DOUBLE and some digits will be truncated (Java long to double).

The change in PR#5374 addresses the unintentional behavior changes introduced 
by HIVE-23100 but at the same time weakens strict type checking (case III) and 
complicates the semantics of the "hive.strict.checks.type.safety" property. 
Moreover, it leads to more behavior changes (between Hive 4.0.0 and Hive 4.1.0) 
which might not be received well by all users.

Given that there are both pros and cons with the proposed changes here, I am 
more inclined to void this ticket and accept the existing behavior where strict 
type comparisons are done before any kind of type inference but I am fully open 
to other opinions as well. If the majority feels that the positives outweigh 
the negatives please leave a comment and review the PR.

> Infer constant types from columns before strict type validation
> ---------------------------------------------------------------
>
>                 Key: HIVE-28449
>                 URL: https://issues.apache.org/jira/browse/HIVE-28449
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Planning
>         Environment: 
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>              Labels: backwards-compatibility
>
> HIVE-2249 introduced some [specialized type inference 
> logic|https://github.com/apache/hive/blob/5cbffb532a586226500abc498d6505722d62234d/ql/src/java/org/apache/hadoop/hive/ql/parse/type/TypeCheckProcFactory.java#L972]
>  that kicks in when there are comparisons between columns and numeric 
> constant expressions.
> Consider for instance a comparison between a BIGINT column and a STRING 
> constant.
> {code:sql}
> SELECT * FROM table WHERE c_bigint = '9223372036854775807'
> {code}
> The type derivation logic will attempt to convert the STRING constant to 
> BIGINT and evaluate the expression by comparing long values.
> Currently (commit 5cbffb532a586226500abc498d6505722d62234d), the query above 
> throws the following ERROR/WARNING:
> {noformat}
> Comparing bigint and string may result in loss of information.
> {noformat}
> This is due to strict type checking (controlled via 
> hive.strict.checks.type.safety property) that is now applied before the 
> constant type inference logic described above.
> In this case, the ERROR/WARNING is a bit misleading since there is no real 
> risk for losing precision/information since the STRING constant fits into a 
> BIGINT (Java long) and the whole comparison can be evaluated without 
> precision loss.
> For quite some time, strict type checking was performed *after* constant type 
> inference (and not *before*) but the behavior was changed unintentionally by 
> HIVE-23100.
> The goal of this change is to perform constant type inference before strict 
> type validation (behavior before HIVE-23100) to restore backward 
> compatibility and remove some unnecessary warnings/errors during compilation.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to