[
https://issues.apache.org/jira/browse/HIVE-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13844942#comment-13844942
]
Thejas M Nair commented on HIVE-5356:
-------------------------------------
Here are my concerns with this change (thanks to Jason for highlighting the
differences in behavior):
# The changes to floating point arithmetic are not backward compatible, and
they bring no SQL compliance benefit.
# Regarding integer division returning decimal:
## It will not be backward compatible with some UDF implementations (I believe
the same applies to the change in floating point return type).
## Integer arithmetic becomes NULL in some cases.
## There is a more than 50x performance degradation.
Regarding the drive to make Hive more SQL standard compliant, I believe the
motivation behind it is to make it easier to integrate with external
tools and easier for people who are familiar with SQL to use Hive. I am
not sure this change helps with either of those two motivations. Most
of the commercial databases return an int result for integer division, not
a decimal (Oracle, SQL Server, DB2, PostgreSQL).
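
To make the integer division concern concrete, here is a minimal Java sketch
(not Hive code) contrasting an int result with a decimal result for the same
operands. The class and method names are hypothetical, and the scale of 6 used
for the decimal result is an arbitrary assumption chosen only for illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical illustration only; this is not Hive code.
final class IntDivisionSemantics {
    // Integer result: the fractional part is truncated.
    static int asInt(int a, int b) {
        return a / b;
    }

    // Decimal result: the fraction is kept. The scale (6) is an arbitrary assumption.
    static BigDecimal asDecimal(int a, int b) {
        return BigDecimal.valueOf(a).divide(BigDecimal.valueOf(b), 6, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        System.out.println(asInt(7, 2));      // prints 3
        System.out.println(asDecimal(7, 2));  // prints 3.500000
    }
}
```

A caller that expects one of these result types will see different values, and
a different Java type, if the operator starts returning the other.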
> Move arithmatic UDFs to generic UDF implementations
> ---------------------------------------------------
>
> Key: HIVE-5356
> URL: https://issues.apache.org/jira/browse/HIVE-5356
> Project: Hive
> Issue Type: Task
> Components: UDF
> Affects Versions: 0.11.0
> Reporter: Xuefu Zhang
> Assignee: Xuefu Zhang
> Fix For: 0.13.0
>
> Attachments: HIVE-5356.1.patch, HIVE-5356.10.patch,
> HIVE-5356.11.patch, HIVE-5356.12.patch, HIVE-5356.2.patch, HIVE-5356.3.patch,
> HIVE-5356.4.patch, HIVE-5356.5.patch, HIVE-5356.6.patch, HIVE-5356.7.patch,
> HIVE-5356.8.patch, HIVE-5356.9.patch
>
>
> Currently, all of the arithmetic operators, such as add/sub/mult/div, are
> implemented as old-style UDFs, and Java reflection is used to determine the
> return type TypeInfos/ObjectInspectors, based on the return type of the
> evaluate() method chosen for the expression. This works fine for types that
> don't have type params.
> Hive decimal type participates in these operations just like int or double.
> Unlike double or int, however, decimal has precision and scale, which
> cannot be determined by just looking at the return type (decimal) of the UDF
> evaluate() method, even though the operands have certain precision/scale.
> A bare "decimal" return type without precision/scale defaults to (10, 0) as
> the type params. This is certainly not desirable.
> To solve this problem, all of the arithmetic operators would need to be
> implemented as GenericUDFs, which allow returning ObjectInspector during the
> initialize() method. The object inspectors returned can carry type params,
> from which the "exact" return type can be determined.
> It's worth mentioning that, for user UDFs implemented in a non-generic way, if
> the return type of the chosen evaluate() method is decimal, the return type
> actually has (10,0) as precision/scale, which might not be desirable. This
> needs to be documented.
> This JIRA will cover minus, plus, divide, multiply, mod, and pmod, to limit
> the scope of review. The remaining ones will be covered under HIVE-5706.
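
The limitation described in the quoted issue can be sketched in plain Java.
This is a rough illustration under stated assumptions, not the patch itself:
DecimalTypeSketch, DecimalType, OldStylePlus, and initializePlus are
hypothetical stand-ins rather than Hive classes, and the precision/scale rule
for addition is an assumed, commonly used one that may differ from what the
patch implements.

```java
import java.lang.reflect.Method;
import java.math.BigDecimal;

// Hypothetical stand-ins for illustration; none of these are Hive classes.
final class DecimalTypeSketch {
    // A decimal type together with its type params.
    static final class DecimalType {
        final int precision;
        final int scale;

        DecimalType(int precision, int scale) {
            this.precision = precision;
            this.scale = scale;
        }

        @Override
        public String toString() {
            return "decimal(" + precision + "," + scale + ")";
        }
    }

    // Old-style shape: the return type is discovered by reflection on evaluate().
    static final class OldStylePlus {
        public BigDecimal evaluate(BigDecimal a, BigDecimal b) {
            return a.add(b);
        }
    }

    // All that reflection can report is the Java class, with no precision or scale.
    static Class<?> reflectedReturnType() {
        try {
            Method m = OldStylePlus.class.getMethod("evaluate", BigDecimal.class, BigDecimal.class);
            return m.getReturnType();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // Generic-style shape: the result type is computed from the operand types up front.
    // Assumed rule for addition; no cap on maximum precision is applied here.
    static DecimalType initializePlus(DecimalType left, DecimalType right) {
        int scale = Math.max(left.scale, right.scale);
        int intDigits = Math.max(left.precision - left.scale, right.precision - right.scale);
        return new DecimalType(intDigits + scale + 1, scale);
    }

    public static void main(String[] args) {
        System.out.println(reflectedReturnType()); // prints class java.math.BigDecimal
        System.out.println(initializePlus(new DecimalType(10, 2), new DecimalType(5, 4)));
    }
}
```

The reflective path only ever sees the Java return class, so the type params
have to come from somewhere else. The second method mirrors the idea of
deciding the result type from the operand types before any row is evaluated.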