[ 
https://issues.apache.org/jira/browse/HIVE-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843479#comment-13843479
 ] 

Eric Hanson commented on HIVE-5356:
-----------------------------------

Before this patch was committed, integer-integer division vectorized. Now it 
does not. This is a performance regression and also a functional regression for 
"EXPLAIN". This may have been caught by the vectorization tests (see test 
output in comment above on about 3 Nov), but maybe it was not clear to the 
developers of this patch because vectorization is pretty new. If a 
vectorization test .q.out file contains in EXPLAIN output the string 
"Vectorized execution: true" then the plan vectorizes. It is important that 
future patches not regress this behavior for performance reasons. I would like 
to see any regressions to vectorization be fixed before patches are applied, 
ideally, or else have some discussion and consensus.

> Move arithmatic UDFs to generic UDF implementations
> ---------------------------------------------------
>
>                 Key: HIVE-5356
>                 URL: https://issues.apache.org/jira/browse/HIVE-5356
>             Project: Hive
>          Issue Type: Task
>          Components: UDF
>    Affects Versions: 0.11.0
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>             Fix For: 0.13.0
>
>         Attachments: HIVE-5356.1.patch, HIVE-5356.10.patch, 
> HIVE-5356.11.patch, HIVE-5356.12.patch, HIVE-5356.2.patch, HIVE-5356.3.patch, 
> HIVE-5356.4.patch, HIVE-5356.5.patch, HIVE-5356.6.patch, HIVE-5356.7.patch, 
> HIVE-5356.8.patch, HIVE-5356.9.patch
>
>
> Currently, all of the arithmetic operators, such as add/sub/mult/div, are 
> implemented as old-style UDFs and java reflection is used to determine the 
> return type TypeInfos/ObjectInspectors, based on the return type of the 
> evaluate() method chosen for the expression. This works fine for types that 
> don't have type params.
> Hive decimal type participates in these operations just like int or double. 
> Different from double or int, however, decimal has precision and scale, which 
> cannot be determined by just looking at the return type (decimal) of the UDF 
> evaluate() method, even though the operands have certain precision/scale. 
> With the default of "decimal" without precision/scale, then (10, 0) will be 
> the type params. This is certainly not desirable.
> To solve this problem, all of the arithmetic operators would need to be 
> implemented as GenericUDFs, which allow returning ObjectInspector during the 
> initialize() method. The object inspectors returned can carry type params, 
> from which the "exact" return type can be determined.
> It's worth mentioning that, for user UDF implemented in non-generic way, if 
> the return type of the chosen evaluate() method is decimal, the return type 
> actually has (10,0) as precision/scale, which might not be desirable. This 
> needs to be documented.
> This JIRA will cover minus, plus, divide, multiply, mod, and pmod, to limit 
> the scope of review. The remaining ones will be covered under HIVE-5706.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

Reply via email to