[ 
https://issues.apache.org/jira/browse/HIVE-1376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874232#action_12874232
 ] 

John Sichi commented on HIVE-1376:
----------------------------------

Some more details on this:

* In the case of a full-table aggregation (no GROUP BY key) where no rows exist 
(or all are filtered out), the aggregation framework sends a row of all nulls 
to the aggregator.  I don't know why this is necessary, since all of the 
existing aggregators ignore nulls anyway.

* Since the percentile UDAF declares a primitive double as the parameter type 
of the iterate method (rather than a Double or a DoubleWritable), Java 
reflection throws an IllegalArgumentException because it can't convert a null 
to a primitive.
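
To make the second point concrete, here is a minimal standalone sketch of the 
reflection behavior. It does not use any Hive classes: PrimitiveEvaluator, 
BoxedEvaluator and invokeIterate are hypothetical names invented for this 
illustration, and Long stands in for LongWritable.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class NullToPrimitiveDemo {
    // Hypothetical evaluator: the second parameter is a primitive double.
    public static class PrimitiveEvaluator {
        public boolean iterate(Long value, double percentile) {
            return true;
        }
    }

    // Hypothetical evaluator: the second parameter is a boxed Double,
    // so a null argument can be passed through and ignored.
    public static class BoxedEvaluator {
        public boolean iterate(Long value, Double percentile) {
            return value != null && percentile != null;
        }
    }

    // Reflectively calls iterate(null, null) and reports what happened.
    public static String invokeIterate(Object evaluator) {
        for (Method m : evaluator.getClass().getMethods()) {
            if (!m.getName().equals("iterate")) {
                continue;
            }
            try {
                m.invoke(evaluator, new Object[] {null, null});
                return "ok";
            } catch (IllegalArgumentException e) {
                // Raised by reflection: null cannot be unwrapped to a primitive.
                return "IllegalArgumentException";
            } catch (IllegalAccessException | InvocationTargetException e) {
                throw new RuntimeException(e);
            }
        }
        throw new IllegalStateException("no iterate method found");
    }

    public static void main(String[] args) {
        System.out.println(invokeIterate(new PrimitiveEvaluator()));
        System.out.println(invokeIterate(new BoxedEvaluator()));
    }
}
```

The primitive variant fails inside Method.invoke before iterate is ever 
entered, whereas the boxed variant receives the nulls and can ignore them 
itself.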

There are three possible solutions:

(1) change percentile to use a non-primitive type

(2) inspect the iterate parameter types via reflection and skip sending the 
null whenever the corresponding parameter type is primitive

(3) avoid sending the null in the first place (unless someone can explain why 
it's needed, or some regression test fails when we try it)
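
As a rough sketch of what option (2) could look like, the check below skips 
the reflective call whenever a null would land on a primitive parameter. This 
is not Hive's actual bridge code: PercentileLike, hasNullForPrimitive and 
safeIterate are hypothetical names used only for illustration.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class SkipNullForPrimitive {
    // Hypothetical evaluator with a primitive second parameter.
    public static class PercentileLike {
        public int calls = 0;

        public boolean iterate(Long value, double percentile) {
            calls++;
            return true;
        }
    }

    // True if any null argument lines up with a primitive parameter type.
    public static boolean hasNullForPrimitive(Method m, Object[] args) {
        Class<?>[] types = m.getParameterTypes();
        for (int i = 0; i < types.length; i++) {
            if (types[i].isPrimitive() && args[i] == null) {
                return true;
            }
        }
        return false;
    }

    // Calls iterate reflectively; returns false if the call was skipped.
    public static boolean safeIterate(Object evaluator, Object[] args) {
        for (Method m : evaluator.getClass().getMethods()) {
            if (!m.getName().equals("iterate")) {
                continue;
            }
            if (hasNullForPrimitive(m, args)) {
                return false; // skip: reflection could not unwrap the null
            }
            try {
                m.invoke(evaluator, args);
                return true;
            } catch (IllegalAccessException | InvocationTargetException e) {
                throw new RuntimeException(e);
            }
        }
        throw new IllegalStateException("no iterate method found");
    }
}
```

The cost of this approach is an extra parameter-type scan per call unless the 
result is cached per method, which is one reason option (3) may be simpler if 
the all-null row turns out to be unnecessary.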


> Simple UDAFs with more than 1 parameter crash on empty row query 
> -----------------------------------------------------------------
>
>                 Key: HIVE-1376
>                 URL: https://issues.apache.org/jira/browse/HIVE-1376
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.6.0
>            Reporter: Mayank Lahiri
>             Fix For: 0.6.0
>
>
> Simple UDAFs with more than 1 parameter crash when the query returns no rows. 
> Currently, this only seems to affect the percentile() UDAF where the second 
> parameter is the percentile to be computed (of type double). I've also 
> verified the bug by adding a dummy parameter to ExampleMin in contrib. 
> On an empty query, Hive seems to be trying to resolve an iterate() method 
> with signature {null,null} instead of {null,double}. You can reproduce this 
> bug using:
> CREATE TABLE pct_test ( val INT );
> SELECT percentile(val, 0.5) FROM pct_test;
> which produces a lot of errors like: 
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to 
> execute method public boolean 
> org.apache.hadoop.hive.ql.udf.UDAFPercentile$PercentileLongEvaluator.iterate(org.apache.hadoop.io.LongWritable,double)
>   on object 
> org.apache.hadoop.hive.ql.udf.udafpercentile$percentilelongevalua...@11d13272 
> of class org.apache.hadoop.hive.ql.udf.UDAFPercentile$PercentileLongEvaluator 
> with arguments {null, null} of size 2

