Rishi Pandey created PIG-3907:
---------------------------------

             Summary: In-Built function COR does not work with any other 
numeric type other than double. 
                 Key: PIG-3907
                 URL: https://issues.apache.org/jira/browse/PIG-3907
             Project: Pig
          Issue Type: Bug
          Components: build, piggybank
    Affects Versions: 0.11.1
            Reporter: Rishi Pandey


Apache Pig provides a built-in function, COR (correlation), which is used to 
calculate the correlation between variables. 
COR does not work if any of the variables has datatype int or long. 
We have to cast the variables to double explicitly in the Pig script, which is 
never a good idea on the UI end. 

I unit tested the correlation function with int values, and it fails to 
iterate over the bag. The same happens when a mix of int, long and double 
variables is passed to COR. My unit test with doubles only, however, gives 
the correct output. 
I also ran the script on a Hadoop cluster; it fails if any variable has a 
type other than double. 
It shows the following error on the Hadoop cluster:
ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2999: Unexpected internal error. null
or sometimes:
ERROR 1066: Unable to open iterator for alias aliasName. Backend error : null

The Java code of the COR function casts everything to double, which is 
correct. But in the computeAll(--,--) function, the casts applied to the 
iterator values to obtain x and y create a problem. 

Exact code:
double x =(Double)iterator_x.next().get(0);  // error when int or long
double y =(Double)iterator_y.next().get(0); // error when int or long
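
To illustrate why such a cast can fail, here is a minimal standalone sketch. 
It is not Pig's source code: the class CorCastSketch and the methods 
strictCast and toDouble are hypothetical names used only for this example. 
It assumes the bag fields arrive as boxed Integer, Long, Float or Double 
objects, and compares the direct (Double) cast with a conversion through 
java.lang.Number, which could be one alternative to per-type classes.

```java
import java.util.Arrays;
import java.util.List;

// Standalone illustration; none of these names come from Pig itself.
class CorCastSketch {

    // Mirrors the cast quoted above: it succeeds only when the runtime
    // object really is a java.lang.Double.
    public static double strictCast(Object field) {
        return (Double) field;
    }

    // Hypothetical alternative: Integer, Long, Float and Double all extend
    // java.lang.Number, so doubleValue() handles each of them.
    public static double toDouble(Object field) {
        if (field instanceof Number) {
            return ((Number) field).doubleValue();
        }
        String type = (field == null) ? "null" : field.getClass().getName();
        throw new IllegalArgumentException("Expected a numeric value, got " + type);
    }

    public static void main(String[] args) {
        // One boxed value per numeric type mentioned in the report.
        List<Object> fields = Arrays.<Object>asList(1, 2L, 3.0f, 4.0d);
        for (Object field : fields) {
            String strict;
            try {
                strict = String.valueOf(strictCast(field));
            } catch (ClassCastException e) {
                strict = "ClassCastException";
            }
            System.out.println(field.getClass().getSimpleName()
                    + ": (Double) cast -> " + strict
                    + ", Number.doubleValue() -> " + toDouble(field));
        }
    }
}
```

In this sketch only the Double input survives the direct cast, while the 
Number-based conversion accepts all four types. Whether the same change is 
appropriate inside COR itself would need to be checked against the actual 
Pig source.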

Solution: one option is to override the getArgToFuncMapping() method and 
define separate classes such as IntCOR, LongCOR and FloatCOR, as is done for 
some other UDFs like VAR. 

Please fix the issue in piggybank as well as in Pig's built-in library. 
I am using Apache Pig 0.11. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)
