I made a Jira. https://issues.apache.org/jira/browse/PIG-2132
Should be pretty easy to fix. I'll probably do so over the weekend if nobody else gets to it first.

2011/6/17 Alan Gates <ga...@yahoo-inc.com>

> MAX should definitely handle null, and it should ignore it. The goal for
> our SQL-like built-in aggregate functions (MIN, MAX, COUNT, SUM, AVG) is
> to be SQL-like, and SQL ignores nulls in these functions. It's
> inconsistent, but it's usually what people want. So we should be
> consistently inconsistent, like SQL. :)
>
> Alan.

On Jun 16, 2011, at 1:07 PM, Daniel Dai wrote:

> I take this back after seeing Dmitriy's reply. It seems it is not that
> straightforward.
>
> Daniel

On 06/16/2011 01:00 PM, Daniel Dai wrote:

> Yes, I think it is better if MAX can handle NULL. Can you open a Jira?
>
> Daniel

On 06/16/2011 12:16 PM, Jonathan Coveney wrote:

> Do we want the MAX function to be able to handle nulls? Seems fairly
> natural for it to be able to.

2011/6/16 Daniel Dai <jiany...@yahoo-inc.com>

> Jonathan is right: math.MAX does not handle null input. Checking for null
> before feeding values into MAX is necessary.
>
> Daniel

On 06/16/2011 06:45 AM, Jonathan Coveney wrote:

> Can you check if your rank2 or rank3 values are ever null? If they are,
> there are some ad hoc fixes you can apply until this is fixed (and it is
> easy to fix, just a question of deciding what the desired handling of
> null values should be). I would do something like:
>
>   A = LOAD 'FF23_Filtered.txt' AS (appID, rank2, rank3);
>   B = FILTER A BY rank2 is null AND rank3 is null;
>   C = FOREACH A GENERATE appID,
>         (rank2 is null ? rank3 : rank2) AS rank2,
>         (rank3 is null ? rank2 : rank3) AS rank3;
>
> Obviously you could tweak that for whatever you want to happen if a
> value is null.
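[Editor's note: the SQL-style semantics Alan describes (ignore nulls; return null only when every input is null) can be sketched in plain Java. This is an illustration only, not the piggybank source; the class and method names here are made up.]

```java
// Illustration of SQL-style null handling for a two-argument MAX:
// a null operand is ignored rather than causing a NullPointerException,
// and the result is null only when both inputs are null.
public class SqlLikeMax {
    static Double max(Double a, Double b) {
        if (a == null) return b;   // ignore the null operand
        if (b == null) return a;
        return Math.max(a, b);
    }

    public static void main(String[] args) {
        System.out.println(max(1.0, null));  // prints 1.0, not an NPE
        System.out.println(max(null, null)); // prints null: all inputs null
    }
}
```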
2011/6/16 Jonathan Coveney <jcove...@gmail.com>

> Hm, just to make sure, I ran this against trunk (to see if it's just a
> 0.7.0 thing or not):
>
>   A = LOAD 'test.txt'; -- this is just a blank one-line file
>   B = FOREACH A GENERATE org.apache.pig.piggybank.evaluation.math.MAX(1, null);
>
> I also tested feeding it files from test.txt etc. It fails when there is
> a null value. The cast does not.

2011/6/16 Lakshminarayana Motamarri <narayana.gupta123@gmail.com>

> Hi all,
>
> I am receiving the following exception:
>
> org.apache.pig.backend.executionengine.ExecException: ERROR 2078: Caught error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught exception processing input row [null]]
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:229)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:263)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:269)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:204)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:249)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:240)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>         at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Caused by: java.io.IOException: Caught exception processing input row [null]
>         at org.apache.pig.piggybank.evaluation.math.DoubleMax.exec(DoubleMax.java:70)
>         at org.apache.pig.piggybank.evaluation.math.DoubleMax.exec(DoubleMax.java:57)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:201)
>         ... 10 more
> Caused by: java.lang.NullPointerException
>         ... 13 more
>
> My code:
>
>   FFW2 = LOAD 'final_free_w2.txt';
>   FFW3 = LOAD 'final_free_w3.txt';
>   FFW2_RankG_RankCate = FOREACH FFW2 GENERATE $0, $4, $3;
>   FFW3_RankG_RankCate = FOREACH FFW3 GENERATE $0, $4, $3;
>   FF23 = JOIN FFW2_RankG_RankCate BY $0, FFW3_RankG_RankCate BY $0;
>   FF23_Filtered = FOREACH FF23 GENERATE $0, $2, $5;
>   STORE FF23_Filtered INTO 'FF23_Filtered.txt';
>
>   REGISTER /home/training/Desktop/1pig/pig-0.7.0/contrib/piggybank/piggybank.jar
>   A = LOAD 'FF23_Filtered.txt' AS (appID, rank2, rank3);
>   B = FOREACH A GENERATE appID,
>         org.apache.pig.piggybank.evaluation.math.MAX((double)rank2, (double)rank3);
>   STORE B INTO 'FF23_FJM.txt';
>
> Can anyone please let me know the exact reason causing the above
> exception? I also made sure that the file FF23_Filtered.txt is not empty.
>
> ---
> Thanks & Regards,
> Narayan.
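[Editor's note: the trace above bottoms out in a NullPointerException inside DoubleMax.exec, i.e. the UDF dereferences a null input field. A null guard at the top of exec avoids this. The sketch below is plain Java against a stand-in List&lt;Object&gt; argument, not the real Pig Tuple API or the actual DoubleMax source.]

```java
import java.util.Arrays;
import java.util.List;

// Sketch of a null-tolerant exec(): return null (or the non-null operand)
// instead of throwing when an input field is null.
// List<Object> stands in for Pig's Tuple; this is an assumption for
// illustration, not the piggybank implementation.
public class NullTolerantDoubleMax {
    static Double exec(List<Object> input) {
        if (input == null || input.size() < 2) return null;
        Object a = input.get(0), b = input.get(1);
        if (a == null && b == null) return null;  // nothing to compare
        if (a == null) return ((Number) b).doubleValue();
        if (b == null) return ((Number) a).doubleValue();
        return Math.max(((Number) a).doubleValue(), ((Number) b).doubleValue());
    }

    public static void main(String[] args) {
        // The failing case from the thread: one null operand no longer throws.
        System.out.println(exec(Arrays.asList(1.0, null)));  // prints 1.0
    }
}
```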