[ 
https://issues.apache.org/jira/browse/DATAFU-68?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14151117#comment-14151117
 ] 

Jarek Jarcec Cecho commented on DATAFU-68:
------------------------------------------

Thank you for the review [~matterhayes]!

> SampleByKey can throw NullPointerException
> ------------------------------------------
>
>                 Key: DATAFU-68
>                 URL: https://issues.apache.org/jira/browse/DATAFU-68
>             Project: DataFu
>          Issue Type: Bug
>            Reporter: Jarek Jarcec Cecho
>            Assignee: Jarek Jarcec Cecho
>             Fix For: 1.3.0
>
>         Attachments: DATAFU-68.patch, DATAFU-68.patch
>
>
> I've noticed that {{SampleByKey}} can throw {{NullPointerException}}:
> {code}
> Caused by: java.lang.NullPointerException
>       at datafu.pig.sampling.SampleByKey.setUDFContextSignature(SampleByKey.java:86)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.setSignature(POUserFunc.java:604)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.instantiateFunc(POUserFunc.java:127)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.<init>(POUserFunc.java:122)
>       at org.apache.pig.newplan.logical.expression.ExpToPhyTranslationVisitor.visit(ExpToPhyTranslationVisitor.java:505)
>       at org.apache.pig.newplan.logical.expression.UserFuncExpression.accept(UserFuncExpression.java:112)
>       at org.apache.pig.newplan.ReverseDependencyOrderWalkerWOSeenChk.walk(ReverseDependencyOrderWalkerWOSeenChk.java:69)
>       at org.apache.pig.newplan.logical.relational.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:220)
>       at org.apache.pig.newplan.logical.relational.LOFilter.accept(LOFilter.java:79)
>       at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
>       at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
>       at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:310)
>       at org.apache.pig.PigServer.compilePp(PigServer.java:1380)
>       at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1305)
>       at org.apache.pig.PigServer.storeEx(PigServer.java:978)
>       at org.apache.pig.PigServer.store(PigServer.java:942)
>       at org.apache.pig.Pig
> {code}
> I've reproduced the behaviour on the old 1.1.0 release, but the UDF in 
> question has not changed much since then, so I assume trunk is affected 
> the same way. The script that reproduces the issue is simple:
> {code}
> grunt> DEFINE SampleByKey datafu.pig.sampling.SampleByKey('0.5'); 
> grunt> data = LOAD 'datafu/input_datafu' AS (A_id:chararray, B_id:chararray, C:int);
> grunt> out = FILTER data BY SampleByKey(A_id); 
> grunt> DUMP out;
> {code}
> The problem seems to be that the method {{setUDFContextSignature}} can be 
> called with a {{null}} argument, which breaks our code. The documentation 
> for this method does not specify whether {{null}} is allowed. I've looked 
> at other UDFs in Pig, and they appear to handle a {{null}} signature, so 
> I've decided to fix {{SampleByKey}} as well.
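
For illustration only, the null-tolerant pattern described above might look roughly like the sketch below. This is not the attached patch and not the SampleByKey source: the class name, the stored field, and the helper methods are hypothetical stand-ins, and the class deliberately does not extend Pig's EvalFunc so that it compiles without Pig on the classpath. Only the method name and its single String parameter are taken from the stack trace.

```java
// Hypothetical stand-in for a Pig UDF; NOT the DataFu SampleByKey source.
class NullSafeSignatureSketch {
    // Hypothetical per-instance state derived from the signature.
    // Starts as an empty string so later use never sees null.
    private String signature = "";

    // Same name and parameter as the method in the stack trace.
    public void setUDFContextSignature(String signature) {
        if (signature == null) {
            // The report says null can arrive here; keep the default
            // value instead of dereferencing the argument.
            return;
        }
        this.signature = signature;
    }

    // Example of later use that would have thrown if null had been stored.
    public int signatureLength() {
        return signature.length();
    }

    public String getSignature() {
        return signature;
    }

    public static void main(String[] args) {
        NullSafeSignatureSketch udf = new NullSafeSignatureSketch();
        udf.setUDFContextSignature(null);      // tolerated, default kept
        System.out.println(udf.signatureLength());
        udf.setUDFContextSignature("sig-1");   // normal case
        System.out.println(udf.signatureLength());
    }
}
```

Whether ignoring a null signature or substituting a default is the right behaviour depends on how the UDF uses the value afterwards; the sketch only shows the guard itself.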



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
