[ 
https://issues.apache.org/jira/browse/PIG-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-2144:
-------------------------------

    Attachment: PIG-2144.08.1.patch

The patch passed unit tests and test-patch. Committed the patch to trunk and 0.9. 
Committed a slightly modified patch for 0.8: PIG-2144.08.1.patch

> ClassCastException when using IsEmpty(DIFF()) 
> ----------------------------------------------
>
>                 Key: PIG-2144
>                 URL: https://issues.apache.org/jira/browse/PIG-2144
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.0, 0.8.1, 0.9.0
>            Reporter: Mitesh Singh Jat
>            Assignee: Thejas M Nair
>             Fix For: 0.8.0, 0.8.1, 0.9.0
>
>         Attachments: PIG-2144.08.1.patch, PIG-2144.1.patch
>
>
> I have the following input in <name>:<nickname> format, from which I want to find the records 
> where the name differs from the nickname.
> {code:title=input/name_nickname.txt}
> Bharat:Bharat
> Amita:Amita
> Mitesh:Mitesh
> Reenu:Anshu
> Shikha:Shikhu
> Shilpa:Shilpi
> {code}
> I use the following script to find the records where the name differs from the nickname.
> {code:title=isEmpty_diff.pig}
> A = LOAD 'input/name_nickname.txt' using PigStorage(':');
> B = FILTER A BY NOT IsEmpty(DIFF($0, $1));
> DUMP B;
> {code}
> The above Pig script works with older Pig versions (e.g. 0.8.0 (r1043805)) 
> and gives the following output:
> {code:title=output of isEmpty_diff.pig}
> (Reenu,Anshu)
> (Shikha,Shikhu)
> (Shilpa,Shilpi)
> {code}
> However, the same script (isEmpty_diff.pig) fails with a ClassCastException on Pig 0.9 (e.g. 
> 0.9.0.xx (r1127671)) and on newer builds of Pig 0.8 (e.g. version 0.8.0.xx 
> (r1102885)):
> {code:title=ClassCastException}
> java.lang.ClassCastException: org.apache.pig.data.DefaultDataBag cannot be cast to java.lang.Boolean
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.PONot.getNext(PONot.java:75)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:318)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.processInput(POUserFunc.java:159)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:184)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:269)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.PONot.getNext(PONot.java:71)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:261)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:256)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:58)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:676)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:336)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:210)
> {code}
> As a workaround, I used the following pig script.
> {code:title=isEmpty_diff2.pig}
> A = LOAD 'input/name_nickname.txt' using PigStorage(':');
> --B = FILTER A BY NOT IsEmpty(DIFF($0, $1));
> B1 = FOREACH A GENERATE $0, $1, DIFF($0, $1);
> B2 = FILTER B1 BY NOT IsEmpty($2);
> B = FOREACH B2 GENERATE $0, $1;
> DUMP B;
> {code}
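> A possible alternative to the workaround above, sketched here as an assumption rather than 
> taken from the report: when both fields are plain scalar values, comparing them directly 
> avoids DIFF and IsEmpty altogether. The schema names (name, nickname) and the script title 
> are illustrative only.
> {code:title=compare_fields.pig (hypothetical)}
> -- Assumed schema; the original script loads the fields without one.
> A = LOAD 'input/name_nickname.txt' using PigStorage(':') AS (name:chararray, nickname:chararray);
> -- Keep only the records whose two fields differ.
> B = FILTER A BY name != nickname;
> DUMP B;
> {code}
> Note that a comparison involving a null field evaluates to null, so such records are 
> filtered out.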

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
