[ https://issues.apache.org/jira/browse/PIG-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thejas M Nair updated PIG-2144:
-------------------------------

    Status: Patch Available  (was: Open)

> ClassCastException when using IsEmpty(DIFF())
> ----------------------------------------------
>
>                 Key: PIG-2144
>                 URL: https://issues.apache.org/jira/browse/PIG-2144
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1, 0.8.0, 0.9.0
>            Reporter: Mitesh Singh Jat
>            Assignee: Thejas M Nair
>             Fix For: 0.9.0, 0.8.1, 0.8.0
>
>         Attachments: PIG-2144.1.patch
>
>
> I have the following input, <name>:<nickname>, for which I want to find records where name is different from nickname.
> {code:title=input/name_nickname.txt}
> Bharat:Bharat
> Amita:Amita
> Mitesh:Mitesh
> Reenu:Anshu
> Shikha:Shikhu
> Shilpa:Shilpi
> {code}
> I have the following script to find records where name is different from nickname.
> {code:title=isEmpty_diff.pig}
> A = LOAD 'input/name_nickname.txt' using PigStorage(':');
> B = FILTER A BY NOT IsEmpty(DIFF($0, $1));
> DUMP B;
> {code}
> The above Pig script works with older Pig versions (e.g. 0.8.0 (r1043805)) and gives the following output:
> {code:title=output of isEmpty_diff.pig}
> (Reenu,Anshu)
> (Shikha,Shikhu)
> (Shilpa,Shilpi)
> {code}
> However, the above Pig script (isEmpty_diff.pig) fails on Pig 0.9 (e.g. 0.9.0.xx (r1127671)) and on newer versions of Pig 0.8 (e.g. version 0.8.0.xx (r1102885)) with a ClassCastException:
> {code:title=ClassCastException}
> java.lang.ClassCastException: org.apache.pig.data.DefaultDataBag cannot be cast to java.lang.Boolean
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.PONot.getNext(PONot.java:75)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:318)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.processInput(POUserFunc.java:159)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:184)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:269)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.PONot.getNext(PONot.java:71)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:261)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:256)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:58)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:676)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:336)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:210)
> {code}
> As a workaround, I used the following Pig script.
> {code:title=isEmpty_diff2.pig}
> A = LOAD 'input/name_nickname.txt' using PigStorage(':');
> --B = FILTER A BY NOT IsEmpty(DIFF($0, $1));
> B1 = FOREACH A GENERATE $0, $1, DIFF($0, $1);
> B2 = FILTER B1 BY NOT IsEmpty($2);
> B = FOREACH B2 GENERATE $0, $1;
> DUMP B;
> {code}
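
For reference, the result that the one-line filter is expected to produce can be modelled outside Pig. The following is a minimal Python sketch, not Pig code and not part of the attached patch. The helper names `diff`, `is_empty`, `load`, and `filter_differing` are hypothetical. The model of DIFF on two scalar fields (an empty bag when the values match, otherwise a bag holding the non-matching values) is an assumption that agrees with the output reported for the older Pig versions.

```python
# Sample records from input/name_nickname.txt in the report.
INPUT = [
    "Bharat:Bharat",
    "Amita:Amita",
    "Mitesh:Mitesh",
    "Reenu:Anshu",
    "Shikha:Shikhu",
    "Shilpa:Shilpi",
]


def load(lines, sep=":"):
    """Split each line into a tuple of fields, as PigStorage(':') would."""
    return [tuple(line.split(sep)) for line in lines]


def diff(a, b):
    """Assumed model of DIFF on two scalars: a list stands in for the bag."""
    return [] if a == b else [(a,), (b,)]


def is_empty(bag):
    """Model of IsEmpty: true when the bag holds no tuples."""
    return len(bag) == 0


def filter_differing(records):
    """Model of FILTER A BY NOT IsEmpty(DIFF($0, $1))."""
    return [r for r in records if not is_empty(diff(r[0], r[1]))]


if __name__ == "__main__":
    for rec in filter_differing(load(INPUT)):
        print("(" + ",".join(rec) + ")")
```

On the sample input, the model keeps only the records whose name and nickname differ, which is the result both the original script (on older Pig versions) and the workaround script are meant to give.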