A simple pig script to get the distinct elements of a gouped bag. data = LOAD .............; data_pro = FOREACH data GENERATE custid, ip, userid; data_group = GROUP data_pro BY custid; data_group_pro = FOREACH data_group GENERATE group, Distinct(data_pro.userid), Distinct(data_pro.ip);
But there's error: java.lang.ClassCastException: org.apache.pig.data.SingleTupleBag cannot be cast to org.apache.pig.data.Tuple at org.apache.pig.builtin.Distinct$Initial.exec(Distinct.java:86) at org.apache.pig.builtin.Distinct$Initial.exec(Distinct.java:75) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:337) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:376) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:354) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:372) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:297) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:308) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:263) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330) at org.apache.hadoop.mapred.Child$4.run(Child.java:268) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) at org.apache.hadoop.mapred.Child.main(Child.java:262) I am using pig 0.11 which is packaged in chd4.3.2, where i can see the source code: http://grepcode.com/file/repository.cloudera.com/content/repositories/releases/org.apache.pig/pig/0.11.0-cdh4.3.2/org/apache/pig/builtin/Distinct.java Any insight on this?Thanks,Lei leiwang...@gmail.com