[
https://issues.apache.org/jira/browse/PIG-508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Pradeep Kamath updated PIG-508:
-------------------------------
Attachment: PIG-508.patch
Attached patch. Details on the fix:
POPackageAnnotator is a visitor which looks for "POPackage" and annotates it
with "keyInfo" from each of the local rearranges which provide input to the
POPackage. The keyInfo records which parts of the "value" for a given input
are already present in the "key" and are hence omitted from the "value". The
visitor incorrectly assumed that once a local rearrange corresponding to the
package was found in the given MROper's map plan, the annotation was complete.
This breaks for the script in this issue: the POPackage has one of its local
rearranges in the map plan of the same MROper as the POPackage, and the other
local rearrange in the reduce plan of the predecessor MROper. The visitor was
therefore changed to ensure that the POPackage is annotated with information
from *all* of its local rearranges.
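To illustrate the idea (not the patch itself), here is a minimal, self-contained Java sketch of the changed lookup. All class, field and method names in it (PackageAnnotationSketch, LocalRearrangeOp, PackageOp, MROperSketch, annotate, buildIssueScenario) are hypothetical stand-ins, not the real Pig classes. It assumes each package knows how many inputs it expects, and that a missing local rearrange can only sit in the reduce plan of a direct predecessor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PackageAnnotationSketch {

    // Hypothetical stand-in for a local rearrange feeding one input of a package.
    static final class LocalRearrangeOp {
        final String packageId;
        final int inputIndex;
        final String keyInfo;

        LocalRearrangeOp(String packageId, int inputIndex, String keyInfo) {
            this.packageId = packageId;
            this.inputIndex = inputIndex;
            this.keyInfo = keyInfo;
        }
    }

    // Hypothetical stand-in for a package; keyInfo is stored per input index.
    static final class PackageOp {
        final String id;
        final int numInputs;
        final Map<Integer, String> keyInfo = new TreeMap<>();

        PackageOp(String id, int numInputs) {
            this.id = id;
            this.numInputs = numInputs;
        }
    }

    // Hypothetical stand-in for a map-reduce operator with its two plans.
    static final class MROperSketch {
        final List<LocalRearrangeOp> mapPlan = new ArrayList<>();
        final List<LocalRearrangeOp> reducePlan = new ArrayList<>();
        final List<MROperSketch> predecessors = new ArrayList<>();
        PackageOp pkg; // package in this operator's reduce plan, may be null
    }

    // Copies keyInfo from every matching local rearrange in the plan.
    static void annotateFrom(List<LocalRearrangeOp> plan, PackageOp pkg) {
        for (LocalRearrangeOp lr : plan) {
            if (lr.packageId.equals(pkg.id)) {
                pkg.keyInfo.put(lr.inputIndex, lr.keyInfo);
            }
        }
    }

    // Keeps searching until all inputs are annotated, instead of stopping
    // as soon as one local rearrange is found in the map plan.
    static void annotate(MROperSketch op) {
        PackageOp pkg = op.pkg;
        if (pkg == null) {
            return;
        }
        annotateFrom(op.mapPlan, pkg);
        if (pkg.keyInfo.size() < pkg.numInputs) {
            for (MROperSketch pred : op.predecessors) {
                annotateFrom(pred.reducePlan, pkg);
            }
        }
        if (pkg.keyInfo.size() != pkg.numInputs) {
            throw new IllegalStateException("package " + pkg.id + " has "
                    + pkg.keyInfo.size() + " of " + pkg.numInputs + " inputs annotated");
        }
    }

    // Models the failing script: input 0 (b) arrives via the predecessor's
    // reduce plan, input 1 (c) via the map plan of the cogroup's own operator.
    static MROperSketch buildIssueScenario() {
        MROperSketch groupJob = new MROperSketch();
        groupJob.reducePlan.add(new LocalRearrangeOp("cogroup", 0, "keyInfo-b"));

        MROperSketch cogroupJob = new MROperSketch();
        cogroupJob.mapPlan.add(new LocalRearrangeOp("cogroup", 1, "keyInfo-c"));
        cogroupJob.predecessors.add(groupJob);
        cogroupJob.pkg = new PackageOp("cogroup", 2);
        return cogroupJob;
    }

    public static void main(String[] args) {
        MROperSketch op = buildIssueScenario();
        annotate(op);
        System.out.println(op.pkg.keyInfo);
    }
}
```

With the old early-exit behaviour, only input 1 would have been annotated in this scenario, leaving input 0 without keyInfo when the package runs in the reducer.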
> Query with a cogroup have one of its inputs coming from a group fails
> ---------------------------------------------------------------------
>
> Key: PIG-508
> URL: https://issues.apache.org/jira/browse/PIG-508
> Project: Pig
> Issue Type: Bug
> Affects Versions: types_branch
> Reporter: Pradeep Kamath
> Assignee: Pradeep Kamath
> Fix For: types_branch
>
> Attachments: PIG-508.patch
>
>
> Script which fails:
> {code}
> a = load '/user/pig/tests/data/singlefile/studenttab10k';
> b = group a by $0;
> c = load '/user/pig/tests/data/singlefile/studenttab10k';
> d = cogroup b by $0, c by $0;
> e = foreach d generate group, c.$1, SUM(c.$1), COUNT(c);
> dump e;
> {code}
> Error message produced:
> {noformat}
> 08/10/23 15:23:54 ERROR mapReduceLayer.MapReduceLauncher: Job failed!
> 08/10/23 15:23:54 ERROR mapReduceLayer.Launcher: Error message from task (reduce) task_200810231521_0007_r_000000java.lang.NullPointerException
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage.getNext(POPackage.java:218)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:208)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:134)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:318)
> at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
> {noformat}