[ 
https://issues.apache.org/jira/browse/PIG-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13119528#comment-13119528
 ] 

Doug Daniels commented on PIG-2313:
-----------------------------------

I created a patch that passes an IllustratorDummyReporter into the 
IllustratorContext in the hadoop20 PigMapReduce.Reduce.  This fixes the NPE, but 
yields a new Exception about being unable to set up the store function:

{code}
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: 
Exception: Unable to setup the store function.
        at 
org.apache.pig.pen.LineageTrimmingVisitor.checkNewBaseData(LineageTrimmingVisitor.java:420)
        at 
org.apache.pig.pen.LineageTrimmingVisitor.processLoad(LineageTrimmingVisitor.java:374)
        at 
org.apache.pig.pen.LineageTrimmingVisitor.processOperator(LineageTrimmingVisitor.java:438)
        ... 19 more
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2081: 
Unable to setup the store function.
        at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.setUp(POStore.java:111)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.cleanup(PigGenericMapReduce.java:525)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:178)
        at 
org.apache.pig.pen.LocalMapReduceSimulator.launchPig(LocalMapReduceSimulator.java:222)
        at 
org.apache.pig.pen.ExampleGenerator.getData(ExampleGenerator.java:260)
        at 
org.apache.pig.pen.ExampleGenerator.getData(ExampleGenerator.java:278)
        at 
org.apache.pig.pen.LineageTrimmingVisitor.checkNewBaseData(LineageTrimmingVisitor.java:418)
        ... 21 more
Caused by: java.io.IOException: File already 
exists:file:/Users/ddaniels/code/sandbox/jira/output_new/_temporary/_attempt__0000_r_000000_0/part-r-00000
        at 
org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:228)
        at 
org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:335)
        at 
org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:368)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:484)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:465)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:372)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextOutputFormat.getRecordWriter(PigTextOutputFormat.java:98)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReducePOStoreImpl.createStoreFunc(MapReducePOStoreImpl.java:85)
        at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.setUp(POStore.java:103)
        ... 27 more
================================================================================
{code}

Does anyone know what would cause this?  The Exception above is from a run in 
local mode, but the same problem occurs in MapReduce mode (I also checked that 
the output dir does not exist).
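
To make the innermost "File already exists" cause concrete, here is a minimal 
sketch of the general failure mode: a create call that refuses to overwrite is 
invoked twice on the same path.  It is not a diagnosis of this bug and uses no 
Pig or Hadoop code; CreateTwiceSketch and createNoOverwrite are hypothetical 
names, and the temporary directory only stands in for a task attempt dir.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CreateTwiceSketch {
    /**
     * Hypothetical helper: creates the file without overwriting.
     * Returns true if the file was created, false if it already existed.
     */
    static boolean createNoOverwrite(Path p) throws IOException {
        try {
            Files.createFile(p);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a task attempt directory (assumption, not Pig's layout).
        Path dir = Files.createTempDirectory("attempt");
        Path part = dir.resolve("part-r-00000");

        System.out.println(createNoOverwrite(part)); // first create succeeds
        System.out.println(createNoOverwrite(part)); // second create on the same path fails

        Files.deleteIfExists(part);
        Files.deleteIfExists(dir);
    }
}
```

If something like this is happening, the thing to look for would be a second 
store setup writing to the same attempt path within one ILLUSTRATE run.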

Also, I tried to add a test in TestExampleGenerator.java in the patch, but it 
complains "Internal error. Did not find roots in the physical plan."  Is there 
something special that needs to be done to get the store to appear in the plan?
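
For reference, the idea behind the reporter change at the top of this comment 
can be sketched in isolation.  This is a simplified illustration, not the 
patch itself: Reporter, DummyReporter and Context are hypothetical stand-ins 
for the Hadoop/Pig classes, and only the null-versus-dummy behaviour is 
modelled.

```java
import java.util.HashMap;
import java.util.Map;

public class ReporterSketch {
    /** Hypothetical stand-in for a status reporter that owns counters. */
    interface Reporter {
        long increment(String counter, long delta);
    }

    /** Hypothetical in-memory reporter, playing the role of a dummy reporter. */
    static class DummyReporter implements Reporter {
        private final Map<String, Long> counts = new HashMap<>();

        @Override
        public long increment(String counter, long delta) {
            // Adds delta to the named counter and returns the new total.
            return counts.merge(counter, delta, Long::sum);
        }
    }

    /** Hypothetical context that delegates counter access to its reporter. */
    static class Context {
        private final Reporter reporter;

        Context(Reporter reporter) {
            this.reporter = reporter;
        }

        long count(String counter) {
            // Throws NullPointerException when the context was built with null.
            return reporter.increment(counter, 1L);
        }
    }

    /** Returns true if touching a counter through ctx throws an NPE. */
    static boolean throwsNpe(Context ctx) {
        try {
            ctx.count("records");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsNpe(new Context(null)));                // true
        System.out.println(throwsNpe(new Context(new DummyReporter()))); // false
    }
}
```

Supplying a do-little reporter instead of null keeps the counter lookup from 
failing without changing the calling code, which is the same trade-off the 
Map side already makes.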
                
> NPE in ILLUSTRATE trying to get StatusReporter in STORE
> -------------------------------------------------------
>
>                 Key: PIG-2313
>                 URL: https://issues.apache.org/jira/browse/PIG-2313
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.9.0, 0.10
>            Reporter: Doug Daniels
>         Attachments: PIG-2313-example.tar.gz, PIG-2313-first_shot.patch
>
>
> I'm seeing an NPE trying to do an illustrate on a script.  So far the 
> simplest version of the script that exhibits the issue is:
> {code}
> raw = LOAD 'data.txt' USING PigStorage() AS (x:int, y:int);
> filtered = FILTER raw BY x < 5;
> grouped = GROUP filtered BY x;
> counted = FOREACH grouped
>          GENERATE group AS x,
>                   COUNT(filtered) AS the_count;
> rmf output;
> STORE counted INTO 'output';
> {code}
> I had to pass a few nested Exceptions along to get it, but the bottom stack 
> trace looks like:
> {code}
> Caused by: java.lang.NullPointerException
>       at 
> org.apache.hadoop.mapreduce.TaskInputOutputContext.getCounter(TaskInputOutputContext.java:88)
>       at 
> org.apache.pig.tools.pigstats.PigStatusReporter.getCounter(PigStatusReporter.java:60)
>       at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReducePOStoreImpl.createRecordCounter(MapReducePOStoreImpl.java:121)
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.setUp(POStore.java:108)
>       at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.cleanup(PigGenericMapReduce.java:525)
>       at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:178)
>       at 
> org.apache.pig.pen.LocalMapReduceSimulator.launchPig(LocalMapReduceSimulator.java:222)
>       at 
> org.apache.pig.pen.ExampleGenerator.getData(ExampleGenerator.java:257)
>       at 
> org.apache.pig.pen.ExampleGenerator.getData(ExampleGenerator.java:275)
>       at 
> org.apache.pig.pen.LineageTrimmingVisitor.checkNewBaseData(LineageTrimmingVisitor.java:418)
>       ... 21 more
> {code}
> It looks like the IllustratorContext in the hadoop20 PigMapReduce.java shim 
> (line 73) is being set up with a null StatusReporter.  This seems to be for 
> the Reduce phase.  On the other hand, PigMapBase.java sets up the 
> IllustratorContext with an IllustratorDummyReporter for the Map phase.
> Eventually, when the code in MapReducePOStoreImpl line 121 tries to get the 
> reporter, it fails with the NPE.
