New logical plan: Dereference does not add into plan after deepCopy
-------------------------------------------------------------------

                 Key: PIG-1729
                 URL: https://issues.apache.org/jira/browse/PIG-1729
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.8.0
            Reporter: Daniel Dai
            Assignee: Daniel Dai
             Fix For: 0.8.0
         Attachments: PIG-1729-0.patch

The following script fails:
{code}
a = load '1.txt' as (a0:int, a1:int, a2:int);
b = load '2.txt' as (b0:int, b1:int);
c = cogroup a by a0, b by b0;
d = foreach c generate ((COUNT(a)==0L)?null : a.a0) as d0;
e = foreach d generate flatten(d0);
f = group e all;
explain f;
{code}

Error message:
ERROR 2000: Error processing rule GroupByConstParallelSetter. Try -t GroupByConstParallelSetter

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1067: Unable to explain alias f
        at org.apache.pig.PigServer.explain(PigServer.java:958)
        at org.apache.pig.tools.grunt.GruntParser.explainCurrentBatch(GruntParser.java:353)
        at org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:285)
        at org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:248)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.Explain(PigScriptParser.java:605)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:327)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:90)
        at org.apache.pig.Main.run(Main.java:498)
        at org.apache.pig.Main.main(Main.java:107)
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2042: Error in new logical plan. Try -Dpig.usenewlogicalplan=false.
        at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:309)
        at org.apache.pig.PigServer.compilePp(PigServer.java:1354)
        at org.apache.pig.PigServer.explain(PigServer.java:927)
        ... 10 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2000: Error processing rule GroupByConstParallelSetter. Try -t GroupByConstParallelSetter
        at org.apache.pig.newplan.optimizer.PlanOptimizer.optimize(PlanOptimizer.java:120)
        at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:277)
        ... 12 more
Caused by: java.lang.NullPointerException
        at org.apache.pig.newplan.logical.relational.LogicalSchema$LogicalFieldSchema.compatible(LogicalSchema.java:106)
        at org.apache.pig.newplan.logical.relational.LogicalSchema$LogicalFieldSchema.mergeUid(LogicalSchema.java:116)
        at org.apache.pig.newplan.logical.expression.ProjectExpression.getFieldSchema(ProjectExpression.java:153)
        at org.apache.pig.newplan.logical.optimizer.FieldSchemaResetter.execute(SchemaResetter.java:175)
        at org.apache.pig.newplan.logical.expression.AllSameExpressionVisitor.visit(AllSameExpressionVisitor.java:53)
        at org.apache.pig.newplan.logical.expression.ProjectExpression.accept(ProjectExpression.java:75)
        at org.apache.pig.newplan.ReverseDependencyOrderWalker.walk(ReverseDependencyOrderWalker.java:70)
        at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
        at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:87)
        at org.apache.pig.newplan.logical.relational.LOGenerate.accept(LOGenerate.java:225)
        at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
        at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:76)
        at org.apache.pig.newplan.logical.relational.LOForEach.accept(LOForEach.java:71)
        at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
        at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
        at org.apache.pig.newplan.logical.optimizer.SchemaPatcher.transformed(SchemaPatcher.java:43)
        at org.apache.pig.newplan.optimizer.PlanOptimizer.optimize(PlanOptimizer.java:112)
        ... 13 more

The cause is in the MergeForEach rule: after it deep-copies the expression
plan of the second foreach, Pig does not add the copied Dereference operator
to the new plan. Disabling any one of the following suppresses the error:
column pruning (so there is no extra foreach after the cogroup), MergeForEach,
or GroupByConstParallelSetter (so no global schema regeneration is triggered).
A separate, minor issue is that GroupByConstParallelSetter should not
regenerate the schema at all, since the schema does not change after this rule.
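
To illustrate the failure mode described above, here is a minimal toy model in
Java. It is only a sketch under stated assumptions: Plan, Node, buggyDeepCopy
and fixedDeepCopy are hypothetical names made up for this example and are not
Pig's actual classes or the code in the attached patch. It shows how a copy
that is created for a target plan but never registered with it stays
invisible to anything that later walks that plan.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: Plan, Node, buggyDeepCopy and fixedDeepCopy are hypothetical
// names for illustration, not Pig's actual classes or methods.
class DeepCopySketch {

    /** A plan that only knows about nodes explicitly registered with it. */
    static class Plan {
        private final List<Node> nodes = new ArrayList<>();

        void add(Node n) { nodes.add(n); }

        // Node does not override equals(), so this is an identity check.
        boolean contains(Node n) { return nodes.contains(n); }
    }

    /** An expression node that remembers which plan it was created for. */
    static class Node {
        final String name;
        final Plan plan;

        Node(String name, Plan plan) {
            this.name = name;
            this.plan = plan;
        }
    }

    /** Copies the node but forgets to register the copy with the target plan. */
    static Node buggyDeepCopy(Node src, Plan target) {
        return new Node(src.name, target);
    }

    /** Copies the node and registers the copy with the target plan. */
    static Node fixedDeepCopy(Node src, Plan target) {
        Node copy = new Node(src.name, target);
        target.add(copy);
        return copy;
    }

    public static void main(String[] args) {
        Plan original = new Plan();
        Node deref = new Node("Dereference", original);
        original.add(deref);

        Plan merged = new Plan();
        Node orphan = buggyDeepCopy(deref, merged);
        Node registered = fixedDeepCopy(deref, merged);

        System.out.println("buggy copy in plan: " + merged.contains(orphan));
        System.out.println("fixed copy in plan: " + merged.contains(registered));
    }
}
```

In this toy model the orphaned copy points at the merged plan, but the plan
does not list it, so a later pass over the plan (analogous to the schema
regeneration in the stack trace) would never see it.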
