[ https://issues.apache.org/jira/browse/PIG-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1729:
----------------------------

    Attachment: PIG-1729-0.patch

PIG-1729-0.patch is a preliminary fix.

> New logical plan: Dereference does not add into plan after deepCopy
> -------------------------------------------------------------------
>
>                 Key: PIG-1729
>                 URL: https://issues.apache.org/jira/browse/PIG-1729
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.8.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.8.0
>
>         Attachments: PIG-1729-0.patch
>
>
> The following script fails:
> {code}
> a = load '1.txt' as (a0:int, a1:int, a2:int);
> b = load '2.txt' as (b0:int, b1:int);
> c = cogroup a by a0, b by b0;
> d = foreach c generate ((COUNT(a)==0L)?null : a.a0) as d0;
> e = foreach d generate flatten(d0);
> f = group e all;
> explain f;
> {code}
> Error message:
> ERROR 2000: Error processing rule GroupByConstParallelSetter. Try -t GroupByConstParallelSetter
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1067: Unable to explain alias f
>         at org.apache.pig.PigServer.explain(PigServer.java:958)
>         at org.apache.pig.tools.grunt.GruntParser.explainCurrentBatch(GruntParser.java:353)
>         at org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:285)
>         at org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:248)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.Explain(PigScriptParser.java:605)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:327)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
>         at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:90)
>         at org.apache.pig.Main.run(Main.java:498)
>         at org.apache.pig.Main.main(Main.java:107)
> Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2042: Error in new logical plan. Try -Dpig.usenewlogicalplan=false.
>         at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:309)
>         at org.apache.pig.PigServer.compilePp(PigServer.java:1354)
>         at org.apache.pig.PigServer.explain(PigServer.java:927)
>         ... 10 more
> Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2000: Error processing rule GroupByConstParallelSetter. Try -t GroupByConstParallelSetter
>         at org.apache.pig.newplan.optimizer.PlanOptimizer.optimize(PlanOptimizer.java:120)
>         at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:277)
>         ... 12 more
> Caused by: java.lang.NullPointerException
>         at org.apache.pig.newplan.logical.relational.LogicalSchema$LogicalFieldSchema.compatible(LogicalSchema.java:106)
>         at org.apache.pig.newplan.logical.relational.LogicalSchema$LogicalFieldSchema.mergeUid(LogicalSchema.java:116)
>         at org.apache.pig.newplan.logical.expression.ProjectExpression.getFieldSchema(ProjectExpression.java:153)
>         at org.apache.pig.newplan.logical.optimizer.FieldSchemaResetter.execute(SchemaResetter.java:175)
>         at org.apache.pig.newplan.logical.expression.AllSameExpressionVisitor.visit(AllSameExpressionVisitor.java:53)
>         at org.apache.pig.newplan.logical.expression.ProjectExpression.accept(ProjectExpression.java:75)
>         at org.apache.pig.newplan.ReverseDependencyOrderWalker.walk(ReverseDependencyOrderWalker.java:70)
>         at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
>         at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:87)
>         at org.apache.pig.newplan.logical.relational.LOGenerate.accept(LOGenerate.java:225)
>         at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
>         at org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:76)
>         at org.apache.pig.newplan.logical.relational.LOForEach.accept(LOForEach.java:71)
>         at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
>         at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
>         at org.apache.pig.newplan.logical.optimizer.SchemaPatcher.transformed(SchemaPatcher.java:43)
>         at org.apache.pig.newplan.optimizer.PlanOptimizer.optimize(PlanOptimizer.java:112)
>         ... 13 more
> The cause is in the MergeForEach rule: after deepCopy of the expression 
> plan of the second foreach, Pig does not add the Dereference operator to 
> the plan. Disabling any one of column pruning (so there is no extra foreach 
> after the cogroup), MergeForEach, or GroupByConstParallelSetter (so there is 
> no global schema regeneration) suppresses the error. A minor related issue: 
> GroupByConstParallelSetter should not regenerate the schema at all, since 
> the schema does not change after this rule.
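
The failure mode described above (a copied operator that is never registered in the target plan) can be sketched in isolation. The Java below is only an illustration under stated assumptions: Plan, Expr, Project, BuggyDereference and FixedDereference are hypothetical stand-ins, not Pig's classes, and the code is not the content of PIG-1729-0.patch. It shows a deepCopy that builds a new node but omits the add to the plan, next to a variant that registers the copy.

```java
import java.util.ArrayList;
import java.util.List;

// Small driver for the sketch; all class names here are hypothetical.
class DeepCopySketch {
    public static void main(String[] args) {
        Plan buggy = new Plan();
        Expr b = new BuggyDereference(new Project()).deepCopy(buggy);
        System.out.println("buggy copy registered: " + buggy.contains(b));

        Plan fixed = new Plan();
        Expr f = new FixedDereference(new Project()).deepCopy(fixed);
        System.out.println("fixed copy registered: " + fixed.contains(f));
    }
}

// Stand-in for an expression plan: it only tracks which nodes were added.
class Plan {
    private final List<Expr> nodes = new ArrayList<>();

    void add(Expr e) { nodes.add(e); }

    // Identity check: is this exact object registered in the plan?
    boolean contains(Expr e) {
        for (Expr n : nodes) {
            if (n == e) return true;
        }
        return false;
    }

    int size() { return nodes.size(); }
}

abstract class Expr {
    // Copies this node (and its inputs) into the target plan.
    abstract Expr deepCopy(Plan target);
}

// Leaf node whose copy is registered correctly.
class Project extends Expr {
    @Override
    Expr deepCopy(Plan target) {
        Project copy = new Project();
        target.add(copy);
        return copy;
    }
}

// Mirrors the reported bug: the copy is created but never added to the plan.
class BuggyDereference extends Expr {
    final Expr input;
    BuggyDereference(Expr input) { this.input = input; }

    @Override
    Expr deepCopy(Plan target) {
        return new BuggyDereference(input.deepCopy(target));
    }
}

// One possible shape of a fix: register the copy before returning it.
class FixedDereference extends Expr {
    final Expr input;
    FixedDereference(Expr input) { this.input = input; }

    @Override
    Expr deepCopy(Plan target) {
        FixedDereference copy = new FixedDereference(input.deepCopy(target));
        target.add(copy);
        return copy;
    }
}
```

In the buggy variant the copied node is reachable from the caller but absent from the plan, so any later pass that walks the plan (such as a schema regeneration) never visits it. That is consistent with the description above, although the real code path in Pig may differ.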

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
