[
https://issues.apache.org/jira/browse/PIG-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Daniel Dai updated PIG-1729:
----------------------------
Attachment: PIG-1729-0.patch
PIG-1729-0.patch is a preliminary fix.
> New logical plan: Dereference does not add into plan after deepCopy
> -------------------------------------------------------------------
>
> Key: PIG-1729
> URL: https://issues.apache.org/jira/browse/PIG-1729
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.8.0
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.8.0
>
> Attachments: PIG-1729-0.patch
>
>
> The following script fail:
> {code}
> a = load '1.txt' as (a0:int, a1:int, a2:int);
> b = load '2.txt' as (b0:int, b1:int);
> c = cogroup a by a0, b by b0;
> d = foreach c generate ((COUNT(a)==0L)?null : a.a0) as d0;
> e = foreach d generate flatten(d0);
> f = group e all;
> explain f;
> {code}
> Error message:
> ERROR 2000: Error processing rule GroupByConstParallelSetter. Try -t
> GroupByConstParallelSetter
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1067: Unable to
> explain alias f
> at org.apache.pig.PigServer.explain(PigServer.java:958)
> at
> org.apache.pig.tools.grunt.GruntParser.explainCurrentBatch(GruntParser.java:353)
> at
> org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:285)
> at
> org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:248)
> at
> org.apache.pig.tools.pigscript.parser.PigScriptParser.Explain(PigScriptParser.java:605)
> at
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:327)
> at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
> at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:90)
> at org.apache.pig.Main.run(Main.java:498)
> at org.apache.pig.Main.main(Main.java:107)
> Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2042:
> Error in new logical plan. Try -Dpig.usenewlogicalplan=false.
> at
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:309)
> at org.apache.pig.PigServer.compilePp(PigServer.java:1354)
> at org.apache.pig.PigServer.explain(PigServer.java:927)
> ... 10 more
> Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2000:
> Error processing rule GroupByConstParallelSetter. Try -t
> GroupByConstParallelSetter
> at
> org.apache.pig.newplan.optimizer.PlanOptimizer.optimize(PlanOptimizer.java:120)
> at
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:277)
> ... 12 more
> Caused by: java.lang.NullPointerException
> at
> org.apache.pig.newplan.logical.relational.LogicalSchema$LogicalFieldSchema.compatible(LogicalSchema.java:106)
> at
> org.apache.pig.newplan.logical.relational.LogicalSchema$LogicalFieldSchema.mergeUid(LogicalSchema.java:116)
> at
> org.apache.pig.newplan.logical.expression.ProjectExpression.getFieldSchema(ProjectExpression.java:153)
> at
> org.apache.pig.newplan.logical.optimizer.FieldSchemaResetter.execute(SchemaResetter.java:175)
> at
> org.apache.pig.newplan.logical.expression.AllSameExpressionVisitor.visit(AllSameExpressionVisitor.java:53)
> at
> org.apache.pig.newplan.logical.expression.ProjectExpression.accept(ProjectExpression.java:75)
> at
> org.apache.pig.newplan.ReverseDependencyOrderWalker.walk(ReverseDependencyOrderWalker.java:70)
> at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
> at
> org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:87)
> at
> org.apache.pig.newplan.logical.relational.LOGenerate.accept(LOGenerate.java:225)
> at
> org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
> at
> org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:76)
> at
> org.apache.pig.newplan.logical.relational.LOForEach.accept(LOForEach.java:71)
> at
> org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
> at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
> at
> org.apache.pig.newplan.logical.optimizer.SchemaPatcher.transformed(SchemaPatcher.java:43)
> at
> org.apache.pig.newplan.optimizer.PlanOptimizer.optimize(PlanOptimizer.java:112)
> ... 13 more
> The reason is in MergeForEach rule, Pig does not add Dereference operator
> after deepCopy the expression plan of the second foreach. So either disable
> Column pruning (so we do not have extra foreach after cogroup), MergeForEach,
> GroupByConstParallelSetter (so we don't do a global schema regeneration) will
> suppress the error message. One minor issue is GroupByConstParallelSetter
> should not regenerate schema, since schema will not change after this rule.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.