[ 
https://issues.apache.org/jira/browse/PIG-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031767#comment-13031767
 ] 

Thomas Kappler commented on PIG-1683:
-------------------------------------

I found a strange problem that looks like a special case of this issue. 
Apologies if it isn't.

I wanted to use REGEX_EXTRACT in a nested foreach block where I clean up some 
strings. Pig accepts or rejects the block depending on whether the condition 
is written as "is null" or "is not null". The simplest example I could come up 
with that shows the problem is this:

{noformat} 
a = load '1.txt' using PigStorage(',') as (a0:chararray, a1:chararray);
b = foreach a {
    b0 = TRIM(a0);
    b1 = REGEX_EXTRACT(b0, '^\\((.+)\\)$', 1);
    generate ((b1 is null) ? b0 : b1) as cleaned_name; -- FAILS
    -- generate ((b1 is not null) ? b1 : b0) as cleaned_name; -- SUCCEEDS
    -- generate ((b1 is null) ? b0 : b1); -- FAILS
}
store b into 'out';
{noformat}

1.txt is

{noformat}
foo1,bar1
 (foo2),bar2
{noformat}

The "b1 is null" variant fails with the original error message of this issue: 
"Attempt to give operator of type 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject 
multiple outputs. This operator does not support multiple outputs."

The inverted, logically equivalent "b1 is not null" variant succeeds.

If I replace the REGEX_EXTRACT call with a simple expression like "b1 = a0", 
the script works. But the way I read the Pig Latin reference, a REGEX_EXTRACT 
call should be allowed at this point, since it's not a relational operator.
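
A possible workaround, given only as an untested sketch: it assumes the failure 
is specific to inner aliases of a nested foreach, as the issue title suggests, 
and so it computes the same values in separate, non-nested foreach statements. 
The aliases t, e and c are hypothetical names introduced for illustration. The 
error message quoted below also suggests running with 
-Dpig.usenewlogicalplan=false.

{noformat}
a = load '1.txt' using PigStorage(',') as (a0:chararray, a1:chararray);
-- Assumption: plain (non-nested) foreach statements avoid the inner-alias problem.
t = foreach a generate TRIM(a0) as b0;
e = foreach t generate b0, REGEX_EXTRACT(b0, '^\\((.+)\\)$', 1) as b1;
c = foreach e generate ((b1 is null) ? b0 : b1) as cleaned_name;
store c into 'out';
{noformat}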

> New logical plan: Nested foreach plan fails if one inner alias is referred 
> more than once
> ---------------------------------------------------------------------------------------
>
>                 Key: PIG-1683
>                 URL: https://issues.apache.org/jira/browse/PIG-1683
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.8.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.8.0
>
>         Attachments: PIG-1683-1.patch
>
>
> The following script fails:
> {code}
> a = load '1.txt' as (a0, a1, a2);
> b = load '2.txt' as (b0, b1);
> c = join a by a0, b by b0;
> d = foreach c {
>     d0 = a::a0;
>     d1 = a::a1;
>     generate ((d0 is not null)? d0 : d1);
> }
> explain d;
> {code}
> Stack:
> ERROR 2015: Invalid physical operators in the physical plan
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1067: Unable to explain alias d
>         at org.apache.pig.PigServer.explain(PigServer.java:957)
>         at org.apache.pig.tools.grunt.GruntParser.explainCurrentBatch(GruntParser.java:353)
>         at org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:285)
>         at org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:248)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.Explain(PigScriptParser.java:605)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:327)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
>         at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:90)
>         at org.apache.pig.Main.run(Main.java:498)
>         at org.apache.pig.Main.main(Main.java:107)
> Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2042: Error in new logical plan. Try -Dpig.usenewlogicalplan=false.
>         at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:308)
>         at org.apache.pig.PigServer.compilePp(PigServer.java:1350)
>         at org.apache.pig.PigServer.explain(PigServer.java:926)
>         ... 10 more
> Caused by: org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogicalToPhysicalTranslatorException: ERROR 2015: Invalid physical operators in the physical plan
>         at org.apache.pig.newplan.logical.expression.ExpToPhyTranslationVisitor.visit(ExpToPhyTranslationVisitor.java:474)
>         at org.apache.pig.newplan.logical.expression.BinCondExpression.accept(BinCondExpression.java:82)
>         at org.apache.pig.newplan.ReverseDependencyOrderWalker.walk(ReverseDependencyOrderWalker.java:70)
>         at org.apache.pig.newplan.logical.relational.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:519)
>         at org.apache.pig.newplan.logical.relational.LOForEach.accept(LOForEach.java:71)
>         at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
>         at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
>         at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:295)
>         ... 12 more
> Caused by: org.apache.pig.impl.plan.PlanException: ERROR 0: Attempt to give operator of type org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject multiple outputs.  This operator does not support multiple outputs.
>         at org.apache.pig.impl.plan.OperatorPlan.connect(OperatorPlan.java:180)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.plans.PhysicalPlan.connect(PhysicalPlan.java:133)
>         at org.apache.pig.newplan.logical.expression.ExpToPhyTranslationVisitor.visit(ExpToPhyTranslationVisitor.java:470)
>         ... 19 more

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
