DuplicateForEachColumnRewrite makes assumptions about the position of LOGenerate in the plan
---------------------------------------------------------------------------------------------

                 Key: PIG-2119
                 URL: https://issues.apache.org/jira/browse/PIG-2119
             Project: Pig
          Issue Type: Bug
            Reporter: Gianmarco De Francisci Morales


The input:
{code}
grunt> cat b.txt
a       11
b       3
c       10
a       12
b       10
c       15
{code}

The script:
{code}
a = load 'b.txt' AS (id:chararray, num:int);
b = group a by id;
c = foreach b { 
  d = order a by num DESC;
  n = COUNT(a);
  e = limit d 1;
  generate n;
}
{code}

The exception:
{code}
Caused by: java.lang.ClassCastException: org.apache.pig.newplan.logical.relational.LOLimit cannot be cast to org.apache.pig.newplan.logical.relational.LOGenerate
        at org.apache.pig.newplan.logical.rules.DuplicateForEachColumnRewrite$DuplicateForEachColumnRewriteTransformer.check(DuplicateForEachColumnRewrite.java:87)
        at org.apache.pig.newplan.optimizer.PlanOptimizer.optimize(PlanOptimizer.java:108)
{code}

I know the script is a bit pointless, but I was just testing and modifying it 
bit by bit.
If I remove the limit, I still get the same exception, but with LOSort in 
place of LOLimit.

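The problem, I think, is that the rule assumes the nested block has only one 
sink and that this sink is an LOGenerate. Here the unused alias e (or d, once 
the limit is removed) appears to leave a second sink in the nested plan.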
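
To illustrate the suspected failure mode, here is a minimal, self-contained sketch. It does not use Pig's real classes: Op, GenerateOp, LimitOp and SortOp are hypothetical stand-ins for the logical operators (LOGenerate, LOLimit, LOSort), and the order of the sinks in the list is an assumption. It contrasts a blind cast of the first sink with a scan over all sinks for the generate operator.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class SinkCheckSketch {
    // Hypothetical stand-ins for Pig's logical operators; NOT the real classes.
    interface Op {}
    static final class GenerateOp implements Op {} // stands in for LOGenerate
    static final class LimitOp implements Op {}    // stands in for LOLimit
    static final class SortOp implements Op {}     // stands in for LOSort

    // Mirrors the suspected assumption: the first sink is always the generate.
    static GenerateOp firstSinkAsGenerate(List<Op> sinks) {
        // Throws ClassCastException when the first sink is some other operator.
        return (GenerateOp) sinks.get(0);
    }

    // Defensive alternative: scan every sink and return the generate, if any.
    static Optional<GenerateOp> findGenerate(List<Op> sinks) {
        for (Op op : sinks) {
            if (op instanceof GenerateOp) {
                return Optional.of((GenerateOp) op);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed sinks of the nested block: the unused limit, then the generate.
        List<Op> sinks = Arrays.<Op>asList(new LimitOp(), new GenerateOp());

        System.out.println(findGenerate(sinks).isPresent());

        try {
            firstSinkAsGenerate(sinks);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException");
        }
    }
}
```

Whether the actual fix should scan the sinks like this or locate the generate some other way is left open; the sketch only shows why a nested block with an unused alias breaks a single-sink assumption.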




        
