Vivek Padmanabhan created PIG-2721:
--------------------------------------
Summary: Wrong output generated while loading bags as input
Key: PIG-2721
URL: https://issues.apache.org/jira/browse/PIG-2721
Project: Pig
Issue Type: Bug
Affects Versions: 0.9.2, 0.9.0
Reporter: Vivek Padmanabhan
{code}
A = LOAD '/user/pvivek/sample' as
(id:chararray,mybag:bag{tuple(bttype:chararray,cat:long)});
B = foreach A generate id,FLATTEN(mybag) AS (bttype, cat);
C = order B by id;
dump C;
{code}
The script above produces wrong results when executed with Pig 0.10 and Pig 0.9.
Sample input:
{code}
...LKGaHqg-- {(aa,806743)}
..0MI1Y37w-- {(aa,498970)}
..0bnlpJrw-- {(aa,806740)}
..0p0IIhbA-- {(aa,498971),(se,498995)}
..1VkGqvXA-- {(aa,805219)}
{code}
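For reference, the result one would expect from this script can be modeled outside Pig. The sketch below is only an illustration of FLATTEN followed by ORDER semantics, not Pig's implementation; the function name flatten_and_order is hypothetical, and the ids are copied exactly as they appear in the sample input.

```python
# Sketch of the expected FLATTEN + ORDER semantics (not Pig's implementation).
# Rows are copied from the sample input in this report; ids are kept as shown.
rows = [
    ("...LKGaHqg--", [("aa", 806743)]),
    ("..0MI1Y37w--", [("aa", 498970)]),
    ("..0bnlpJrw--", [("aa", 806740)]),
    ("..0p0IIhbA--", [("aa", 498971), ("se", 498995)]),
    ("..1VkGqvXA--", [("aa", 805219)]),
]


def flatten_and_order(rows):
    """Model `B = foreach A generate id, FLATTEN(mybag)` then `C = order B by id`."""
    # FLATTEN on a bag emits one output row per tuple in the bag.
    flattened = [(rid, bttype, cat) for rid, bag in rows for bttype, cat in bag]
    # sorted() is stable, so rows with equal ids keep their bag order here;
    # Pig itself makes no such promise for rows with the same sort key.
    return sorted(flattened, key=lambda row: row[0])


expected = flatten_and_order(rows)
for row in expected:
    print(row)
```

Under these semantics the five input records yield six rows, because the bag for ..0p0IIhbA-- holds two tuples. Output from dump C that differs in row count or field values would point to the problem described here.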
I think the Pig optimizer is causing this issue. The logs show that column $1
(mybag) is pruned for relation A, even though B reads it:
[main] INFO org.apache.pig.newplan.logical.rules.ColumnPruneVisitor - Columns
pruned for A: $1
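To check a longer log for the same symptom, the pruning messages can be pulled out with a small script. This is a minimal sketch: the pattern is based only on the single message quoted above, the helper name pruned_columns is hypothetical, and the comma-separated form for several columns is an assumption. It also assumes each log message sits on one unwrapped line.

```python
import re

# Pattern derived from the one log message quoted in this report.
# Assumption: several pruned columns would be listed comma-separated.
PRUNE_RE = re.compile(r"Columns\s+pruned\s+for\s+(\w+):\s*(.+)")


def pruned_columns(log_lines):
    """Return {alias: [column refs]} for every 'Columns pruned for' message."""
    result = {}
    for line in log_lines:
        match = PRUNE_RE.search(line)
        if match:
            alias = match.group(1)
            cols = [c.strip() for c in match.group(2).split(",") if c.strip()]
            result.setdefault(alias, []).extend(cols)
    return result


log = [
    "[main] INFO org.apache.pig.newplan.logical.rules.ColumnPruneVisitor"
    " - Columns pruned for A: $1",
]
pruned = pruned_columns(log)
print(pruned)
```

Since B flattens mybag, which is $1 of A, an entry for $1 under alias A indicates that a column still in use was pruned.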
One workaround is to disable the ColumnMapKeyPrune optimization rule by running
Pig with -t ColumnMapKeyPrune.