[ https://issues.apache.org/jira/browse/PIG-3782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13914774#comment-13914774 ]
Koji Noguchi commented on PIG-3782:
-----------------------------------
PushDownForEachFlatten is properly copying the UserDefinedSchema from the
original LOGenerate to the new LOGenerate (that is added after the join).
{code:title=PushDownForEachFlatten.java|borderStyle=solid}
if (mUserDefinedSchema!=null)
    gen.setUserDefinedSchema(mUserDefinedSchema);
{code}
I think the problem is instead in LOGenerate.mUserDefinedSchema: its field
schemas never receive an actual UID, because all the updates are made on a copy.
{code:title=LOGenerate.java|borderStyle=solid}
LogicalSchema mUserDefinedSchemaCopy = null;
if (mUserDefinedSchema!=null && mUserDefinedSchema.get(i)!=null) {
    mUserDefinedSchemaCopy = new LogicalSchema();
    for (LogicalSchema.LogicalFieldSchema fs : mUserDefinedSchema.get(i).getFields()) {
        mUserDefinedSchemaCopy.addField(fs.deepCopy());
    }
}
...
for (LogicalFieldSchema fs : mUserDefinedSchemaCopy.getFields()) {
    fs.stampFieldSchema(); //new UID is assigned
{code}
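To illustrate the suspected effect, here is a minimal standalone sketch. It does not use Pig's classes: Field, stampCopies and nextUid are hypothetical stand-ins for LogicalFieldSchema, the copy-then-stamp pattern quoted above, and the UID generator. The -1 "unassigned" value is an assumption modeled on the "uid -1" in the error message.

```java
import java.util.ArrayList;
import java.util.List;

public class UidCopySketch {
    // Hypothetical stand-in for LogicalSchema.LogicalFieldSchema (not Pig's real class).
    static class Field {
        final String alias;
        long uid = -1; // assumption: -1 means "no UID assigned yet"

        Field(String alias) { this.alias = alias; }

        Field deepCopy() {
            Field f = new Field(alias);
            f.uid = uid;
            return f;
        }
    }

    // Hypothetical UID counter, standing in for whatever stampFieldSchema() uses.
    static long nextUid = 1;

    // Mimics the quoted pattern: deep-copy the user-defined fields,
    // then assign UIDs to the copies only.
    static List<Field> stampCopies(List<Field> userDefined) {
        List<Field> copy = new ArrayList<>();
        for (Field f : userDefined) {
            copy.add(f.deepCopy());
        }
        for (Field f : copy) {
            f.uid = nextUid++; // the original field is never touched
        }
        return copy;
    }

    public static void main(String[] args) {
        List<Field> userDefined = new ArrayList<>();
        userDefined.add(new Field("q1"));
        userDefined.add(new Field("q2"));

        List<Field> stamped = stampCopies(userDefined);

        System.out.println("copy uid:     " + stamped.get(0).uid);
        System.out.println("original uid: " + userDefined.get(0).uid);
    }
}
```

In this sketch the copies end up with UIDs while the originals keep -1. If the real mUserDefinedSchema behaves the same way, then handing it to another LOGenerate (as PushDownForEachFlatten does) would pass along fields that were never stamped.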
> PushDownForEachFlatten + ColumnMapKeyPrune with user defined schema failing
> due to incorrect UID assignment
> -----------------------------------------------------------------------------------------------------------
>
> Key: PIG-3782
> URL: https://issues.apache.org/jira/browse/PIG-3782
> Project: Pig
> Issue Type: Bug
> Reporter: Koji Noguchi
> Assignee: Koji Noguchi
>
> {noformat}
> a = load '1.txt' as (a0:int, a1, a2:bag{});
> b = load '2.txt' as (b0:int, b1);
> c = foreach a generate a0, flatten(a2) as (q1, q2);
> d = join c by a0, b by b0;
> e = foreach d generate a0, q1, q2;
> f = foreach e generate a0, (int)q1, (int)q2;
> store f into 'output';
> {noformat}
> This Pig script fails with:
> 2014-02-27 11:49:45,657 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR
> 2229: Couldn't find matching uid -1 for project (Name: Project Type:
> bytearray Uid: 13 Input: 0 Column: 1)