Koji Noguchi created PIG-5452:
---------------------------------

             Summary: Null handling of FLATTEN with user defined schema (as clause)
                 Key: PIG-5452
                 URL: https://issues.apache.org/jira/browse/PIG-5452
             Project: Pig
          Issue Type: Bug
            Reporter: Koji Noguchi
            Assignee: Koji Noguchi

Follow-up from PIG-5201.

{code:java}
A = load 'input' as (a1:chararray);
B = FOREACH A GENERATE a1, null as a2:tuple(A1:chararray, A2:chararray), a1 as a3;
C = FOREACH B GENERATE a1, FLATTEN(a2), a3;
dump C;{code}
This produces the right number of nulls.
{code:java}
(a,,,a)
(b,,,b)
(c,,,c)
(d,,,d)
(f,,,f)
{code}
However,
{code:java}
A = load 'input.txt' as (a1:chararray);
B = FOREACH A GENERATE a1, null as a2:tuple(), a1 as a3;
C = FOREACH B GENERATE a1, FLATTEN(a2) as (A1:chararray, A2:chararray), a3;
dump C;{code}
This produces the wrong number of nulls, and the output is shifted incorrectly: a3 lands in the third column instead of the fourth.
{code:java}
(a,,a,)
(b,,b,)
(c,,c,)
(d,,d,)
(f,,f,)
{code}
The difference is that, in the latter case, a2 in "FLATTEN(a2)" only has a schema of tuple() with empty inner fields. The two-field schema is supplied only by the as clause on the FLATTEN itself.
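The column shift can be illustrated with a small toy model. This is a Python sketch, not Pig code and not taken from Pig's source: the function names `flatten_null_row` and `render` are hypothetical, and the rule it applies (a null tuple expands to one null per inner field known on a2's own schema, with a minimum of one, and the row is then padded to the declared output width) is only an assumption chosen to be consistent with the two outputs reported above.

```python
def flatten_null_row(a1, a3, inner_width, output_width=4):
    """Toy model of FLATTEN on a null tuple (assumed behavior, not Pig's code).

    inner_width  -- number of inner fields known from a2's own schema
                    (2 for tuple(A1, A2), 0 for tuple()).
    output_width -- number of columns in the declared output schema.
    """
    # Assumption: a null tuple expands to at least one null column.
    nulls = [None] * max(1, inner_width)
    row = [a1, *nulls, a3]
    # Assumption: a short row is padded with trailing nulls.
    row += [None] * (output_width - len(row))
    return tuple(row)


def render(row):
    """Format a row the way dump prints it, with nulls as empty strings."""
    return "(" + ",".join("" if v is None else str(v) for v in row) + ")"


print(render(flatten_null_row("a", "a", inner_width=2)))  # (a,,,a)
print(render(flatten_null_row("a", "a", inner_width=0)))  # (a,,a,)
```

Under this model, the expected fix would be for the second case to take the width from the as clause (two fields) rather than from the empty tuple() schema, so that both scripts print `(a,,,a)`.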