Koji Noguchi created PIG-5452:
---------------------------------
Summary: Null handling of FLATTEN with user defined schema (as clause)
Key: PIG-5452
URL: https://issues.apache.org/jira/browse/PIG-5452
Project: Pig
Issue Type: Bug
Reporter: Koji Noguchi
Assignee: Koji Noguchi
Follow-up to PIG-5201:
{code:java}
A = load 'input' as (a1:chararray);
B = FOREACH A GENERATE a1, null as a2:tuple(A1:chararray, A2:chararray), a1 as a3;
C = FOREACH B GENERATE a1, FLATTEN(a2), a3;
dump C;{code}
This produces the right number of nulls.
{code:java}
(a,,,a)
(b,,,b)
(c,,,c)
(d,,,d)
(f,,,f) {code}
However,
{code:java}
A = load 'input.txt' as (a1:chararray);
B = FOREACH A GENERATE a1, null as a2:tuple(), a1 as a3;
C = FOREACH B GENERATE a1, FLATTEN(a2) as (A1:chararray, A2:chararray), a3;
dump C;{code}
This produces the wrong number of nulls, and the output is shifted incorrectly: the a3 value lands in the A2 column and the last column is null.
{code:java}
(a,,a,)
(b,,b,)
(c,,c,)
(d,,d,)
(f,,f,) {code}
The difference is that, in the latter case, a2 in "FLATTEN(a2)" only has the schema
tuple(), with no inner fields.
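One way to check this schema difference is to run DESCRIBE on the intermediate relation in each case. The snippet below is only a sketch for inspection: the aliases B1 and B2 are hypothetical names introduced here so that both variants can sit in one script, and no expected output is claimed.
{code:java}
A = load 'input.txt' as (a1:chararray);
-- Case 1: inner fields are declared on the tuple itself.
B1 = FOREACH A GENERATE a1, null as a2:tuple(A1:chararray, A2:chararray), a1 as a3;
DESCRIBE B1;
-- Case 2: empty tuple schema; the field names only appear later,
-- in the "as" clause attached to FLATTEN(a2).
B2 = FOREACH A GENERATE a1, null as a2:tuple(), a1 as a3;
DESCRIBE B2;{code}
Comparing the a2 entry in the two DESCRIBE results should show whether the inner field count is known before FLATTEN is applied.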