Koji Noguchi created PIG-5452:
---------------------------------

             Summary: Null handling of FLATTEN with user defined schema (as clause)
                 Key: PIG-5452
                 URL: https://issues.apache.org/jira/browse/PIG-5452
             Project: Pig
          Issue Type: Bug
            Reporter: Koji Noguchi
            Assignee: Koji Noguchi


Follow-up from PIG-5201:
{code:java}
A = load 'input' as (a1:chararray);
B = FOREACH A GENERATE a1, null as a2:tuple(A1:chararray, A2:chararray), a1 as a3;
C = FOREACH B GENERATE a1, FLATTEN(a2), a3;
dump C;{code}
This produces the right number of nulls.


{code:java}
(a,,,a)
(b,,,b)
(c,,,c)
(d,,,d)
(f,,,f) {code}
 

However, 
{code:java}
A = load 'input.txt' as (a1:chararray);
B = FOREACH A GENERATE a1, null as a2:tuple(), a1 as a3;
C = FOREACH B GENERATE a1, FLATTEN(a2) as (A1:chararray, A2:chararray), a3;
dump C;{code}
This produces the wrong number of nulls, and the output is shifted incorrectly.
{code:java}
(a,,a,)
(b,,b,)
(c,,c,)
(d,,d,)
(f,,f,) {code}
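The difference is that in the latter case, a2 in "FLATTEN(a2)" only has a schema of
tuple() with no inner fields, so the two fields declared in the as clause are
not reflected in the number of nulls produced.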
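
The expected and observed rows can be reproduced with a toy model in Python. This is only a sketch, not Pig's implementation: flatten_tuple, foreach_generate, pad and render are hypothetical helpers. The idea that a single null is emitted and the row is then padded to the schema width is an assumption, made only to reproduce the output shown above.

```python
def flatten_tuple(value, declared_width):
    """Expand a tuple field into `declared_width` columns.

    A null (None) tuple becomes `declared_width` nulls so that the
    columns after it keep their positions.
    """
    if value is None:
        return [None] * declared_width
    return list(value)


def foreach_generate(a1, a2, a3, declared_width):
    # Models: FOREACH B GENERATE a1, FLATTEN(a2), a3;
    return [a1] + flatten_tuple(a2, declared_width) + [a3]


def pad(row, schema_width):
    # Assumption: a short row is filled with trailing nulls up to the
    # width of the output schema.
    return row + [None] * (schema_width - len(row))


def render(row):
    # Mimic the dump format, where a null prints as an empty string.
    return "(" + ",".join("" if v is None else str(v) for v in row) + ")"


# Width 2 comes from the declared fields (A1, A2): columns line up.
expected = render(pad(foreach_generate("a", None, "a", declared_width=2), 4))

# Width 1 models emitting only one null for the null tuple: a3 shifts
# left by one column and the last column is left null.
shifted = render(pad(foreach_generate("a", None, "a", declared_width=1), 4))

print(expected)
print(shifted)
```

In this model the only difference between the two cases is the width used when a2 is null, which is why taking the width from the as clause would keep a3 in the last column.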

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
