Output from flatten with a null tuple input generating data inconsistent with 
the schema
----------------------------------------------------------------------------------------

                 Key: PIG-2537
                 URL: https://issues.apache.org/jira/browse/PIG-2537
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.9.0, 0.8.0
            Reporter: Xuefu Zhang
            Assignee: Alan Gates


For the following Pig script,

grunt> A = load 'file' as ( a : tuple( x, y, z ), b, c );
grunt> B = foreach A generate flatten( $0 ), b, c;
grunt> describe B;
B: {a::x: bytearray,a::y: bytearray,a::z: bytearray,b: bytearray,c: bytearray}

Alias B has a clear schema.

However, on the backend, if $0 is null for a given row, the output tuple 
becomes something like 
(null, b_value, c_value), which is inconsistent with the schema: it has three 
fields where the schema declares five. The behaviour was confirmed by 
inspecting the Pig code. 

This inconsistency corrupts data because the fields shift position: b_value 
lands in the a::y slot and c_value in the a::z slot. The expected output row 
is something like
(null, null, null, b_value, c_value).
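The position shift can be illustrated with a small standalone simulation. This is only a sketch in Python, not Pig's actual implementation: the function names flatten_observed and flatten_expected and the ARITY constant are hypothetical, and the "observed" variant simply models the behaviour described in this report (a null tuple contributing a single null field).

```python
from typing import Any, Optional, Tuple

# Assumption: the tuple `a` has three fields (x, y, z), as in the schema above.
ARITY = 3

Row = Tuple[Optional[Tuple[Any, ...]], Any, Any]


def flatten_observed(row: Row) -> Tuple[Any, ...]:
    """Model of the reported behaviour: a null tuple yields one null field."""
    a, b, c = row
    fields = [None] if a is None else list(a)
    return tuple(fields + [b, c])


def flatten_expected(row: Row, arity: int = ARITY) -> Tuple[Any, ...]:
    """Model of the expected behaviour: a null tuple is padded to the schema arity."""
    a, b, c = row
    fields = [None] * arity if a is None else list(a)
    return tuple(fields + [b, c])


if __name__ == "__main__":
    rows = [
        (("x1", "y1", "z1"), "b1", "c1"),  # non-null tuple: both variants agree
        (None, "b2", "c2"),                # null tuple: the variants diverge
    ]
    for row in rows:
        print("observed:", flatten_observed(row))
        print("expected:", flatten_expected(row))
```

In this model the two variants agree whenever the tuple is non-null; only the null row comes out with fewer fields than the schema declares, which is what moves b and c into the wrong columns.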


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
