Output from flatten with a null tuple input generating data inconsistent with
the schema
----------------------------------------------------------------------------------------
Key: PIG-2537
URL: https://issues.apache.org/jira/browse/PIG-2537
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.9.0, 0.8.0
Reporter: Xuefu Zhang
Assignee: Alan Gates
For the following Pig script:
grunt> A = load 'file' as ( a : tuple( x, y, z ), b, c );
grunt> B = foreach A generate flatten( $0 ), b, c;
grunt> describe B;
B: {a::x: bytearray,a::y: bytearray,a::z: bytearray,b: bytearray,c: bytearray}
Alias B has a clear schema.
However, on the backend, if $0 happens to be null for a row, the output tuple
becomes something like
(null, b_value, c_value), which is obviously inconsistent with the schema. This
behaviour was confirmed by inspecting the Pig code.
This inconsistency corrupts data because of the position shift: b_value and
c_value end up in the positions the schema assigns to a::y and a::z. The
expected output row is something like
(null, null, null, b_value, c_value).
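
The position shift described above can be illustrated with a small Python
model. This is only a sketch of the reported behaviour, not Pig's actual
implementation: the function names flatten_observed and flatten_expected, the
arity parameter, and the sample rows are hypothetical and chosen for
illustration.

```python
from typing import Any, Optional, Tuple

# Field names taken from the output of "describe B" in the report.
SCHEMA = ["a::x", "a::y", "a::z", "b", "c"]

Row = Tuple[Optional[Tuple[Any, ...]], Any, Any]


def flatten_observed(row: Row) -> Tuple[Any, ...]:
    """Model of the reported behaviour: a null tuple yields a single null."""
    a, b, c = row
    head = [None] if a is None else list(a)
    return tuple(head + [b, c])


def flatten_expected(row: Row, arity: int = 3) -> Tuple[Any, ...]:
    """Model of the expected behaviour: a null tuple yields one null per field."""
    a, b, c = row
    head = [None] * arity if a is None else list(a)
    return tuple(head + [b, c])


if __name__ == "__main__":
    rows = [((1, 2, 3), "b_value", "c_value"), (None, "b_value", "c_value")]
    for row in rows:
        observed = flatten_observed(row)
        expected = flatten_expected(row)
        # Pair each value with the schema field at the same position.
        print(dict(zip(SCHEMA, observed)), "vs", dict(zip(SCHEMA, expected)))
```

In this model, the row with a null tuple produces three fields instead of
five under the observed behaviour, so b_value is read as a::y and c_value as
a::z when the row is interpreted against the schema.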