[ https://issues.apache.org/jira/browse/PIG-2537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai updated PIG-2537:
----------------------------

    Attachment: PIG-2537-1.patch

> Output from flatten with a null tuple input generating data inconsistent with the schema
> ----------------------------------------------------------------------------------------
>
>                 Key: PIG-2537
>                 URL: https://issues.apache.org/jira/browse/PIG-2537
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.8.0, 0.9.0
>            Reporter: Xuefu Zhang
>            Assignee: Daniel Dai
>             Fix For: 0.10
>
>         Attachments: PIG-2537-1.patch
>
>
> For the following pig script,
> grunt> A = load 'file' as ( a : tuple( x, y, z ), b, c );
> grunt> B = foreach A generate flatten( $0 ), b, c;
> grunt> describe B;
> B: {a::x: bytearray,a::y: bytearray,a::z: bytearray,b: bytearray,c: bytearray}
> Alias B has a clear schema.
> However, on the backend, if $0 happens to be null for a row, the output tuple becomes something like
> (null, b_value, c_value), which is obviously inconsistent with the schema.
> The behaviour is confirmed by inspecting the pig code.
> This inconsistency corrupts data because of position shifts. The expected output row is something like
> (null, null, null, b_value, c_value).
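The position shift described in the report can be illustrated with a small Python model. This is only a sketch of the semantics, not Pig's implementation and not the contents of PIG-2537-1.patch; the function names flatten_current and flatten_expected and the arity parameter are hypothetical, chosen for illustration.

```python
def flatten_current(value, rest):
    """Model of the reported behaviour: a null tuple flattens to ONE null."""
    if value is None:
        # The declared arity of the tuple is ignored, so later fields shift left.
        return (None, *rest)
    return (*value, *rest)


def flatten_expected(value, arity, rest):
    """Model of the expected behaviour: a null tuple is padded to its schema arity."""
    if value is None:
        # One null per field declared in the schema (hypothetical 'arity' argument).
        return (None,) * arity + tuple(rest)
    return (*value, *rest)


# Rows shaped like ( a : tuple( x, y, z ), b, c ); the second row has a null tuple.
rows = [
    (("x1", "y1", "z1"), "b1", "c1"),
    (None, "b2", "c2"),
]

for a, b, c in rows:
    print("current: ", flatten_current(a, (b, c)))
    print("expected:", flatten_expected(a, 3, (b, c)))
```

In this model the null row comes out of flatten_current with three fields instead of five, so b_value lands in the position the schema assigns to a::x. Padding to the declared arity keeps b and c in their schema positions.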