[ 
https://issues.apache.org/jira/browse/PIG-2537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-2537:
----------------------------

    Fix Version/s:     (was: 0.10)
                   0.11

Discussed with Thejas and Alan: we may need more research to find the best 
way to solve this problem. Unlinking from 0.10.
                
> Output from flatten with a null tuple input generating data inconsistent with 
> the schema
> ----------------------------------------------------------------------------------------
>
>                 Key: PIG-2537
>                 URL: https://issues.apache.org/jira/browse/PIG-2537
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.8.0, 0.9.0
>            Reporter: Xuefu Zhang
>            Assignee: Daniel Dai
>             Fix For: 0.11
>
>         Attachments: PIG-2537-1.patch, PIG-2537-2.patch, PIG-2537-3.patch
>
>
> For the following pig script,
> grunt> A = load 'file' as ( a : tuple( x, y, z ), b, c );
> grunt> B = foreach A generate flatten( $0 ), b, c;
> grunt> describe B;
> B: {a::x: bytearray,a::y: bytearray,a::z: bytearray,b: bytearray,c: bytearray}
> Alias B has a clear schema.
> However, on the backend, if $0 happens to be null for a row, the output 
> tuple becomes something like 
> (null, b_value, c_value), which has three fields instead of the five that 
> the schema declares. 
> The behaviour is confirmed by inspection of the Pig code. 
> This inconsistency corrupts data because of position shifts: b_value and 
> c_value land in the a::y and a::z positions. The expected output row should 
> be something like
> (null, null, null, b_value, c_value).
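
The position shift described in the report can be modelled with a small Python sketch. This is only an illustration of the two behaviours as the description states them. The helper names flatten_reported and flatten_expected are hypothetical and are not taken from the Pig code base or from any of the attached patches.

```python
# Illustrative model only: these helpers are hypothetical, not Pig's implementation.

def flatten_reported(row):
    """Flatten field 0 as the report describes: a null tuple yields a single null."""
    first, rest = row[0], row[1:]
    if first is None:
        return (None,) + rest
    return tuple(first) + rest


def flatten_expected(row, arity):
    """Flatten field 0, padding a null tuple to the arity declared in the schema."""
    first, rest = row[0], row[1:]
    if first is None:
        return (None,) * arity + rest
    return tuple(first) + rest


# Field names as printed by "describe B" in the report.
SCHEMA = ("a::x", "a::y", "a::z", "b", "c")

null_row = (None, "b_value", "c_value")

reported = flatten_reported(null_row)           # three fields: b and c shift left
expected = flatten_expected(null_row, arity=3)  # five fields, aligned with SCHEMA

print(dict(zip(SCHEMA, reported)))
print(dict(zip(SCHEMA, expected)))
```

In this model, pairing the reported row with the schema puts b_value under a::y and c_value under a::z, and leaves b and c without values. The padded row lines up with all five field names.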


        
