[ https://issues.apache.org/jira/browse/PIG-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896637#action_12896637 ]
Thejas M Nair commented on PIG-1525:
------------------------------------

+1

> Incorrect data generated by diff of SUM
> ---------------------------------------
>
>                 Key: PIG-1525
>                 URL: https://issues.apache.org/jira/browse/PIG-1525
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Richard Ding
>            Assignee: Richard Ding
>             Fix For: 0.8.0
>
>         Attachments: PIG-1525.patch, PIG-1525_1.patch
>
>
> Given data;
> input1:
> {code}
> id9 0
> {code}
> input2:
> {code}
> id8 1
> id9 1
> {code}
> Pig script
> {code}
> A = LOAD 'input1' AS (id:chararray, val:long);
> B = LOAD 'input2' AS (id:chararray, val:long);
> C = COGROUP A BY id, B BY id;
> D = FOREACH C GENERATE group, SUM(B.val), SUM(A.val), (SUM(A.val) - SUM(B.val));
> dump D;
> {code}
> generates incorrect data:
> {code}
> (id8,1L,,)
> (id9,1L,0L,-2L)
> {code}
> The workaround is to replace the FOREACH statement with
> {code}
> D = FOREACH C GENERATE group, SUM(B.val) as b, SUM(A.val) as a;
> E = FOREACH D GENERATE $0, b, a, (a-b);
> {code}
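
For reference, the arithmetic the script in the issue is expected to perform can be sketched outside Pig. The Python below is only an illustrative model, not Pig code and not part of the patch: the helper names (pig_sum, diff, cogroup_sum_diff) are hypothetical, and it assumes that SUM over an empty bag yields null and that null propagates through subtraction, which matches the (id8,1L,,) row in the report. Under those assumptions the id9 row should end in -1, not -2.

```python
from collections import defaultdict


def pig_sum(values):
    """Model of SUM over a bag: None (null) for an empty bag (assumption)."""
    return sum(values) if values else None


def diff(a, b):
    """Subtraction in which a null operand yields null (assumption)."""
    return None if a is None or b is None else a - b


def cogroup_sum_diff(input1, input2):
    """Model of: C = COGROUP A BY id, B BY id;
    D = FOREACH C GENERATE group, SUM(B.val), SUM(A.val), SUM(A.val) - SUM(B.val);
    """
    bags = defaultdict(lambda: ([], []))  # key -> (values from A, values from B)
    for key, val in input1:
        bags[key][0].append(val)
    for key, val in input2:
        bags[key][1].append(val)

    rows = []
    for key in sorted(bags):
        a_vals, b_vals = bags[key]
        sum_a, sum_b = pig_sum(a_vals), pig_sum(b_vals)
        rows.append((key, sum_b, sum_a, diff(sum_a, sum_b)))
    return rows


# Data from the issue: input1 has one row, input2 has two.
rows = cogroup_sum_diff([("id9", 0)], [("id8", 1), ("id9", 1)])
for row in rows:
    print(row)
```

The same result is what the workaround in the issue should produce, since it computes each SUM once and subtracts the aliased fields.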