Incorrect data generated by diff of SUM
---------------------------------------
Key: PIG-1525
URL: https://issues.apache.org/jira/browse/PIG-1525
Project: Pig
Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Richard Ding
Assignee: Richard Ding
Fix For: 0.8.0

Given the data:

input1:
{code}
id9 0
{code}

input2:
{code}
id8 1
id9 1
{code}

the Pig script
{code}
A = LOAD 'input1' AS (id:chararray, val:long);
B = LOAD 'input2' AS (id:chararray, val:long);
C = COGROUP A BY id, B BY id;
D = FOREACH C GENERATE group, SUM(B.val), SUM(A.val), (SUM(A.val) - SUM(B.val));
dump D;
{code}

generates incorrect data:
{code}
(id8,1L,,)
(id9,1L,0L,-2L)
{code}

The workaround is to replace the FOREACH statement with:
{code}
D = FOREACH C GENERATE group, SUM(B.val) as b, SUM(A.val) as a;
E = FOREACH D GENERATE $0, b, a, (a-b);
{code}
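
As a cross-check on what the script should produce, here is a minimal Python sketch that models the COGROUP, SUM, and subtraction steps for the sample data. It is not Pig's implementation, and the helper names (pig_sum, minus, expected_rows) are hypothetical. It assumes SUM yields null for an empty bag and that arithmetic on a null operand yields null, which is consistent with the blank fields in the reported id8 row. Under those assumptions the last field for id9 comes out as -1 (0 - 1) rather than the reported -2.

```python
from collections import defaultdict

# Sample rows copied from the report: (id, val)
input1 = [("id9", 0)]
input2 = [("id8", 1), ("id9", 1)]


def pig_sum(values):
    """Assumed model of SUM: None (null) for an empty bag, else the total of non-null values."""
    non_null = [v for v in values if v is not None]
    return sum(non_null) if non_null else None


def minus(a, b):
    """Assumed model of subtraction: the result is null if either operand is null."""
    return None if a is None or b is None else a - b


def expected_rows(a_rows, b_rows):
    # COGROUP A BY id, B BY id: one bag per relation for every key seen in either input.
    bags = defaultdict(lambda: ([], []))
    for key, val in a_rows:
        bags[key][0].append(val)
    for key, val in b_rows:
        bags[key][1].append(val)

    rows = []
    for key in sorted(bags):
        a_vals, b_vals = bags[key]
        sum_a, sum_b = pig_sum(a_vals), pig_sum(b_vals)
        # Same field order as the FOREACH in the report: group, SUM(B), SUM(A), SUM(A) - SUM(B)
        rows.append((key, sum_b, sum_a, minus(sum_a, sum_b)))
    return rows


if __name__ == "__main__":
    for row in expected_rows(input1, input2):
        print(row)
```

For the two inputs above this prints ('id8', 1, None, None) and ('id9', 1, 0, -1), so only the difference field of the id9 row disagrees with the reported output.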