[ https://issues.apache.org/jira/browse/PIG-2067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032723#comment-13032723 ]
Daniel Dai commented on PIG-2067:
---------------------------------

The formatting got messed up. Here is the logical plan:
{code}
E: (Name: LOStore Schema: group#29:bytearray,A#30:bag{#240:tuple(cookie#10:bytearray)},B#32:bag{#241:tuple(cookie#11:bytearray)})
|
|---E: (Name: LOFilter Schema: group#29:bytearray,A#30:bag{#240:tuple(cookie#10:bytearray)},B#32:bag{#241:tuple(cookie#11:bytearray)})
    |   |
    |   (Name: And Type: boolean Uid: 41)
    |   |
    |   |---(Name: GreaterThan Type: boolean Uid: 37)
    |   |   |
    |   |   |---(Name: UserFunc(org.apache.pig.builtin.COUNT) Type: long Uid: 34)
    |   |   |   |
    |   |   |   |---B:(Name: Project Type: bag Uid: 32 Input: 0 Column: 2)
    |   |   |
    |   |   |---(Name: Cast Type: long Uid: 35)
    |   |       |
    |   |       |---(Name: Constant Type: int Uid: 35)
    |   |
    |   |---(Name: GreaterThan Type: boolean Uid: 40)
    |       |
    |       |---(Name: Cast Type: int Uid: 29)
    |       |   |
    |       |   |---group:(Name: Project Type: bytearray Uid: 29 Input: 0 Column: 0)
    |       |
    |       |---(Name: Constant Type: int Uid: 39)
    |
    |---C: (Name: LOCogroup Schema: group#29:bytearray,A#30:bag{#240:tuple(cookie#10:bytearray)},B#32:bag{#241:tuple(cookie#11:bytearray)})
        |   |
        |   cookie:(Name: Project Type: bytearray Uid: 10 Input: 0 Column: 0)
        |   |
        |   cookie:(Name: Project Type: bytearray Uid: 11 Input: 1 Column: 0)
        |
        |---A: (Name: LOLoad Schema: cookie#10:bytearray)RequiredFields:null
        |
        |---B: (Name: LOLoad Schema: cookie#11:bytearray)RequiredFields:null
{code}
One branch of GreaterThan is on group rather than A.
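To illustrate why a GreaterThan branch on group instead of A matters, here is a rough Python sketch (not Pig code) that mimics the cogroup and compares two filters. It rests on assumptions not stated in the report: that each data file holds one cookie value per row (1..7 for a.dat, 3..8 for b.dat), and that the intended predicate tests COUNT(A) > 0. The helper name cogroup and the variables are hypothetical.

```python
from collections import defaultdict

def cogroup(a_rows, b_rows):
    """Mimic COGROUP A BY cookie, B BY cookie: one (group, bag_a, bag_b) row per key."""
    bags = defaultdict(lambda: ([], []))
    for cookie in a_rows:
        bags[cookie][0].append((cookie,))
    for cookie in b_rows:
        bags[cookie][1].append((cookie,))
    return [(key, bags[key][0], bags[key][1]) for key in sorted(bags)]

# Assumed contents of a.dat and b.dat (one cookie per row).
A = [1, 2, 3, 4, 5, 6, 7]
B = [3, 4, 5, 6, 7, 8]
C = cogroup(A, B)

# Assumed intended predicate: both bags are non-empty.
expected = [row for row in C if len(row[1]) > 0 and len(row[2]) > 0]

# Predicate as it appears in the plan: one branch tests the group key instead of bag A.
observed = [row for row in C if row[0] > 0 and len(row[2]) > 0]

def keys(rows):
    """Return just the group keys, for compact comparison."""
    return [row[0] for row in rows]

if __name__ == "__main__":
    print("expected:", keys(expected))
    print("observed:", keys(observed))
```

Under these assumptions the second filter lets the row for key 8 through even though its A bag is empty, which matches the extra (8,{},{(8)}) tuple in the reported output.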
> FilterLogicExpressionSimplifier mess up uid in some cases
> ---------------------------------------------------------
>
>                 Key: PIG-2067
>                 URL: https://issues.apache.org/jira/browse/PIG-2067
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.8.1, 0.9.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.8.1, 0.9.0
>
>
> The following script produces the wrong result:
> {code}
> A = load 'a.dat' as (cookie);
> B = load 'b.dat' as (cookie);
> C = cogroup A by cookie, B by cookie;
> E = filter C by COUNT(B)>0 AND group>0;
> explain E;
> {code}
> a.dat:
> 1 1
> 2 2
> 3 3
> 4 4
> 5 5
> 6 6
> 7 7
> b.dat:
> 3 3
> 4 4
> 5 5
> 6 6
> 7 7
> 8 8
> Expected output:
> (3,{(3)},{(3)})
> (4,{(4)},{(4)})
> (5,{(5)},{(5)})
> (6,{(6)},{(6)})
> (7,{(7)},{(7)})
> We get:
> (3,{(3)},{(3)})
> (4,{(4)},{(4)})
> (5,{(5)},{(5)})
> (6,{(6)},{(6)})
> (7,{(7)},{(7)})
> (8,{},{(8)})

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira