[ 
https://issues.apache.org/jira/browse/PIG-2067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032774#comment-13032774
 ] 

Daniel Dai commented on PIG-2067:
---------------------------------

This issue happens when:
1. The filter plan contains an AND
2. Both branches of the AND call the same UDF, but with different inputs

LogicExpressionSimplifier then erroneously treats the two branches as 
identical and merges them.
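
To illustrate the failure mode, here is a minimal Python sketch. It is not Pig's actual code: the class and function names (UdfGreaterThanZero, same_branch_buggy, same_branch_fixed, simplify_and, run_filter) are hypothetical, and same_branch_fixed only shows the kind of comparison that avoids the problem, not necessarily what the attached patch does. The sketch assumes the simplifier applies the rule "p AND p => p" and that the faulty equality check looks at the UDF but not at its input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UdfGreaterThanZero:
    """Hypothetical stand-in for a filter branch such as COUNT(B) > 0."""
    udf: str        # UDF name, e.g. "COUNT"
    input_bag: str  # cogrouped bag the UDF reads, "A" or "B"


def same_branch_buggy(x, y):
    # Assumed faulty check: compares only the UDF, ignoring its input.
    return x.udf == y.udf


def same_branch_fixed(x, y):
    # Illustrative check: the UDF and its input must both match.
    return x.udf == y.udf and x.input_bag == y.input_bag


def simplify_and(left, right, same_branch):
    """Apply `p AND p => p`; return the branches that remain ANDed."""
    return [left] if same_branch(left, right) else [left, right]


def run_filter(branches, rows):
    """Keep the keys for which every remaining branch is true."""
    return sorted(
        key for key, bags in rows.items()
        if all(len(bags[b.input_bag]) > 0 for b in branches)
    )


# Same keys as a.dat and b.dat in the issue description.
a = [1, 2, 3, 4, 5, 6, 7]
b = [3, 4, 5, 6, 7, 8]

# C = cogroup A by cookie, B by cookie;
rows = {
    key: {"A": [x for x in a if x == key], "B": [x for x in b if x == key]}
    for key in set(a) | set(b)
}

# E = filter C by COUNT(B)>0 AND COUNT(A)>0;
left = UdfGreaterThanZero("COUNT", "B")
right = UdfGreaterThanZero("COUNT", "A")

buggy_plan = simplify_and(left, right, same_branch_buggy)
fixed_plan = simplify_and(left, right, same_branch_fixed)

print(run_filter(buggy_plan, rows))  # [3, 4, 5, 6, 7, 8]
print(run_filter(fixed_plan, rows))  # [3, 4, 5, 6, 7]
```

With the faulty check the plan collapses to COUNT(B)>0 alone, so key 8, which has an empty bag for A, passes the filter. That matches the extra row (8,{},{(8)}) in the issue description.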

> FilterLogicExpressionSimplifier removed some branches in some cases
> -------------------------------------------------------------------
>
>                 Key: PIG-2067
>                 URL: https://issues.apache.org/jira/browse/PIG-2067
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.8.1, 0.9.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.8.1, 0.9.0
>
>         Attachments: PIG-2067-1.patch
>
>
> The following script produces the wrong result:
> {code}
> A = load 'a.dat' as (cookie);
> B = load 'b.dat' as (cookie);
> C = cogroup A by cookie, B by cookie;
> E = filter C by COUNT(B)>0 AND COUNT(A)>0;
> explain E;
> {code}
> a.dat:
> 1       1
> 2       2
> 3       3
> 4       4
> 5       5
> 6       6
> 7       7
> b.dat:
> 3       3
> 4       4
> 5       5
> 6       6
> 7       7
> 8       8
> Expected output:
> (3,{(3)},{(3)})
> (4,{(4)},{(4)})
> (5,{(5)},{(5)})
> (6,{(6)},{(6)})
> (7,{(7)},{(7)})
> We get:
> (3,{(3)},{(3)})
> (4,{(4)},{(4)})
> (5,{(5)},{(5)})
> (6,{(6)},{(6)})
> (7,{(7)},{(7)})
> (8,{},{(8)})

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
