[ https://issues.apache.org/jira/browse/PIG-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thejas M Nair updated PIG-2316:
-------------------------------

    Attachment: pig-2316-trunk-v2.txt

{code}
Applying pig-2316-trunk-v1.txt triggers another bug. For the following filter clause, note that the filter plan in the MR plan is incomplete.

B = FILTER A BY ((col1==1) OR (col1 != 2));

Filter in MR plan -
B: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-11
|
|---B: Filter[bag] - scope-7
    |   |
    |   Not Equal To[boolean] - scope-10
    |   |
    |   |---Project[int][0] - scope-8
    |   |
    |   |---Constant(2) - scope-9
    |
    |---A: New For Each(false,false)[bag] - scope-6
{code}

pig-2316-trunk-v2.txt has the fix for this issue.

> Incorrect results for FILTER *** BY ( *** OR ***) with FilterLogicExpressionSimplifier optimizer turned on
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2316
>                 URL: https://issues.apache.org/jira/browse/PIG-2316
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.0, 0.8.1, 0.9.0, 0.9.1
>            Reporter: Huanyu Zhao
>            Priority: Critical
>             Fix For: 0.8.1, 0.9.2
>
>         Attachments: pig-2316-trunk-v1.txt, pig-2316-trunk-v2.txt
>
>
> An example for this bug:
> cat weird.txt
> 1,a
> 2,b
> 3,c
> When running pig with the following statements:
> A = LOAD 'weird.txt' using PigStorage(',') AS (col1:int,col2);
> B = FILTER A BY ((col1==1) OR (col1 != 1));
> DUMP B;
> I expect to get all three rows back, but I receive only two rows.
> (2,b)
> (3,c)
> When we start pig with the optimizer turned off:
> pig -optimizer_off All
> With the optimizer turned off, we get the expected results: three rows for the same statements.
> (1,a)
> (2,b)
> (3,c)
> --------------------------------------------------------
> This bug was tested on:
> pig-0.9.1,
> pig-0.9.0,
> pig-0.8.1,
> pig-0.8.0
> All produced the same incorrect results.
> --------------------------------------------------------
> When we looked at the logical plan for this example, we found that the FilterLogicExpressionSimplifier optimizer produced an incorrect logical plan, so we suspect that this optimizer is the cause of the bug.
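
For readers who want to check the expected result without a Pig installation, the filter from the report can be modelled in a few lines of Python. This is only a sketch of the predicate logic on the three rows of weird.txt, not Pig code and not a trace of the optimizer. The function names are hypothetical, and `truncated_predicate` rests on an assumption: that the optimized plan kept only the `col1 != 1` branch of the OR, which would match the two rows the reporter received.

```python
# Rows from weird.txt in the report, parsed as (col1:int, col2).
ROWS = [(1, "a"), (2, "b"), (3, "c")]


def full_predicate(col1):
    """The filter as written in the report: (col1 == 1) OR (col1 != 1)."""
    return (col1 == 1) or (col1 != 1)


def truncated_predicate(col1):
    """Assumption for illustration: only the second OR branch survives."""
    return col1 != 1


def run_filter(rows, predicate):
    """Keep the rows whose first column satisfies the predicate."""
    return [row for row in rows if predicate(row[0])]


expected = run_filter(ROWS, full_predicate)
observed = run_filter(ROWS, truncated_predicate)

print("expected:", expected)
print("observed:", observed)
```

For non-null integers the full predicate is true for every row, so any plan that returns fewer than three rows here has changed the meaning of the filter.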