Shardul Mahadik created ORC-623:
-----------------------------------

             Summary: Potentially incorrect Sarg evaluation for not(in) and not(isNull)
                 Key: ORC-623
                 URL: https://issues.apache.org/jira/browse/ORC-623
             Project: ORC
          Issue Type: Bug
            Reporter: Shardul Mahadik


I seem to have stumbled upon two issues with Sarg (search argument) evaluation
in ORC.

I have created a test case for each issue at
[https://github.com/shardulm94/orc/commit/b6d97cfa0325d2a14094456d338c942f61b887f2].

In the first case, applying {{not(isNull(column))}} to a column whose values 
are all null seems to incorrectly mark the row group as needed. This issue is 
fairly benign, though, since the only consequence is that some extra row groups 
are returned.
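As a rough illustration of the expected outcome for the first case, the sketch 
below evaluates {{not(isNull(column))}} row by row over an all-null column. It 
is not ORC's evaluation code, which works from row group statistics; the class 
and method names ({{NotIsNullCheck}}, {{notIsNull}}, {{rowGroupNeeded}}) are 
hypothetical and exist only for this example.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of ORC: a row-by-row reference evaluation.
public class NotIsNullCheck {

    // Row-level meaning of not(isNull(column)): true only for non-null values.
    static boolean notIsNull(Long value) {
        return value != null;
    }

    // A row group is needed only if at least one row can satisfy the predicate.
    static boolean rowGroupNeeded(List<Long> rows) {
        for (Long value : rows) {
            if (notIsNull(value)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Column in which every value is null, as in the first test case.
        List<Long> allNulls = Collections.nCopies(4, (Long) null);
        System.out.println(rowGroupNeeded(allNulls)); // prints false
    }
}
```

Since no row can satisfy the predicate, a reader could safely skip such a row 
group, which is the behaviour the first test case expects.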

In the second case, I create a column with only two possible values, null or 1, 
depending on whether the row index is even or odd. Every row group is therefore 
guaranteed to contain both null and 1. Applying {{not(in(column, 1))}} to this 
column incorrectly marks the row group as not needed, even though the row group 
contains null values, which should be matched by {{not(in(column, 1))}}. This 
could cause some row groups to be filtered out incorrectly.
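The expectation in the second case can be sketched the same way. The code 
below is only an illustration, not ORC's evaluation code. It makes two 
assumptions that are labelled in the comments: even row indexes hold null and 
odd ones hold 1 (the description above does not say which is which), and a 
null value counts as a match for {{not(in(column, 1))}}, which is the reading 
this report relies on. Under SQL three-valued logic the result for a null 
would be unknown rather than true. The names {{NotInCheck}}, {{buildColumn}}, 
{{notIn}} and {{rowGroupNeeded}} are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ORC: a row-by-row reference evaluation.
public class NotInCheck {

    // Assumption: even row indexes hold null, odd row indexes hold 1.
    static List<Long> buildColumn(int rowCount) {
        List<Long> rows = new ArrayList<>();
        for (int i = 0; i < rowCount; i++) {
            rows.add(i % 2 == 0 ? null : Long.valueOf(1L));
        }
        return rows;
    }

    // Assumption (the report's reading): null is not in the list, so the
    // negated predicate accepts it. SQL semantics would yield unknown here.
    static boolean notIn(Long value, long literal) {
        return value == null || value != literal;
    }

    // A row group is needed only if at least one row can satisfy the predicate.
    static boolean rowGroupNeeded(List<Long> rows, long literal) {
        for (Long value : rows) {
            if (notIn(value, literal)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Long> rows = buildColumn(10);
        System.out.println(rowGroupNeeded(rows, 1L)); // prints true
    }
}
```

Under these assumptions every row group contains at least one matching row, so 
none of them should be skipped.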



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
