Shardul Mahadik created ORC-623:
-----------------------------------
Summary: Potentially incorrect Sarg evaluation for not(in) and
not(isNull)
Key: ORC-623
URL: https://issues.apache.org/jira/browse/ORC-623
Project: ORC
Issue Type: Bug
Reporter: Shardul Mahadik
I seem to have stumbled upon two issues with Sarg (search argument) evaluation
in ORC. I have created two test cases that reproduce them at
[https://github.com/shardulm94/orc/commit/b6d97cfa0325d2a14094456d338c942f61b887f2].
In the first case, applying {{not(isNull(column))}} to a column whose values are
all null seems to incorrectly mark the row group as needed. This issue is rather
benign, though: the only effect is that some extra row groups are returned.
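To make the expected result for the first case concrete, here is a minimal sketch in Python. It is not ORC code: {{brute_force_needed}} and {{stats_based_needed}} are hypothetical helpers, and the only statistic assumed to be available for the row group is a count of non-null values.

```python
from typing import List, Optional


def brute_force_needed(values: List[Optional[int]]) -> bool:
    """Ground truth: is there any row that satisfies not(isNull(column))?"""
    return any(v is not None for v in values)


def stats_based_needed(non_null_count: int) -> bool:
    """Same decision made from a statistic instead of the rows.

    Assumption: the row group statistics expose how many values are non-null.
    not(isNull(column)) can only match when that count is greater than zero.
    """
    return non_null_count > 0


# Hypothetical row group of 1000 rows in which every value is null.
all_null_group: List[Optional[int]] = [None] * 1000
non_null_count = sum(v is not None for v in all_null_group)

print(brute_force_needed(all_null_group))   # False
print(stats_based_needed(non_null_count))   # False
```

Both checks agree that no row can match, so under these assumptions the row group could be skipped safely.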
In the second case, I create a column that has only two possible values, null or
1, depending on whether the row index is even or odd, so every row group is
guaranteed to contain both null and 1. Applying {{not(in(column, 1))}} to this
column incorrectly marks the row group as not needed. The row group contains
null values, which {{not(in(column, 1))}} should match. This is potentially
causing some row groups to be filtered out incorrectly.
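The second case can be sketched the same way by evaluating the predicate row by row. This is a toy model in Python, not ORC code. The row group size, the choice that even row indexes hold null, and the helper names are all assumptions made for illustration. The {{null_matches}} flag makes explicit the assumption that decides the outcome: the expectation described above corresponds to {{null_matches=True}}, while strict SQL three-valued logic treats a comparison against null as unknown, which corresponds to {{null_matches=False}}.

```python
from typing import List, Optional, Set

ROW_GROUP_SIZE = 4  # hypothetical and tiny, for readability only


def build_column(n: int) -> List[Optional[int]]:
    """Assumption: even row indexes hold null, odd row indexes hold 1."""
    return [None if i % 2 == 0 else 1 for i in range(n)]


def matches_not_in(value: Optional[int], literals: Set[int],
                   null_matches: bool) -> bool:
    """Row-level result of not(in(column, literals))."""
    if value is None:
        # Whether a null row counts as a match is the assumption under test.
        return null_matches
    return value not in literals


def needed_row_groups(column: List[Optional[int]], literals: Set[int],
                      null_matches: bool) -> List[bool]:
    """For each row group, report whether at least one row matches."""
    groups = [column[i:i + ROW_GROUP_SIZE]
              for i in range(0, len(column), ROW_GROUP_SIZE)]
    return [any(matches_not_in(v, literals, null_matches) for v in g)
            for g in groups]


column = build_column(12)
print(needed_row_groups(column, {1}, null_matches=True))   # [True, True, True]
print(needed_row_groups(column, {1}, null_matches=False))  # [False, False, False]
```

If null rows are expected to match, every row group is needed and none may be skipped. If they are not, no row group contains a matching row.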
--
This message was sent by Atlassian Jira
(v8.3.4#803005)