Igor Kryvenko created HIVE-20681:
------------------------------------

             Summary: Support custom path filter for ORC tables
                 Key: HIVE-20681
                 URL: https://issues.apache.org/jira/browse/HIVE-20681
             Project: Hive
          Issue Type: Improvement
          Components: Transactions
            Reporter: Igor Kryvenko
            Assignee: Igor Kryvenko


Currently, the ORC file input format ignores path filters set via the
"mapreduce.input.pathfilter.class" or "mapred.input.pathfilter.class"
property, so custom path filters cannot be used with ORC files.

The AcidUtils class has a static filter called "hiddenFilters", which ORC uses
to filter input paths. If we pass the custom filter classes (set in the
properties mentioned above) to AcidUtils and replace "hiddenFilters" with a
filter that performs an "and" operation over "hiddenFilters" and the custom
filters, custom filters would work for ORC as well.
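
The "and" composition described above could be sketched roughly as follows.
This is only an illustration, not the actual patch: the PathFilter interface
here is a string-based stand-in for org.apache.hadoop.fs.PathFilter, the
hidden-file rule (reject names starting with "_" or ".") is an assumption
about what "hiddenFilters" does, and the class name, the and() helper, and
the partition-based custom filter are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class CombinedPathFilterSketch {

    /** Stand-in for org.apache.hadoop.fs.PathFilter, using plain path strings. */
    interface PathFilter {
        boolean accept(String path);
    }

    /** Assumed hidden-file rule: reject file names starting with '_' or '.'. */
    static final PathFilter HIDDEN_FILTER = path -> {
        String name = path.substring(path.lastIndexOf('/') + 1);
        return !name.startsWith("_") && !name.startsWith(".");
    };

    /** Returns a filter that accepts a path only if every given filter accepts it. */
    static PathFilter and(List<PathFilter> filters) {
        return path -> {
            for (PathFilter f : filters) {
                if (!f.accept(path)) {
                    return false;
                }
            }
            return true;
        };
    }

    public static void main(String[] args) {
        // Hypothetical custom filter: only read one partition directory.
        PathFilter custom = path -> path.contains("/dt=2018-10-03/");
        PathFilter combined = and(Arrays.asList(HIDDEN_FILTER, custom));

        // Accepted by both filters.
        System.out.println(combined.accept("/warehouse/t/dt=2018-10-03/bucket_00000")); // true
        // Rejected by the hidden-file filter.
        System.out.println(combined.accept("/warehouse/t/dt=2018-10-03/_tmp"));         // false
        // Rejected by the custom filter.
        System.out.println(combined.accept("/warehouse/t/dt=2018-10-02/bucket_00000")); // false
    }
}
```

Keeping the hidden-file filter as one operand of the "and" means a custom
filter can only narrow the set of input paths, never re-admit hidden files.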

It would be useful to be able to filter out rows based on path or file
names. Existing ORC features such as bloom filters and indexes are not enough
in such cases to minimize the number of disk read operations.






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
