Igor Kryvenko created HIVE-20681:
------------------------------------
Summary: Support custom path filter for ORC tables
Key: HIVE-20681
URL: https://issues.apache.org/jira/browse/HIVE-20681
Project: Hive
Issue Type: Improvement
Components: Transactions
Reporter: Igor Kryvenko
Assignee: Igor Kryvenko
Currently, the ORC file input format does not honor path filters set in the
property "mapreduce.input.pathfilter.class" or "mapred.input.pathfilter.class",
so custom filters cannot be used with ORC files.
The AcidUtils class has a static filter called "hiddenFilters", which ORC uses
to filter input paths. If we pass the custom filter classes (set in the
properties mentioned above) to AcidUtils and replace that static filter with
one that performs a logical AND over the hidden-file filter and the custom
filters, the custom filters would take effect for ORC tables as well.
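A minimal sketch of the AND composition described above, not the actual Hive
change. To stay self-contained it declares a local stand-in for Hadoop's
org.apache.hadoop.fs.PathFilter that takes a String path; AndPathFilter,
CombinedFilterSketch, combine and HIDDEN are hypothetical names, and the
hidden-file rule (names starting with "_" or ".") is an assumption about what
the existing filter rejects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in for org.apache.hadoop.fs.PathFilter (assumption: a String path is
// used here so the sketch compiles without Hadoop on the classpath).
interface PathFilter {
    boolean accept(String path);
}

// Accepts a path only if every wrapped filter accepts it (logical AND).
final class AndPathFilter implements PathFilter {
    private final List<PathFilter> filters;

    AndPathFilter(List<PathFilter> filters) {
        this.filters = new ArrayList<>(filters);
    }

    @Override
    public boolean accept(String path) {
        for (PathFilter filter : filters) {
            if (!filter.accept(path)) {
                return false; // short-circuit on the first rejection
            }
        }
        return true;
    }
}

final class CombinedFilterSketch {
    // Assumed approximation of the hidden-file filter: reject file names
    // that start with '_' or '.'.
    static final PathFilter HIDDEN = path -> {
        String name = path.substring(path.lastIndexOf('/') + 1);
        return !name.startsWith("_") && !name.startsWith(".");
    };

    // Builds a filter equivalent to: base AND custom[0] AND custom[1] ...
    static PathFilter combine(PathFilter base, List<PathFilter> custom) {
        List<PathFilter> all = new ArrayList<>();
        all.add(base);
        all.addAll(custom);
        return new AndPathFilter(all);
    }

    public static void main(String[] args) {
        // Hypothetical user-supplied filter: keep only one bucket file.
        PathFilter onlyBucketZero = p -> p.endsWith("bucket_00000");
        PathFilter combined = combine(HIDDEN, Arrays.asList(onlyBucketZero));

        System.out.println(combined.accept("/warehouse/t/delta_1_1/bucket_00000"));      // true
        System.out.println(combined.accept("/warehouse/t/delta_1_1/bucket_00001"));      // false
        System.out.println(combined.accept("/warehouse/t/delta_1_1/_tmp_bucket_00000")); // false
    }
}
```

With an empty list of custom filters, combine() behaves exactly like the base
filter, so tables that do not set the property would see no change in which
paths are read.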
It would be useful to be able to filter out rows based on paths or file names.
Current ORC features such as bloom filters and indexes are not sufficient to
minimize the number of disk read operations in such cases.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)