Eugene Koifman created HIVE-11320:
-------------------------------------
Summary: ACID enable predicate pushdown for insert-only delta file
Key: HIVE-11320
URL: https://issues.apache.org/jira/browse/HIVE-11320
Project: Hive
Issue Type: Bug
Components: Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
Given an ACID table T against which some Insert/Update/Delete operations have
been executed but no Major Compaction has run, the table will have some number
of delta files (and possibly base files).
Given a query: select * from T where c1 = 5;
The OrcRawRecordMerger constructor currently disables predicate pushdown (PPD)
in ORC for the delta files via eventOptions.searchArgument(null, null);
When a delta file is known to contain only Insert events, we can safely push
the predicate down to it.
ORC maintains stats in the file footer that include counts of the
insert/update/delete events in the file. These counts can be used to determine
that a given delta file contains only Insert events.
See OrcRecordUpdater.parseAcidStats()
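
The check described above can be sketched roughly as follows. This is only an
illustration, not the Hive implementation: DeltaPushdown, AcidEventCounts,
parse() and canPushDown() are hypothetical names, and the comma-separated
"inserts,updates,deletes" encoding is an assumption made for the example.

```java
// Hypothetical sketch; none of these names are Hive APIs.
final class DeltaPushdown {

    /** Event counts as they might be recovered from a delta file's footer. */
    static final class AcidEventCounts {
        final long inserts;
        final long updates;
        final long deletes;

        AcidEventCounts(long inserts, long updates, long deletes) {
            this.inserts = inserts;
            this.updates = updates;
            this.deletes = deletes;
        }

        /** Parses an ASSUMED "inserts,updates,deletes" encoding. */
        static AcidEventCounts parse(String serialized) {
            String[] parts = serialized.split(",");
            if (parts.length != 3) {
                throw new IllegalArgumentException(
                    "expected 3 counts, got: " + serialized);
            }
            return new AcidEventCounts(
                Long.parseLong(parts[0].trim()),
                Long.parseLong(parts[1].trim()),
                Long.parseLong(parts[2].trim()));
        }
    }

    /**
     * Returns true only when the delta is known to hold nothing but Insert
     * events. Missing stats are treated as "unknown", so pushdown stays off.
     */
    static boolean canPushDown(AcidEventCounts stats) {
        if (stats == null) {
            return false;
        }
        return stats.updates == 0 && stats.deletes == 0;
    }
}
```

With a check like this, the merger would keep the search argument for a delta
where canPushDown() is true and fall back to the current behaviour (no search
argument) for every other delta.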
This will enable PPD for Streaming Ingest (HIVE-5687) use cases, which by
definition generate only Insert events.
PPD for deltas with arbitrary types of events can be achieved, but it is more
complicated and will be addressed separately.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)