[ 
https://issues.apache.org/jira/browse/HIVE-11320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720897#comment-14720897
 ] 

Sergey Shelukhin commented on HIVE-11320:
-----------------------------------------

ORC split generation tries to eliminate stripes from splits entirely using 
stripe statistics (if this is enabled).
Right now, if it sees deltas it doesn't do that, because deltas can change 
values, so the elimination of a particular stripe would no longer be valid.
If insert-only deltas are used, I guess you can still eliminate stripes?
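
To make the question concrete, here is a rough sketch of the rule being discussed, assuming the guess above is right and insert-only deltas cannot invalidate stripe statistics. Everything in it (StripePruningSketch, StripeStats, selectStripes, the two boolean flags) is a hypothetical name made up for illustration. It is not Hive's split-generation code and does not use the ORC API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; not Hive's actual split-generation code.
final class StripePruningSketch {

    /** Min/max of one column within a stripe (stand-in for ORC stripe statistics). */
    static final class StripeStats {
        final long min;
        final long max;

        StripeStats(long min, long max) {
            this.min = min;
            this.max = max;
        }
    }

    /** True if a stripe with these stats could contain a row where the column equals value. */
    static boolean mightContain(StripeStats stats, long value) {
        return stats.min <= value && value <= stats.max;
    }

    /**
     * Returns the indexes of the stripes to read for a predicate "column = value".
     * Stripes are only eliminated when there are no deltas, or when every delta
     * is insert-only (the assumption under discussion), because update/delete
     * events could change values that the statistics describe.
     */
    static List<Integer> selectStripes(List<StripeStats> stripes, long value,
                                       boolean hasDeltas, boolean deltasInsertOnly) {
        boolean canEliminate = !hasDeltas || deltasInsertOnly;
        List<Integer> selected = new ArrayList<>();
        for (int i = 0; i < stripes.size(); i++) {
            if (!canEliminate || mightContain(stripes.get(i), value)) {
                selected.add(i);
            }
        }
        return selected;
    }
}
```

With stripes covering [0,4], [5,9] and [10,20] and the predicate c1 = 5, this keeps only the second stripe when there are no deltas or only insert-only deltas, and keeps all three stripes otherwise.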

> ACID enable predicate pushdown for insert-only delta file
> ---------------------------------------------------------
>
>                 Key: HIVE-11320
>                 URL: https://issues.apache.org/jira/browse/HIVE-11320
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>             Fix For: 1.3.0
>
>         Attachments: HIVE-11320.patch
>
>
> Given an ACID table T against which some Insert/Update/Delete has been 
> executed but not Major Compaction, the table will have some number of 
> delta files (and possibly base files).
> Given a query: select * from T where c1 = 5;
> OrcRawRecordMerger() c'tor currently disables predicate pushdown in ORC to 
> the delta file via  eventOptions.searchArgument(null, null);
> When a delta file is known to only have Insert events we can safely push the 
> predicate.  
> ORC maintains stats in a footer which have counts of insert/update/delete 
> events in the file - this can be used to determine that a given delta file 
> only has Insert events.
> See OrcRecordUpdater.parseAcidStats()
> This will enable PPD for Streaming Ingest (HIVE-5687) use cases which by 
> definition only generate Insert events. 
> PPD for deltas with arbitrary types of events can be achieved but it is more 
> complicated and will be addressed separately.
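
The insert-only check described in the issue can be sketched roughly as follows. This is an illustration under stated assumptions, not the attached patch: it assumes the footer stats are available as a comma-separated "inserts,updates,deletes" string, and the names InsertOnlyDeltaSketch, AcidCounts, parse and canPushPredicate are hypothetical. Real code would go through the footer-stats helper named in the description rather than parse a string by hand.

```java
// Hypothetical sketch only; the "inserts,updates,deletes" layout is an assumption.
final class InsertOnlyDeltaSketch {

    /** Event counts for one delta file. */
    static final class AcidCounts {
        final long inserts;
        final long updates;
        final long deletes;

        AcidCounts(long inserts, long updates, long deletes) {
            this.inserts = inserts;
            this.updates = updates;
            this.deletes = deletes;
        }
    }

    /** Parses the assumed "inserts,updates,deletes" representation. */
    static AcidCounts parse(String serialized) {
        String[] parts = serialized.split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected 3 counts, got: " + serialized);
        }
        return new AcidCounts(Long.parseLong(parts[0].trim()),
                              Long.parseLong(parts[1].trim()),
                              Long.parseLong(parts[2].trim()));
    }

    /** A delta is insert-only when it records no update and no delete events. */
    static boolean isInsertOnly(AcidCounts counts) {
        return counts.updates == 0 && counts.deletes == 0;
    }

    /**
     * Decides whether the predicate may be pushed down to this delta file.
     * Missing stats are treated conservatively: no pushdown.
     */
    static boolean canPushPredicate(String footerStats) {
        if (footerStats == null) {
            return false;
        }
        return isInsertOnly(parse(footerStats));
    }
}
```

A caller would keep the search argument for a delta only when canPushPredicate returns true, and otherwise clear it as the current constructor does for every delta.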



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
