[jira] [Commented] (HIVE-11320) ACID enable predicate pushdown for insert-only delta file

2015-08-28 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720895#comment-14720895
 ] 

Eugene Koifman commented on HIVE-11320:
---

I'm not sure I follow. Can you explain?

 ACID enable predicate pushdown for insert-only delta file
 -

 Key: HIVE-11320
 URL: https://issues.apache.org/jira/browse/HIVE-11320
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Fix For: 1.3.0

 Attachments: HIVE-11320.patch


 Given an ACID table T against which some Insert/Update/Delete has been executed 
 but no Major Compaction.
 This table will have some number of delta files (and possibly base files).
 Given a query: select * from T where c1 = 5;
 The OrcRawRecordMerger() constructor currently disables predicate pushdown (PPD) in ORC to 
 the delta file via eventOptions.searchArgument(null, null);
 When a delta file is known to contain only Insert events, we can safely push the 
 predicate down.
 ORC maintains stats in the file footer that count the insert/update/delete 
 events in the file; these can be used to determine that a given delta file 
 contains only Insert events.
 See OrcRecordUpdater.parseAcidStats()
 This will enable PPD for Streaming Ingest (HIVE-5687) use cases, which by 
 definition generate only Insert events.
 PPD for deltas with arbitrary types of events can be achieved, but it is more 
 complicated and will be addressed separately.
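
 The decision described above can be sketched roughly as follows. This is an
 illustration only, not the attached patch: AcidEventCounts, DeltaPushdownSketch,
 isInsertOnly and predicateForDelta are hypothetical names, the predicate is
 modelled as a plain String rather than an ORC search argument, and the event
 counts are assumed to have already been read from the delta file's footer.

```java
// Hypothetical stand-in for the per-file insert/update/delete counts
// that the description says ORC keeps in the file footer.
final class AcidEventCounts {
    final long inserts;
    final long updates;
    final long deletes;

    AcidEventCounts(long inserts, long updates, long deletes) {
        this.inserts = inserts;
        this.updates = updates;
        this.deletes = deletes;
    }
}

// Hypothetical helper; not part of Hive.
final class DeltaPushdownSketch {
    private DeltaPushdownSketch() {}

    // A delta counts as insert-only when its footer records no update or
    // delete events. Missing stats are treated as "unknown", so not insert-only.
    static boolean isInsertOnly(AcidEventCounts counts) {
        return counts != null && counts.updates == 0 && counts.deletes == 0;
    }

    // Returns the predicate to hand to the delta reader, or null to leave
    // pushdown disabled (mirroring the current searchArgument(null, null) behaviour).
    static String predicateForDelta(AcidEventCounts counts, String predicate) {
        return isInsertOnly(counts) ? predicate : null;
    }
}
```

 With these assumptions, a delta written by streaming ingest (for example counts
 of 100/0/0) keeps the predicate "c1 = 5", while a delta with any update or
 delete events, or with no readable stats, falls back to no pushdown.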



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11320) ACID enable predicate pushdown for insert-only delta file

2015-08-28 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720886#comment-14720886
 ] 

Sergey Shelukhin commented on HIVE-11320:
-

I wonder if it would also apply to stripe elimination during split generation...



[jira] [Commented] (HIVE-11320) ACID enable predicate pushdown for insert-only delta file

2015-07-21 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635223#comment-14635223
 ] 

Eugene Koifman commented on HIVE-11320:
---

[~alangates], could you review, please?



[jira] [Commented] (HIVE-11320) ACID enable predicate pushdown for insert-only delta file

2015-07-21 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635403#comment-14635403
 ] 

Alan Gates commented on HIVE-11320:
---

+1
