Re: Review Request 49766: HIVE-14035 Enable predicate pushdown to delta files created by ACID Transactions

2016-08-08 Thread Saket Saurabh


> On Aug. 4, 2016, 2:29 p.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java, line 399
> > 
> >
> > rowId is always -1 here. Intended?

@Sergey: Thanks for pointing this out. It was a bug and not intended.


- Saket


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49766/#review144816
---


On Aug. 8, 2016, 5:53 p.m., Saket Saurabh wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49766/
> ---
> 
> (Updated Aug. 8, 2016, 5:53 p.m.)
> 
> 
> Review request for hive and Eugene Koifman.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-14035
> 
> In current Hive version, delta files created by ACID transactions do not 
> allow predicate pushdown if they contain any update/delete events. This is 
> done to preserve correctness when following a multi-version approach during 
> event collapsing, where an update event overwrites an existing insert event. 
> This JIRA proposes to split an update event into a combination of a delete 
> event followed by a new insert event, that can enable predicate push down to 
> all delta files without breaking correctness. To support backward 
> compatibility for this feature, this JIRA also proposes to add some sort of 
> versioning to ACID that can allow different versions of ACID transactions to 
> co-exist together.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 70816bd 
>   
> hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
>  14f7316 
>   
> hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java
>  974c6b8 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
>  731caa8 
>   metastore/if/hive_metastore.thrift a2e35b8 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.h ae14bd1 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.cpp f982bf2 
>   
> metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/hive_metastoreConstants.java
>  5a666f2 
>   metastore/src/gen/thrift/gen-php/metastore/Types.php d6f7f49 
>   metastore/src/gen/thrift/gen-py/hive_metastore/constants.py d1c07a5 
>   metastore/src/gen/thrift/gen-rb/hive_metastore_constants.rb eeccc84 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/TransactionalValidationListener.java
>  3e74675 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java db6848a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 57b6c67 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 26e6443 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java dd90a95 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 449d889 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 945b828 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 334cb31 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
> efde2db 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 1a1af28 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 8cb5e8a 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 8cf261d 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> 6caca98 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java af192fb 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands3.java PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java 556df18 
>   ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java 
> 6648829 
> 
> Diff: https://reviews.apache.org/r/49766/diff/
> 
> 
> Testing
> ---
> 
> Tests for the feature are in 
> ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java. These are mostly 
> integration tests that test end-to-end insert/update/delete scenarios 
> followed by compaction and cleaning.
> 
> 
> Thanks,
> 
> Saket Saurabh
> 
>



Re: Review Request 49766: HIVE-14035 Enable predicate pushdown to delta files created by ACID Transactions

2016-08-08 Thread Saket Saurabh


> On Aug. 4, 2016, 2:29 p.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java, line 588
> > 
> >
> > this check is not done in the above "if" before adding the statementId 
> > to "last". What does this check mean i.e. what do negative numbers mean?

@Sergey, this change is now no longer part of this patch. It got refactored 
away when I merged serializeDelta() function with serializeDeleteDelta(). 
Although in fact, it exists in current master as well. The negative numbers 
mean that we do not have a statement id for this delta. For example, when we 
run a minor compaction we produce delta of the form delta_x_y which has no 
statement id attached to it. In these cases, statement id would be -1.


- Saket


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49766/#review144816
---


On Aug. 8, 2016, 5:53 p.m., Saket Saurabh wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49766/
> ---
> 
> (Updated Aug. 8, 2016, 5:53 p.m.)
> 
> 
> Review request for hive and Eugene Koifman.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-14035
> 
> In current Hive version, delta files created by ACID transactions do not 
> allow predicate pushdown if they contain any update/delete events. This is 
> done to preserve correctness when following a multi-version approach during 
> event collapsing, where an update event overwrites an existing insert event. 
> This JIRA proposes to split an update event into a combination of a delete 
> event followed by a new insert event, that can enable predicate push down to 
> all delta files without breaking correctness. To support backward 
> compatibility for this feature, this JIRA also proposes to add some sort of 
> versioning to ACID that can allow different versions of ACID transactions to 
> co-exist together.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 70816bd 
>   
> hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
>  14f7316 
>   
> hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java
>  974c6b8 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
>  731caa8 
>   metastore/if/hive_metastore.thrift a2e35b8 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.h ae14bd1 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.cpp f982bf2 
>   
> metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/hive_metastoreConstants.java
>  5a666f2 
>   metastore/src/gen/thrift/gen-php/metastore/Types.php d6f7f49 
>   metastore/src/gen/thrift/gen-py/hive_metastore/constants.py d1c07a5 
>   metastore/src/gen/thrift/gen-rb/hive_metastore_constants.rb eeccc84 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/TransactionalValidationListener.java
>  3e74675 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java db6848a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 57b6c67 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 26e6443 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java dd90a95 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 449d889 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 945b828 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 334cb31 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
> efde2db 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 1a1af28 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 8cb5e8a 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 8cf261d 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> 6caca98 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java af192fb 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands3.java PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java 556df18 
>   ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java 
> 6648829 
> 
> Diff: https://reviews.apache.org/r/49766/diff/
> 
> 
> Testing
> ---
> 
> Tests for the feature are in 
> ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java. These are mostly 
> integration tests that test end-to-end insert/update/delete scenarios 
> followed by compaction and cleaning.
> 
> 
> Thanks,
> 
> Saket Saurabh
> 
>



Re: Review Request 49766: HIVE-14035 Enable predicate pushdown to delta files created by ACID Transactions

2016-08-08 Thread Saket Saurabh

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49766/
---

(Updated Aug. 8, 2016, 5:53 p.m.)


Review request for hive and Eugene Koifman.


Changes
---

Updated patch with comments/feedback from code review. 
Major changes are refactoring of AcidUtils.getAcidState(), CompactorMR, fixing 
of flush length related changes in OrcRecordUpdater, lazy initialization of 
deleteEventWriters and other minor changes. 
The patch adds additional unit tests to test schema evolution with ACID. It 
also adds a subclass to TestTxnCommands2 for replicating all the existing ACID 
test cases with split-update turned on.


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-14035

In current Hive version, delta files created by ACID transactions do not allow 
predicate pushdown if they contain any update/delete events. This is done to 
preserve correctness when following a multi-version approach during event 
collapsing, where an update event overwrites an existing insert event. 
This JIRA proposes to split an update event into a combination of a delete 
event followed by a new insert event, that can enable predicate push down to 
all delta files without breaking correctness. To support backward compatibility 
for this feature, this JIRA also proposes to add some sort of versioning to 
ACID that can allow different versions of ACID transactions to co-exist 
together.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 70816bd 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
 14f7316 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java
 974c6b8 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
 731caa8 
  metastore/if/hive_metastore.thrift a2e35b8 
  metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.h ae14bd1 
  metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.cpp f982bf2 
  
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/hive_metastoreConstants.java
 5a666f2 
  metastore/src/gen/thrift/gen-php/metastore/Types.php d6f7f49 
  metastore/src/gen/thrift/gen-py/hive_metastore/constants.py d1c07a5 
  metastore/src/gen/thrift/gen-rb/hive_metastore_constants.rb eeccc84 
  
metastore/src/java/org/apache/hadoop/hive/metastore/TransactionalValidationListener.java
 3e74675 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java db6848a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 57b6c67 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 26e6443 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java dd90a95 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 449d889 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 945b828 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 334cb31 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java efde2db 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 1a1af28 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 8cb5e8a 
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 8cf261d 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 6caca98 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java af192fb 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands3.java PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java 556df18 
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java 
6648829 

Diff: https://reviews.apache.org/r/49766/diff/


Testing
---

Tests for the feature are in 
ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java. These are mostly 
integration tests that test end-to-end insert/update/delete scenarios followed 
by compaction and cleaning.


Thanks,

Saket Saurabh



Re: Review Request 49766: HIVE-14035 Enable predicate pushdown to delta files created by ACID Transactions

2016-08-04 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49766/#review144829
---




ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java (line 135)


nit: why remove this comment?



ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java (line 527)


nit: true/false is redundant


- Sergey Shelukhin


On July 27, 2016, 9:54 p.m., Saket Saurabh wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49766/
> ---
> 
> (Updated July 27, 2016, 9:54 p.m.)
> 
> 
> Review request for hive and Eugene Koifman.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-14035
> 
> In current Hive version, delta files created by ACID transactions do not 
> allow predicate pushdown if they contain any update/delete events. This is 
> done to preserve correctness when following a multi-version approach during 
> event collapsing, where an update event overwrites an existing insert event. 
> This JIRA proposes to split an update event into a combination of a delete 
> event followed by a new insert event, that can enable predicate push down to 
> all delta files without breaking correctness. To support backward 
> compatibility for this feature, this JIRA also proposes to add some sort of 
> versioning to ACID that can allow different versions of ACID transactions to 
> co-exist together.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e92466f 
>   
> hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
>  14f7316 
>   
> hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java
>  974c6b8 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
>  ca2a912 
>   metastore/if/hive_metastore.thrift 4d92b73 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.h ae14bd1 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.cpp f982bf2 
>   
> metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/hive_metastoreConstants.java
>  5a666f2 
>   metastore/src/gen/thrift/gen-php/metastore/Types.php f505208 
>   metastore/src/gen/thrift/gen-py/hive_metastore/constants.py d1c07a5 
>   metastore/src/gen/thrift/gen-rb/hive_metastore_constants.rb eeccc84 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/TransactionalValidationListener.java
>  3e74675 
>   orc/src/java/org/apache/orc/impl/TreeReaderFactory.java c4a2093 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java db6848a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 57b6c67 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 23a13d6 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java dd90a95 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java c150ec5 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 945b828 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 63d02fb 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 1a1af28 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 9d927bd 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 8cf261d 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> 6caca98 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java d48e441 
>   ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java b83cea4 
> 
> Diff: https://reviews.apache.org/r/49766/diff/
> 
> 
> Testing
> ---
> 
> Tests for the feature are in 
> ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java. These are mostly 
> integration tests that test end-to-end insert/update/delete scenarios 
> followed by compaction and cleaning.
> 
> 
> Thanks,
> 
> Saket Saurabh
> 
>



Re: Review Request 49766: HIVE-14035 Enable predicate pushdown to delta files created by ACID Transactions

2016-08-04 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49766/#review144816
---




metastore/src/java/org/apache/hadoop/hive/metastore/TransactionalValidationListener.java
 (line 40)


why not move it here and depend on it there?



metastore/src/java/org/apache/hadoop/hive/metastore/TransactionalValidationListener.java
 (line 254)


nit since it's probably only a few values: if we iterate anyway, why wrap 
in a hashset? Also via keySet iterator one could probably remove without 
another lookup



ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java (line 361)


nit: log unsupported property?



ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java (line 578)


this check is not done in the above "if" before adding the statementId to 
"last". What does this check mean i.e. what do negative numbers mean?



ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java (line 864)


needs to log error. Also, useFileIds could be passed from the main call 
above if I'm not mistakes, so this can be skipped if we know we cannot get file 
IDs



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 1790)


is split.isOriginal no longer valid for other paths too? It could be used 
elsewhere



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java (line 1987)


what does this change in behavior do? if base and original are both empty, 
was the behavior incorrect before? Can it happen?



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java (line 380)


what would row be in case of delete operation? I am assuming this works, it 
just looks surprising that there would be a row



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java (line 392)


rowId is always -1 here. Intended?



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java (line 484)


error handling? we don't need to fail if we cannot remove, right? same just 
below


- Sergey Shelukhin


On July 27, 2016, 9:54 p.m., Saket Saurabh wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49766/
> ---
> 
> (Updated July 27, 2016, 9:54 p.m.)
> 
> 
> Review request for hive and Eugene Koifman.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-14035
> 
> In current Hive version, delta files created by ACID transactions do not 
> allow predicate pushdown if they contain any update/delete events. This is 
> done to preserve correctness when following a multi-version approach during 
> event collapsing, where an update event overwrites an existing insert event. 
> This JIRA proposes to split an update event into a combination of a delete 
> event followed by a new insert event, that can enable predicate push down to 
> all delta files without breaking correctness. To support backward 
> compatibility for this feature, this JIRA also proposes to add some sort of 
> versioning to ACID that can allow different versions of ACID transactions to 
> co-exist together.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e92466f 
>   
> hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
>  14f7316 
>   
> hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java
>  974c6b8 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
>  ca2a912 
>   metastore/if/hive_metastore.thrift 4d92b73 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.h ae14bd1 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.cpp f982bf2 
>   
> metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/hive_metastoreConstants.java
>  5a666f2 
>   metastore/src/gen/thrift/gen-php/metastore/Types.php f505208 
>   metastore/src/gen/thrift/gen-py/hive_metastore/constants.py d1c07a5 
>   metastore/src/gen/thrift/gen-rb/hive_metastore_constants.rb eeccc84 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/TransactionalValidationListener.java
>  3e74675 
>   orc/src/java/org/apache/orc/impl/TreeReaderFactory.java c4a2093 
>   ql/src/java/org/apac

Re: Review Request 49766: HIVE-14035 Enable predicate pushdown to delta files created by ACID Transactions

2016-07-27 Thread Lefty Leverenz


> On July 21, 2016, 7:42 a.m., Lefty Leverenz wrote:
> >
> 
> Saket Saurabh wrote:
> @Lefty, does the updated description for this config variable seem to 
> explain better now?

Great, just what it needed!  Thanks Saket.

I recommend adding "the" or "a" in a few places and a final period, but these 
edits aren't really necessary so don't make a new patch just for them:

"versions of Hive ACID subsystem" --> "versions of the Hive ACID subsystem" 
(line 1768 of revision 4)

"enable split-update feature found in newer version" --> "enable a split-update 
feature found in the newer version" (line 1769)

"of Hive ACID subsystem. (See HIVE-14035 for details)" --> "of the Hive ACID 
subsystem. (See HIVE-14035 for details.)" (line 1770)


- Lefty


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49766/#review143059
---


On July 27, 2016, 9:54 p.m., Saket Saurabh wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49766/
> ---
> 
> (Updated July 27, 2016, 9:54 p.m.)
> 
> 
> Review request for hive and Eugene Koifman.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-14035
> 
> In current Hive version, delta files created by ACID transactions do not 
> allow predicate pushdown if they contain any update/delete events. This is 
> done to preserve correctness when following a multi-version approach during 
> event collapsing, where an update event overwrites an existing insert event. 
> This JIRA proposes to split an update event into a combination of a delete 
> event followed by a new insert event, that can enable predicate push down to 
> all delta files without breaking correctness. To support backward 
> compatibility for this feature, this JIRA also proposes to add some sort of 
> versioning to ACID that can allow different versions of ACID transactions to 
> co-exist together.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e92466f 
>   
> hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
>  14f7316 
>   
> hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java
>  974c6b8 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
>  ca2a912 
>   metastore/if/hive_metastore.thrift 4d92b73 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.h ae14bd1 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.cpp f982bf2 
>   
> metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/hive_metastoreConstants.java
>  5a666f2 
>   metastore/src/gen/thrift/gen-php/metastore/Types.php f505208 
>   metastore/src/gen/thrift/gen-py/hive_metastore/constants.py d1c07a5 
>   metastore/src/gen/thrift/gen-rb/hive_metastore_constants.rb eeccc84 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/TransactionalValidationListener.java
>  3e74675 
>   orc/src/java/org/apache/orc/impl/TreeReaderFactory.java c4a2093 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java db6848a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 57b6c67 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 23a13d6 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java dd90a95 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java c150ec5 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 945b828 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 63d02fb 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 1a1af28 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 9d927bd 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 8cf261d 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> 6caca98 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java d48e441 
>   ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java b83cea4 
> 
> Diff: https://reviews.apache.org/r/49766/diff/
> 
> 
> Testing
> ---
> 
> Tests for the feature are in 
> ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java. These are mostly 
> integration tests that test end-to-end insert/update/delete scenarios 
> followed by compaction and cleaning.
> 
> 
> Thanks,
> 
> Saket Saurabh
> 
>



Re: Review Request 49766: HIVE-14035 Enable predicate pushdown to delta files created by ACID Transactions

2016-07-27 Thread Saket Saurabh


> On July 21, 2016, 12:42 a.m., Lefty Leverenz wrote:
> >

@Lefty, does the updated description for this config variable seem to explain 
better now?


- Saket


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49766/#review143059
---


On July 27, 2016, 2:54 p.m., Saket Saurabh wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49766/
> ---
> 
> (Updated July 27, 2016, 2:54 p.m.)
> 
> 
> Review request for hive and Eugene Koifman.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-14035
> 
> In current Hive version, delta files created by ACID transactions do not 
> allow predicate pushdown if they contain any update/delete events. This is 
> done to preserve correctness when following a multi-version approach during 
> event collapsing, where an update event overwrites an existing insert event. 
> This JIRA proposes to split an update event into a combination of a delete 
> event followed by a new insert event, that can enable predicate push down to 
> all delta files without breaking correctness. To support backward 
> compatibility for this feature, this JIRA also proposes to add some sort of 
> versioning to ACID that can allow different versions of ACID transactions to 
> co-exist together.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e92466f 
>   
> hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
>  14f7316 
>   
> hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java
>  974c6b8 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
>  ca2a912 
>   metastore/if/hive_metastore.thrift 4d92b73 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.h ae14bd1 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.cpp f982bf2 
>   
> metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/hive_metastoreConstants.java
>  5a666f2 
>   metastore/src/gen/thrift/gen-php/metastore/Types.php f505208 
>   metastore/src/gen/thrift/gen-py/hive_metastore/constants.py d1c07a5 
>   metastore/src/gen/thrift/gen-rb/hive_metastore_constants.rb eeccc84 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/TransactionalValidationListener.java
>  3e74675 
>   orc/src/java/org/apache/orc/impl/TreeReaderFactory.java c4a2093 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java db6848a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 57b6c67 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 23a13d6 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java dd90a95 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java c150ec5 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 945b828 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 63d02fb 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 1a1af28 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 9d927bd 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 8cf261d 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> 6caca98 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java d48e441 
>   ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java b83cea4 
> 
> Diff: https://reviews.apache.org/r/49766/diff/
> 
> 
> Testing
> ---
> 
> Tests for the feature are in 
> ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java. These are mostly 
> integration tests that test end-to-end insert/update/delete scenarios 
> followed by compaction and cleaning.
> 
> 
> Thanks,
> 
> Saket Saurabh
> 
>



Re: Review Request 49766: HIVE-14035 Enable predicate pushdown to delta files created by ACID Transactions

2016-07-27 Thread Saket Saurabh

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49766/
---

(Updated July 27, 2016, 2:54 p.m.)


Review request for hive and Eugene Koifman.


Changes
---

Refactor the way delete event writers are created for compaction case in favor 
of a better abstraction.


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-14035

In current Hive version, delta files created by ACID transactions do not allow 
predicate pushdown if they contain any update/delete events. This is done to 
preserve correctness when following a multi-version approach during event 
collapsing, where an update event overwrites an existing insert event. 
This JIRA proposes to split an update event into a combination of a delete 
event followed by a new insert event, that can enable predicate push down to 
all delta files without breaking correctness. To support backward compatibility 
for this feature, this JIRA also proposes to add some sort of versioning to 
ACID that can allow different versions of ACID transactions to co-exist 
together.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e92466f 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
 14f7316 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java
 974c6b8 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
 ca2a912 
  metastore/if/hive_metastore.thrift 4d92b73 
  metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.h ae14bd1 
  metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.cpp f982bf2 
  
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/hive_metastoreConstants.java
 5a666f2 
  metastore/src/gen/thrift/gen-php/metastore/Types.php f505208 
  metastore/src/gen/thrift/gen-py/hive_metastore/constants.py d1c07a5 
  metastore/src/gen/thrift/gen-rb/hive_metastore_constants.rb eeccc84 
  
metastore/src/java/org/apache/hadoop/hive/metastore/TransactionalValidationListener.java
 3e74675 
  orc/src/java/org/apache/orc/impl/TreeReaderFactory.java c4a2093 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java db6848a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 57b6c67 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 23a13d6 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java dd90a95 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java c150ec5 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 945b828 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 63d02fb 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 1a1af28 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 9d927bd 
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 8cf261d 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 6caca98 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java d48e441 
  ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java b83cea4 

Diff: https://reviews.apache.org/r/49766/diff/


Testing
---

Tests for the feature are in 
ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java. These are mostly 
integration tests that test end-to-end insert/update/delete scenarios followed 
by compaction and cleaning.


Thanks,

Saket Saurabh



Re: Review Request 49766: HIVE-14035 Enable predicate pushdown to delta files created by ACID Transactions

2016-07-26 Thread Saket Saurabh

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49766/
---

(Updated July 26, 2016, 11:30 a.m.)


Review request for hive and Eugene Koifman.


Changes
---

Add more UTs to specifically test AcidUtils and various other compaction 
scenarios.


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-14035

In current Hive version, delta files created by ACID transactions do not allow 
predicate pushdown if they contain any update/delete events. This is done to 
preserve correctness when following a multi-version approach during event 
collapsing, where an update event overwrites an existing insert event. 
This JIRA proposes to split an update event into a combination of a delete 
event followed by a new insert event, that can enable predicate push down to 
all delta files without breaking correctness. To support backward compatibility 
for this feature, this JIRA also proposes to add some sort of versioning to 
ACID that can allow different versions of ACID transactions to co-exist 
together.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e92466f 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
 14f7316 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java
 974c6b8 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
 ca2a912 
  metastore/if/hive_metastore.thrift 4d92b73 
  metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.h ae14bd1 
  metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.cpp f982bf2 
  
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/hive_metastoreConstants.java
 5a666f2 
  metastore/src/gen/thrift/gen-php/metastore/Types.php f505208 
  metastore/src/gen/thrift/gen-py/hive_metastore/constants.py d1c07a5 
  metastore/src/gen/thrift/gen-rb/hive_metastore_constants.rb eeccc84 
  
metastore/src/java/org/apache/hadoop/hive/metastore/TransactionalValidationListener.java
 3e74675 
  orc/src/java/org/apache/orc/impl/TreeReaderFactory.java c4a2093 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java db6848a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 57b6c67 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 23a13d6 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java c150ec5 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 945b828 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 63d02fb 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java b0f8c8b 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 1a1af28 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 9d927bd 
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 8cf261d 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 6caca98 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java d48e441 
  ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java b83cea4 

Diff: https://reviews.apache.org/r/49766/diff/


Testing
---

Tests for the feature are in 
ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java. These are mostly 
integration tests that test end-to-end insert/update/delete scenarios followed 
by compaction and cleaning.


Thanks,

Saket Saurabh



Re: Review Request 49766: HIVE-14035 Enable predicate pushdown to delta files created by ACID Transactions

2016-07-21 Thread Lefty Leverenz

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49766/#review143059
---




common/src/java/org/apache/hadoop/hive/conf/HiveConf.java (lines 1760 - 1762)


Is it just me, or does this config description need more explanation?  
Maybe an example would help.


- Lefty Leverenz


On July 20, 2016, 10:55 p.m., Saket Saurabh wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49766/
> ---
> 
> (Updated July 20, 2016, 10:55 p.m.)
> 
> 
> Review request for hive and Eugene Koifman.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-14035
> 
> In current Hive version, delta files created by ACID transactions do not 
> allow predicate pushdown if they contain any update/delete events. This is 
> done to preserve correctness when following a multi-version approach during 
> event collapsing, where an update event overwrites an existing insert event. 
> This JIRA proposes to split an update event into a combination of a delete 
> event followed by a new insert event, that can enable predicate push down to 
> all delta files without breaking correctness. To support backward 
> compatibility for this feature, this JIRA also proposes to add some sort of 
> versioning to ACID that can allow different versions of ACID transactions to 
> co-exist together.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 66203a5 
>   
> hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
>  14f7316 
>   
> hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java
>  974c6b8 
>   metastore/if/hive_metastore.thrift 4d92b73 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.h ae14bd1 
>   metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.cpp f982bf2 
>   
> metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/hive_metastoreConstants.java
>  5a666f2 
>   metastore/src/gen/thrift/gen-php/metastore/Types.php f505208 
>   metastore/src/gen/thrift/gen-py/hive_metastore/constants.py d1c07a5 
>   metastore/src/gen/thrift/gen-rb/hive_metastore_constants.rb eeccc84 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/TransactionalValidationListener.java
>  3e74675 
>   orc/src/java/org/apache/orc/impl/TreeReaderFactory.java c4a2093 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java db6848a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 57b6c67 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 23a13d6 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java c150ec5 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 945b828 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 69d58d6 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java b0f8c8b 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java e577961 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java ef0bb3d 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 8cf261d 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> 6caca98 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java d48e441 
>   ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java b83cea4 
> 
> Diff: https://reviews.apache.org/r/49766/diff/
> 
> 
> Testing
> ---
> 
> Tests for the feature are in 
> ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java. These are mostly 
> integration tests that test end-to-end insert/update/delete scenarios 
> followed by compaction and cleaning.
> 
> 
> Thanks,
> 
> Saket Saurabh
> 
>



Re: Review Request 49766: HIVE-14035 Enable predicate pushdown to delta files created by ACID Transactions

2016-07-20 Thread Saket Saurabh

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49766/
---

(Updated July 20, 2016, 3:55 p.m.)


Review request for hive and Eugene Koifman.


Changes
---

Updated the patch by rebasing with master. No additional code changes. Same as 
Patch #10 at https://issues.apache.org/jira/browse/HIVE-14035


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-14035

In current Hive version, delta files created by ACID transactions do not allow 
predicate pushdown if they contain any update/delete events. This is done to 
preserve correctness when following a multi-version approach during event 
collapsing, where an update event overwrites an existing insert event. 
This JIRA proposes to split an update event into a combination of a delete 
event followed by a new insert event, that can enable predicate push down to 
all delta files without breaking correctness. To support backward compatibility 
for this feature, this JIRA also proposes to add some sort of versioning to 
ACID that can allow different versions of ACID transactions to co-exist 
together.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 66203a5 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
 14f7316 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java
 974c6b8 
  metastore/if/hive_metastore.thrift 4d92b73 
  metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.h ae14bd1 
  metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.cpp f982bf2 
  
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/hive_metastoreConstants.java
 5a666f2 
  metastore/src/gen/thrift/gen-php/metastore/Types.php f505208 
  metastore/src/gen/thrift/gen-py/hive_metastore/constants.py d1c07a5 
  metastore/src/gen/thrift/gen-rb/hive_metastore_constants.rb eeccc84 
  
metastore/src/java/org/apache/hadoop/hive/metastore/TransactionalValidationListener.java
 3e74675 
  orc/src/java/org/apache/orc/impl/TreeReaderFactory.java c4a2093 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java db6848a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 57b6c67 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 23a13d6 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java c150ec5 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 945b828 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 69d58d6 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java b0f8c8b 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java e577961 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java ef0bb3d 
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 8cf261d 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 6caca98 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java d48e441 
  ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java b83cea4 

Diff: https://reviews.apache.org/r/49766/diff/


Testing
---

Tests for the feature are in 
ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java. These are mostly 
integration tests that test end-to-end insert/update/delete scenarios followed 
by compaction and cleaning.


Thanks,

Saket Saurabh



Re: Review Request 49766: HIVE-14035 Enable predicate pushdown to delta files created by ACID Transactions

2016-07-07 Thread Saket Saurabh

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49766/
---

(Updated July 7, 2016, 10:26 a.m.)


Review request for hive and Eugene Koifman.


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-14035

In current Hive version, delta files created by ACID transactions do not allow 
predicate pushdown if they contain any update/delete events. This is done to 
preserve correctness when following a multi-version approach during event 
collapsing, where an update event overwrites an existing insert event. 
This JIRA proposes to split an update event into a combination of a delete 
event followed by a new insert event, that can enable predicate push down to 
all delta files without breaking correctness. To support backward compatibility 
for this feature, this JIRA also proposes to add some sort of versioning to 
ACID that can allow different versions of ACID transactions to co-exist 
together.


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b13fc65 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
 14f7316 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java
 0c6b9ea 
  metastore/if/hive_metastore.thrift 4d92b73 
  metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.h ae14bd1 
  metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.cpp f982bf2 
  
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/hive_metastoreConstants.java
 5a666f2 
  metastore/src/gen/thrift/gen-php/metastore/Types.php f505208 
  metastore/src/gen/thrift/gen-py/hive_metastore/constants.py d1c07a5 
  metastore/src/gen/thrift/gen-rb/hive_metastore_constants.rb eeccc84 
  
metastore/src/java/org/apache/hadoop/hive/metastore/TransactionalValidationListener.java
 3e74675 
  orc/src/java/org/apache/orc/impl/TreeReaderFactory.java c4a2093 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java dff1815 
  ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 57b6c67 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 23a13d6 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 36f38f6 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 227a051 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 69d58d6 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java b0f8c8b 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java e577961 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 82abd52 
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 8cf261d 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 6caca98 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java e76c925 
  ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java 5745dee 

Diff: https://reviews.apache.org/r/49766/diff/


Testing (updated)
---

Tests for the feature are in 
ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java. These are mostly 
integration tests that test end-to-end insert/update/delete scenarios followed 
by compaction and cleaning.


Thanks,

Saket Saurabh



Re: Review Request 49766: HIVE-14035 Enable predicate pushdown to delta files created by ACID Transactions

2016-07-07 Thread Saket Saurabh

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49766/
---

(Updated July 7, 2016, 10:09 a.m.)


Review request for hive and Eugene Koifman.


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-14035

In current Hive version, delta files created by ACID transactions do not allow 
predicate pushdown if they contain any update/delete events. This is done to 
preserve correctness when following a multi-version approach during event 
collapsing, where an update event overwrites an existing insert event. 
This JIRA proposes to split an update event into a combination of a delete 
event followed by a new insert event, that can enable predicate push down to 
all delta files without breaking correctness. To support backward compatibility 
for this feature, this JIRA also proposes to add some sort of versioning to 
ACID that can allow different versions of ACID transactions to co-exist 
together.


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b13fc65 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
 14f7316 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java
 0c6b9ea 
  metastore/if/hive_metastore.thrift 4d92b73 
  metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.h ae14bd1 
  metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.cpp f982bf2 
  
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/hive_metastoreConstants.java
 5a666f2 
  metastore/src/gen/thrift/gen-php/metastore/Types.php f505208 
  metastore/src/gen/thrift/gen-py/hive_metastore/constants.py d1c07a5 
  metastore/src/gen/thrift/gen-rb/hive_metastore_constants.rb eeccc84 
  
metastore/src/java/org/apache/hadoop/hive/metastore/TransactionalValidationListener.java
 3e74675 
  orc/src/java/org/apache/orc/impl/TreeReaderFactory.java c4a2093 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java dff1815 
  ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 57b6c67 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 23a13d6 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 36f38f6 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 227a051 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 69d58d6 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java b0f8c8b 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java e577961 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 82abd52 
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 8cf261d 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 6caca98 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java e76c925 
  ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java 5745dee 

Diff: https://reviews.apache.org/r/49766/diff/


Testing
---


Thanks,

Saket Saurabh