-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72481/
-----------------------------------------------------------

Review request for hive, Denys Kuzmenko and Peter Vary.


Repository: hive-git


Description
-------

Removed the global mutex on write id allocation, so write ids can now be 
allocated concurrently for different tables without blocking each other, 
which speeds up execution (perf test results below). Concurrent 
allocateTableWriteIds() calls targeting the same table are still serialized 
by a select-for-update (S4U) if the table is already present in 
next_write_id. Otherwise, the race to insert the table into next_write_id is 
resolved by retrying after catching the duplicate key exception (the thread 
that commits later is the one that retries).
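
The lock-or-insert-and-retry flow described above can be sketched as follows. 
This is a minimal in-memory illustration, not the code in TxnHandler: the 
class WriteIdSketch, the DuplicateKeyException type, the map standing in for 
next_write_id, and the assumption that ids start at 1 are all hypothetical. A 
per-row monitor plays the role of the S4U, and putIfAbsent plays the role of 
an INSERT that can fail on a duplicate key.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical in-memory stand-in for the next_write_id table. */
class WriteIdSketch {
    /** Hypothetical stand-in for the database's duplicate key error. */
    static class DuplicateKeyException extends Exception {
        DuplicateKeyException(String msg) { super(msg); }
    }

    // table name -> one-element array holding the next id; the array is also the row lock
    private final ConcurrentHashMap<String, long[]> nextWriteId = new ConcurrentHashMap<>();

    /** Allocates count consecutive ids for table and returns the first one. */
    long allocate(String table, int count) {
        while (true) {
            long[] row = nextWriteId.get(table);
            if (row != null) {
                // Row exists: lock only this row, so other tables are not blocked.
                synchronized (row) {
                    long first = row[0];
                    row[0] += count;
                    return first;
                }
            }
            try {
                // Row missing: insert it with the first batch already consumed
                // (assumes ids start at 1).
                insertRow(table, 1L + count);
                return 1L;
            } catch (DuplicateKeyException e) {
                // Lost the insert race: loop again and take the locking path.
            }
        }
    }

    private void insertRow(String table, long next) throws DuplicateKeyException {
        if (nextWriteId.putIfAbsent(table, new long[] {next}) != null) {
            throw new DuplicateKeyException("duplicate key: " + table);
        }
    }

    public static void main(String[] args) throws Exception {
        WriteIdSketch store = new WriteIdSketch();
        ExecutorService pool = Executors.newFixedThreadPool(8);
        Callable<Long> task = () -> store.allocate("db.tbl", 1);
        List<Future<Long>> results = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            results.add(pool.submit(task));
        }
        Set<Long> ids = new HashSet<>();
        for (Future<Long> f : results) {
            ids.add(f.get());
        }
        pool.shutdown();
        System.out.println("distinct ids allocated: " + ids.size());
    }
}
```

In this sketch, the loser of the insert race does no extra work beyond one 
more loop iteration, which mirrors the point that only the later committer 
retries.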

The situation is similar when allocateTableWriteIds() and 
replTableWriteIdState() run concurrently: if they target different tables, 
they no longer block each other. If they target the same table and the table 
is already in next_write_id, replTableWriteIdState() returns early and 
allocateTableWriteIds() updates the next id. If the table is not yet in 
next_write_id, they might attempt to insert the same row concurrently, in 
which case whichever commits later gets a duplicate key exception and retries 
the operation, just as above.


Diffs
-----

  ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 868da0c7a0 
  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java d59f863b11 
  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java cf41ef8aaf 
  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java 1e177f4a7b 


Diff: https://reviews.apache.org/r/72481/diff/1/


Testing
-------

Unit test in TestTxnHandler
+ Perf tests:
dbType    sameTable variant  ms/op  error
MYSQL     FALSE     original 46.93  3.041
MYSQL     FALSE     patched  19.283 1.311
MYSQL     TRUE      original 50.185 3.595
MYSQL     TRUE      patched  32.254 2.164
ORACLE    FALSE     original 57.609 4.461
ORACLE    FALSE     patched  25.721 2.551
ORACLE    TRUE      original 59.668 3.172
ORACLE    TRUE      patched  39.061 2.548
POSTGRES  FALSE     original 39.364 2.94 
POSTGRES  FALSE     patched  18.518 1.038
POSTGRES  TRUE      original 39.868 2.679
POSTGRES  TRUE      patched  28.874 1.768
SQLSERVER FALSE     original 45.252 1.643
SQLSERVER FALSE     patched  24.583 1.529
SQLSERVER TRUE      original 49.149 3.45 
SQLSERVER TRUE      patched  32.918 1.654
(sameTable=TRUE means that all threads were trying to allocate ids for the 
same db.table; FALSE means they all targeted different tables)


Thanks,

Marton Bod
