-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72481/
-----------------------------------------------------------

Review request for hive, Denys Kuzmenko and Peter Vary.


Repository: hive-git


Description
-------

Removed the global mutex on writeId allocation. Write ids can now be allocated concurrently for different tables without blocking each other, which speeds up execution (perf test results below).

Concurrent allocateTableWriteIds() operations targeting the same table are still mutexed by an S4U (SELECT ... FOR UPDATE) if the table is already present in next_write_id. Otherwise, the race to insert the table into next_write_id is resolved by retrying after catching the duplicate key exception (the thread that commits later is the one that retries).

The situation is similar when allocateTableWriteIds() and replTableWriteIdState() run concurrently:
- If they target different tables, they no longer block each other.
- If they target the same table and the table is already present in next_write_id, replTableWriteIdState() returns early and allocateTableWriteIds() updates the next id.
- If the table is not yet in next_write_id, they might attempt to insert the same row concurrently, in which case whichever commits later gets a duplicate key exception and retries the operation, just as above.

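The retry-on-duplicate-key scheme described above can be illustrated with a minimal sketch. This is not the code in the patch: an in-memory map stands in for next_write_id, a synchronized block stands in for the S4U row lock, and the class name WriteIdAllocatorSketch, the DuplicateKeyException type, the insertRow() helper, and the starting id of 1 are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only: an in-memory map stands in for next_write_id.
class WriteIdAllocatorSketch {

    /** Hypothetical stand-in for the database's duplicate key error. */
    static class DuplicateKeyException extends RuntimeException {
        DuplicateKeyException(String table) {
            super("row already exists for " + table);
        }
    }

    // table name -> one-element array holding the next id to hand out
    private final Map<String, long[]> nextWriteId = new ConcurrentHashMap<>();

    // Simulates the INSERT: fails if another thread already created the row,
    // the way a unique constraint would.
    private void insertRow(String table, long nextId) {
        if (nextWriteId.putIfAbsent(table, new long[] {nextId}) != null) {
            throw new DuplicateKeyException(table);
        }
    }

    /** Returns the first id of a contiguous block of 'count' ids. */
    public long allocateTableWriteIds(String table, int count) {
        while (true) {
            long[] row = nextWriteId.get(table);
            if (row != null) {
                // Row exists: lock only this row (stand-in for S4U), so
                // allocations for other tables are not blocked.
                synchronized (row) {
                    long first = row[0];
                    row[0] = first + count;
                    return first;
                }
            }
            try {
                // Row missing: try to create it (assumed first id is 1).
                insertRow(table, 1L + count);
                return 1L;
            } catch (DuplicateKeyException e) {
                // Lost the insert race: retry, the row is now present.
            }
        }
    }

    public static void main(String[] args) {
        WriteIdAllocatorSketch store = new WriteIdAllocatorSketch();
        System.out.println(store.allocateTableWriteIds("db.t1", 3));
        System.out.println(store.allocateTableWriteIds("db.t1", 2));
        System.out.println(store.allocateTableWriteIds("db.t2", 1));
    }
}
```

In this sketch only the loser of the insert race loops, and it then takes the row-lock path, which mirrors the "thread that commits later retries" behaviour in the description. A real implementation would have to roll back the failed transaction before retrying, which the in-memory model does not show.
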
Diffs
-----

  ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 868da0c7a0
  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java d59f863b11
  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java cf41ef8aaf
  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java 1e177f4a7b


Diff: https://reviews.apache.org/r/72481/diff/1/


Testing
-------

Unit test in TestTxnHandler + Perf tests:

dbType     sameTable  variant   ms/op   error
MYSQL      FALSE      original  46.93   3.041
MYSQL      FALSE      patched   19.283  1.311
MYSQL      TRUE       original  50.185  3.595
MYSQL      TRUE       patched   32.254  2.164
ORACLE     FALSE      original  57.609  4.461
ORACLE     FALSE      patched   25.721  2.551
ORACLE     TRUE       original  59.668  3.172
ORACLE     TRUE       patched   39.061  2.548
POSTGRES   FALSE      original  39.364  2.94
POSTGRES   FALSE      patched   18.518  1.038
POSTGRES   TRUE       original  39.868  2.679
POSTGRES   TRUE       patched   28.874  1.768
SQLSERVER  FALSE      original  45.252  1.643
SQLSERVER  FALSE      patched   24.583  1.529
SQLSERVER  TRUE       original  49.149  3.45
SQLSERVER  TRUE       patched   32.918  1.654

(sameTable=TRUE means that all threads were trying to allocate ids for the same db.table; FALSE means they all targeted different tables.)


Thanks,

Marton Bod