Peter Vary created HIVE-21506:
---------------------------------
Summary: Memory based TxnHandler implementation
Key: HIVE-21506
URL: https://issues.apache.org/jira/browse/HIVE-21506
Project: Hive
Issue Type: New Feature
Components: Transactions
Reporter: Peter Vary
The current TxnHandler implementations use the backend RDBMS to store all
Hive lock and transaction data, so multiple TxnHandler instances can run
simultaneously and serve requests. The continuous communication and locking
done on the RDBMS side puts serious load on the backend database and
restricts the achievable throughput.
If it is possible to have only a single active TxnHandler instance (with the
current design, an HMS instance), then we can provide much better performance
by using only Java-based locking. We still have to store the committed write
transactions in the RDBMS (or later in some other persistent storage), but
other lock and transaction operations could remain memory only.
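
To make the split between memory-only state and persisted state concrete,
here is a minimal sketch, assuming a single active instance. It is not the
existing TxnHandler API: the class InMemoryTxnSketch, the CommitLog interface
and all method names are hypothetical, and only exclusive table-level locks
are modeled. Open transactions and locks live in Java maps; the only call
that leaves the process is the one recording a committed write transaction.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of a memory-based transaction handler; not Hive code. */
class InMemoryTxnSketch {

    /** Assumed stand-in for the RDBMS (or other persistent storage). */
    interface CommitLog {
        void persistCommit(long txnId, Set<String> writtenTables);
    }

    private final AtomicLong nextTxnId = new AtomicLong(1);
    // Open transactions and the tables each one has locked (memory only).
    private final Map<Long, Set<String>> openTxns = new ConcurrentHashMap<>();
    // Table name -> id of the transaction holding the exclusive lock (memory only).
    private final Map<String, Long> exclusiveLocks = new ConcurrentHashMap<>();
    private final CommitLog commitLog;

    InMemoryTxnSketch(CommitLog commitLog) {
        this.commitLog = commitLog;
    }

    long openTxn() {
        long id = nextTxnId.getAndIncrement();
        openTxns.put(id, ConcurrentHashMap.newKeySet());
        return id;
    }

    /** Returns false if another open transaction already holds the lock. */
    boolean tryLock(long txnId, String table) {
        Set<String> held = openTxns.get(txnId);
        if (held == null) {
            throw new IllegalStateException("Transaction not open: " + txnId);
        }
        Long owner = exclusiveLocks.putIfAbsent(table, txnId);
        if (owner != null && owner != txnId) {
            return false;
        }
        held.add(table);
        return true;
    }

    /** Persists the commit only if the transaction locked (wrote) something. */
    void commit(long txnId) {
        Set<String> held = openTxns.get(txnId);
        if (held == null) {
            throw new IllegalStateException("Transaction not open: " + txnId);
        }
        if (!held.isEmpty()) {
            // The only step that touches persistent storage.
            commitLog.persistCommit(txnId, new HashSet<>(held));
        }
        openTxns.remove(txnId);
        release(txnId, held);
    }

    /** Aborts leave no trace in persistent storage. */
    void abort(long txnId) {
        Set<String> held = openTxns.remove(txnId);
        if (held != null) {
            release(txnId, held);
        }
    }

    private void release(long txnId, Set<String> held) {
        for (String table : held) {
            // Remove the lock only if this transaction still owns it.
            exclusiveLocks.remove(table, txnId);
        }
    }
}
```

In this sketch a restart of the process simply empties both maps, so every
open transaction is lost while the committed ones survive in the CommitLog.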
The most important drawbacks of this solution are that we definitely lose
scalability once a single TxnHandler instance is no longer able to serve the
requests (see NameNode), and that we lose fault tolerance in the sense that
ongoing transactions have to be terminated when the TxnHandler fails. If
these drawbacks are acceptable in certain situations, then we can provide
better throughput for the users.