[ https://issues.apache.org/jira/browse/HBASE-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010663#comment-14010663 ]
Jeffrey Zhong commented on HBASE-10999:
---------------------------------------

[~cuijianwei] It seems the WorkerRegister hasn't been fully implemented. The current code is based on 0.94, while the HBase CP (coprocessor) framework has had incompatible changes since then. What's your plan for incorporating Themis into HBase: move the code into the HBase code base, or keep Themis as a third-party library outside of HBase? If Themis stays outside, I can imagine that transactions will become one of the core functionalities, and it will be hard to fix issues when they arise and hard to coordinate release cycles.

[~saint....@gmail.com] What are your thoughts on this?

Thanks.

> Cross-row Transaction : Implement Percolator Algorithm on HBase
> ---------------------------------------------------------------
>
>                 Key: HBASE-10999
>                 URL: https://issues.apache.org/jira/browse/HBASE-10999
>             Project: HBase
>          Issue Type: New Feature
>          Components: Transactions/MVCC
>    Affects Versions: 0.99.0
>            Reporter: cuijianwei
>            Assignee: cuijianwei
>
> Cross-row transactions are a desirable feature for a database. It is not easy
> to preserve the ACID properties of cross-row transactions in distributed
> databases such as HBase, because the data of a cross-row transaction might be
> located on different machines. In the paper
> http://research.google.com/pubs/pub36726.html, Google presents an algorithm
> (named Percolator) to implement cross-row transactions on BigTable. After
> analyzing the algorithm, we found that Percolator might also be a choice for
> providing cross-row transactions on HBase. The reasons include:
> 1. Percolator preserves the ACID properties of cross-row transactions, as
> described in Google's paper. Percolator depends on a Global Incremental
> Timestamp Service to define the order of transactions; this is important for
> keeping transactions ACID.
> 2. The Percolator algorithm can be implemented entirely on the client side.
> This means we do not need to change the server-side logic.
> Users can easily include Percolator in their client and adopt the Percolator
> APIs only when they want cross-row transactions.
> 3. Percolator is a general algorithm that can be implemented on top of any
> database providing single-row transactions. Therefore, it is feasible to
> implement Percolator on HBase.
> In the last few months, we have implemented Percolator on HBase, done
> correctness validation and performance testing, and finally applied this
> algorithm successfully in our production environment. Our work includes:
> 1. Percolator algorithm implementation on HBase. The current implementation
> includes:
>     a). a Transaction module that provides put/delete/get/scan interfaces for
> cross-row/cross-table transactions.
>     b). a Global Incremental Timestamp Server that provides globally
> monotonically increasing timestamps for transactions.
>     c). a LockCleaner module that resolves conflicts when concurrent
> transactions mutate the same column.
>     d). an internal module that implements the prewrite/commit/get/scan logic
> of Percolator.
> Although the Percolator logic could be implemented entirely on the client
> side, we use the coprocessor framework of HBase in our implementation. This is
> because a coprocessor can provide Percolator-specific RPC interfaces such as
> prewrite/commit to reduce RPC rounds and improve efficiency. Another reason
> to use coprocessors is that we want to decouple Percolator's code from HBase
> so that users get clean HBase code if they don't need cross-row
> transactions. In the future, we will also explore the concurrent execution
> characteristics of coprocessors to do cross-row mutations more efficiently.
> 2. an AccountTransfer simulation program to validate the correctness of the
> implementation. This program distributes initial values across different
> tables, rows and columns in HBase. Each column represents an account.
> Then, the configured client threads are started concurrently to read a number
> of account values from different tables and rows with Percolator's get; after
> this, the clients randomly transfer values among these accounts while
> keeping the sum unchanged, which simulates concurrent cross-table/cross-row
> transactions. To check the correctness of transactions, a checker thread
> periodically scans the account values from all columns and makes sure the
> current total value is the same as the initial total value. We ran this
> validation program during development, which helped us correct implementation
> errors.
> 3. performance evaluation under various test situations. We compared
> Percolator's APIs with HBase's, using different data sizes and client thread
> counts, for single-column transactions, which represent the worst performance
> case for Percolator. The performance comparison results are:
>     a) For reads, the performance of Percolator is 90% of HBase's;
>     b) For writes, the performance of Percolator is 23% of HBase's.
> The drop comes from the overhead of the Percolator logic; the performance test
> result is similar to the result reported in Google's paper.
> 4. Performance improvement. Percolator's write performance drops much more
> relative to HBase than its read performance does. This is because a Percolator
> write needs to read data to check for write conflicts and needs two RPCs,
> which do prewriting and committing respectively. We are investigating ways to
> improve the write performance.
> We are glad to share the current Percolator implementation and hope it can
> provide a choice for users who want cross-row transactions, because it does
> not require changing the code and logic of the original HBase. Comments and
> discussion are welcome.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
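
For readers unfamiliar with the prewrite/commit flow and the AccountTransfer check described in the quoted issue, the following is a minimal in-memory sketch, not the Themis or HBase code. All names (ToyStore, TimestampOracle, Conflict, transfer, total, run_simulation) are hypothetical. It assumes a single process with no real concurrency, and it leaves out the primary lock, the LockCleaner, and the coprocessor RPCs, so it only illustrates timestamp ordering, lock and write-conflict checks, and the invariant that the sum of all accounts stays constant.

```python
import itertools
import random


class TimestampOracle:
    """Stand-in for the Global Incremental Timestamp Server (assumption)."""

    def __init__(self):
        self._counter = itertools.count(1)

    def next(self):
        return next(self._counter)  # strictly increasing integers


class Conflict(Exception):
    """Raised when a lock or a newer committed write blocks an operation."""


class ToyStore:
    """In-memory toy with data/lock/write maps; not an HBase API."""

    def __init__(self, oracle):
        self.oracle = oracle
        self.data = {}   # (key, start_ts) -> value
        self.lock = {}   # key -> start_ts of the transaction holding the lock
        self.write = {}  # key -> [(commit_ts, start_ts), ...] in ascending order

    def get(self, key, ts):
        # A lock from an earlier transaction means its outcome is unknown.
        if key in self.lock and self.lock[key] <= ts:
            raise Conflict(f"{key} is locked")
        for commit_ts, start_ts in reversed(self.write.get(key, [])):
            if commit_ts <= ts:
                return self.data[(key, start_ts)]
        return None

    def prewrite(self, key, value, start_ts):
        if key in self.lock:
            raise Conflict(f"{key} is locked")
        writes = self.write.get(key, [])
        if writes and writes[-1][0] >= start_ts:
            raise Conflict(f"{key} was committed after ts {start_ts}")
        self.lock[key] = start_ts
        self.data[(key, start_ts)] = value

    def commit(self, key, start_ts, commit_ts):
        if self.lock.get(key) != start_ts:
            raise Conflict(f"lock on {key} was lost")
        self.write.setdefault(key, []).append((commit_ts, start_ts))
        del self.lock[key]

    def rollback(self, key, start_ts):
        if self.lock.get(key) == start_ts:
            del self.lock[key]
            self.data.pop((key, start_ts), None)


def transfer(store, src, dst, amount):
    """Move `amount` from src to dst; return False if the transaction aborts."""
    start_ts = store.oracle.next()
    locked = []
    try:
        new_values = {
            src: store.get(src, start_ts) - amount,
            dst: store.get(dst, start_ts) + amount,
        }
        for key, value in new_values.items():
            store.prewrite(key, value, start_ts)
            locked.append(key)
    except Conflict:
        for key in locked:
            store.rollback(key, start_ts)
        return False
    commit_ts = store.oracle.next()
    for key in locked:
        store.commit(key, start_ts, commit_ts)
    return True


def total(store, accounts):
    """Checker: read every account at one snapshot timestamp and sum them."""
    ts = store.oracle.next()
    return sum(store.get(account, ts) for account in accounts)


def run_simulation(num_accounts=5, initial=100, rounds=200, seed=0):
    rng = random.Random(seed)
    store = ToyStore(TimestampOracle())
    accounts = [f"acct{i}" for i in range(num_accounts)]
    start_ts = store.oracle.next()
    for account in accounts:
        store.prewrite(account, initial, start_ts)
    commit_ts = store.oracle.next()
    for account in accounts:
        store.commit(account, start_ts, commit_ts)
    for _ in range(rounds):
        src, dst = rng.sample(accounts, 2)
        transfer(store, src, dst, rng.randint(1, 10))
        assert total(store, accounts) == num_accounts * initial
    return store, accounts
```

Calling run_simulation() performs the random transfers and runs the checker after each one. A second transaction that tries to prewrite a column still locked by another transaction gets a Conflict, which is the situation the LockCleaner module in the issue is meant to resolve in the real system.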