[ 
https://issues.apache.org/jira/browse/HBASE-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976279#comment-13976279
 ] 

cuijianwei commented on HBASE-10999:
------------------------------------

[~stack], thanks for your comment. Haeinsa is an interesting project that 
implements cross-row transactions on HBase. We analyzed Haeinsa's 
implementation before deciding to implement the percolator algorithm. In my 
opinion, an important difference between percolator and Haeinsa is that 
percolator provides a global database snapshot for reads, while Haeinsa always 
returns the data of the newest committed transactions. If our analysis is 
right, a read in Haeinsa needs two phases. First, Haeinsa reads back the data 
and locks of the transaction rows, and both are cached on the client side. 
Then Haeinsa reads back the locks of the transaction rows again to check that 
they have not changed, so that it won't return data from incomplete 
transactions to users. The two-phase read might make it hard for Haeinsa to 
read a large volume of data, for two reasons: a) it is not easy to cache the 
data and locks of a large number of rows on the client side; b) when scanning 
a large range of rows, newer writes are more likely to change the locks of the 
rows being scanned, which makes the read fail more easily. Percolator, on the 
other hand, uses a global incremental timestamp to define the database 
snapshot for a read. The client returns a row to the user if no lock conflict 
is discovered, so it does not need to cache any data or locks on the client 
side.
   The Haeinsa project does not provide a global database snapshot, so it 
does not depend on a Global Incremental Timestamp Service, which makes its 
implementation more independent. However, in my opinion, the global database 
snapshot is important for transactions, as analyzed above, and we found that 
it is not difficult to implement a Global Incremental Timestamp Service. 
Consequently, we implemented the percolator algorithm for cross-row 
transactions.
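
To make the snapshot read described above concrete, here is a minimal 
in-memory sketch in Python. It is not taken from our HBase implementation: the 
names TimestampOracle, Cell, LockConflict and snapshot_get are hypothetical, 
and the data/lock/write layout only loosely follows the columns described in 
the percolator paper.

```python
import itertools
import threading


class TimestampOracle:
    """Hypothetical stand-in for the Global Incremental Timestamp Service."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._mutex = threading.Lock()

    def next_timestamp(self):
        # Hand out strictly increasing timestamps, even across threads.
        with self._mutex:
            return next(self._counter)


class LockConflict(Exception):
    """Raised when a pending transaction may affect the reader's snapshot."""


class Cell:
    """One column of one row with percolator-style records (assumed layout)."""

    def __init__(self):
        self.data = {}    # start_ts -> value written by that transaction
        self.lock = None  # start_ts of an uncommitted writer, or None
        self.write = {}   # commit_ts -> start_ts of the committed version


def snapshot_get(cell, start_ts):
    """Return the value visible in the snapshot defined by start_ts."""
    # A lock at or below start_ts may belong to a transaction that commits
    # inside this snapshot, so the reader cannot safely skip it.
    if cell.lock is not None and cell.lock <= start_ts:
        raise LockConflict(f"pending writer with start_ts={cell.lock}")
    visible = [commit_ts for commit_ts in cell.write if commit_ts <= start_ts]
    if not visible:
        return None
    # The newest commit at or below start_ts defines the snapshot value.
    return cell.data[cell.write[max(visible)]]
```

Because the reader only compares timestamps, it does not have to cache data or 
locks between two read phases; a later commit on the same cell is simply 
invisible to an older start_ts.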

> Cross-row Transaction : Implement Percolator Algorithm on HBase
> ---------------------------------------------------------------
>
>                 Key: HBASE-10999
>                 URL: https://issues.apache.org/jira/browse/HBASE-10999
>             Project: HBase
>          Issue Type: New Feature
>          Components: Transactions/MVCC
>    Affects Versions: 0.99.0
>            Reporter: cuijianwei
>            Assignee: cuijianwei
>
> Cross-row transactions are a desired feature for databases. It is not easy 
> to keep the ACID properties of cross-row transactions in distributed 
> databases such as HBase, because the data of a cross-row transaction might be 
> located on different machines. In the paper 
> http://research.google.com/pubs/pub36726.html, Google presents an algorithm 
> (named percolator) to implement cross-row transactions on BigTable. After 
> analyzing the algorithm, we found that percolator might also be a choice for 
> providing cross-row transactions on HBase. The reasons include:
> 1. Percolator can preserve the ACID properties of cross-row transactions, as 
> described in Google's paper. Percolator depends on a Global Incremental 
> Timestamp Service to define the order of transactions; this is important for 
> keeping the ACID properties of transactions.
> 2. The percolator algorithm can be implemented entirely on the client side. 
> This means we do not need to change the logic of the server side. Users can 
> easily include percolator in their client and adopt percolator APIs only when 
> they want cross-row transactions.
> 3. Percolator is a general algorithm that can be implemented on top of any 
> database providing single-row transactions. Therefore, it is feasible to 
> implement percolator on HBase.
> In the last few months, we have implemented percolator on HBase, validated 
> its correctness, tested its performance, and finally applied the algorithm 
> successfully in our production environment. Our work includes:
> 1. Percolator algorithm implementation on HBase. The current implementation 
> includes:
>     a) a Transaction module that provides put/delete/get/scan interfaces for 
> cross-row/cross-table transactions.
>     b) a Global Incremental Timestamp Server that provides globally 
> monotonically increasing timestamps for transactions.
>     c) a LockCleaner module to resolve conflicts when concurrent transactions 
> mutate the same column.
>     d) an internal module to implement the prewrite/commit/get/scan logic of 
> percolator.
>    Although the percolator logic could be implemented entirely on the client 
> side, we use the coprocessor framework of HBase in our implementation. This 
> is because coprocessors can provide percolator-specific RPC interfaces such 
> as prewrite/commit to reduce RPC round trips and improve efficiency. Another 
> reason to use coprocessors is that we want to decouple percolator's code from 
> HBase so that users get clean HBase code if they don't need cross-row 
> transactions. In the future, we will also explore the concurrent execution of 
> coprocessors to do cross-row mutations more efficiently.
> 2. An AccountTransfer simulation program to validate the correctness of the 
> implementation. This program distributes initial values across different 
> tables, rows and columns in HBase. Each column represents an account. Then a 
> configured number of client threads are started concurrently to read a number 
> of account values from different tables and rows with percolator's get; after 
> this, the clients randomly transfer values among these accounts while 
> keeping the sum unchanged, which simulates concurrent cross-table/cross-row 
> transactions. To check the correctness of the transactions, a checker thread 
> periodically scans the account values from all columns and makes sure the 
> current total is the same as the initial total. We ran this validation 
> program during development, which helped us correct implementation errors.
> 3. Performance evaluation under various test conditions. We compared 
> percolator's APIs with HBase's at different data sizes and client thread 
> counts for single-column transactions, which represent the worst performance 
> case for percolator. The comparison results are:
>     a) For reads, the performance of percolator is 90% of HBase's;
>     b) For writes, the performance of percolator is 23% of HBase's.
> The drop derives from the overhead of the percolator logic; the test result 
> is similar to the result reported in Google's paper.
> 4. Performance improvement. Compared with HBase, the write performance of 
> percolator drops much more than the read performance. This is because 
> percolator's write needs to read data to check for write conflicts and needs 
> two RPCs, which do the prewrite and the commit respectively. We are 
> investigating ways to improve the write performance.
> We are glad to share the current percolator implementation and hope it can 
> provide a choice for users who want cross-row transactions, because it does 
> not require changing the code and logic of the original HBase. Comments and 
> discussion are welcome.
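
As a rough illustration of the prewrite/commit flow and the AccountTransfer 
check described in the quoted issue, here is a single-process Python sketch. 
It is only a toy model under stated assumptions, not the code we run on HBase: 
Store, Conflict, seed, transfer and total are hypothetical names, the 
timestamp service is replaced by a local counter, and lock cleaning and 
rollback are left out.

```python
class Conflict(Exception):
    """Raised on a lock conflict or a write-write conflict."""


class Store:
    """Toy in-memory table with percolator-style data/lock/write records."""

    def __init__(self):
        self.data = {}   # (key, start_ts) -> value
        self.lock = {}   # key -> (start_ts, primary_key)
        self.write = {}  # (key, commit_ts) -> start_ts
        self.clock = 0   # local counter standing in for the timestamp service

    def next_ts(self):
        self.clock += 1
        return self.clock

    def get(self, key, start_ts):
        # Snapshot read: fail on a pending lock at or below start_ts.
        if key in self.lock and self.lock[key][0] <= start_ts:
            raise Conflict(f"{key} is locked")
        commits = [c for (k, c) in self.write if k == key and c <= start_ts]
        if not commits:
            return None
        return self.data[(key, self.write[(key, max(commits))])]

    def prewrite(self, key, value, start_ts, primary):
        # Phase 1: reject if another writer holds a lock or has committed
        # at or after our start_ts; otherwise stage the data and take the lock.
        if key in self.lock:
            raise Conflict(f"{key} is locked")
        if any(k == key and c >= start_ts for (k, c) in self.write):
            raise Conflict(f"{key} has a newer commit")
        self.data[(key, start_ts)] = value
        self.lock[key] = (start_ts, primary)

    def commit(self, keys, start_ts):
        # Phase 2: publish write records (primary key first) and drop locks.
        commit_ts = self.next_ts()
        for key in keys:
            held = self.lock.get(key)
            if held is None or held[0] != start_ts:
                raise Conflict(f"lock on {key} was lost")
            self.write[(key, commit_ts)] = start_ts
            del self.lock[key]
        return commit_ts


def seed(store, key, value):
    """Write an initial account balance in its own transaction."""
    start_ts = store.next_ts()
    store.prewrite(key, value, start_ts, primary=key)
    store.commit([key], start_ts)


def transfer(store, src, dst, amount):
    """Move amount from src to dst in one cross-row transaction."""
    start_ts = store.next_ts()
    new_src = store.get(src, start_ts) - amount
    new_dst = store.get(dst, start_ts) + amount
    store.prewrite(src, new_src, start_ts, primary=src)
    store.prewrite(dst, new_dst, start_ts, primary=src)
    store.commit([src, dst], start_ts)


def total(store, keys):
    """Checker: sum all accounts within a single snapshot."""
    snapshot_ts = store.next_ts()
    return sum(store.get(key, snapshot_ts) for key in keys)
```

The checker reads every account at one snapshot timestamp, so the sum should 
stay equal to the initial total no matter how many transfers have committed. 
The two calls in transfer after the reads correspond to the two RPC rounds 
mentioned in the write-performance discussion.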



--
This message was sent by Atlassian JIRA
(v6.2#6252)
