Re: Themis : implements cross-row/cross-table transaction on HBase.
Thanks for updating the list with the nice Themis updates, Jianwei. I added Themis to the powered by list (along with the other missing transaction managers, Tephra and Haeinsa).

St.Ack

On Mon, Nov 10, 2014 at 12:50 AM, 崔建伟 cuijian...@xiaomi.com wrote:

Hi everyone:

In the last few months, we have updated Themis to achieve better performance and include more features:

1. Improved single-row write performance, from 23% to 60% of the performance of HBase's put (for most test cases). For a single-row write transaction, we write only the lock to the MemStore in the prewrite phase; in the commit phase, we erase the corresponding lock and write the data and commit information to the HLog. This doesn't break the correctness of the Percolator algorithm and improves performance a lot for single-row writes.

2. Support for HBase 0.98. We created a branch, https://github.com/XiaoMi/themis/tree/for_hbase_0.98, to make Themis support HBase 0.98 (currently HBase 0.98.5). All the functions of the master branch will also be implemented in this branch.

3. Transaction TTL support and old data cleaning. Users can set a TTL for read and write transactions respectively. Old data that can no longer be read is then cleaned up periodically.

4. MapReduce support. We implemented a group of classes to support reading data through Themis transactions in a Mapper job and writing data through Themis transactions in a Reduce job.

For more details, please see GitHub: https://github.com/XiaoMi/themis (or https://github.com/XiaoMi/themis/tree/for_hbase_0.98) or JIRA: https://issues.apache.org/jira/browse/HBASE-10999.

If you find Themis interesting, please leave us a comment on the mailing list, JIRA or GitHub.

Best
cuijianwei
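For readers unfamiliar with the prewrite/commit sequence mentioned in item 1 of the update, here is a minimal in-memory sketch of a Percolator-style two-phase write on a single cell. It is not Themis code: PercolatorCell and its methods are hypothetical names, the MemStore/HLog distinction is not modeled, and lock handling on the read path is left out.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory stand-in for one HBase cell; not the Themis implementation. */
class PercolatorCell {
    private final TreeMap<Long, String> data = new TreeMap<>(); // startTs -> staged value
    private final TreeMap<Long, Long> writes = new TreeMap<>(); // commitTs -> startTs
    private Long lockStartTs = null;                            // startTs of the lock holder, if any

    /** Phase 1: abort on a write-write conflict, otherwise take the lock and stage the value. */
    synchronized boolean prewrite(long startTs, String value) {
        // A commit at or after our start timestamp means another writer got there first.
        boolean newerCommit = !writes.tailMap(startTs, true).isEmpty();
        if (lockStartTs != null || newerCommit) {
            return false;
        }
        lockStartTs = startTs;
        data.put(startTs, value);
        return true;
    }

    /** Phase 2: publish the commit record and erase the lock. */
    synchronized boolean commit(long startTs, long commitTs) {
        if (lockStartTs == null || lockStartTs != startTs) {
            return false; // the lock is gone or belongs to another transaction
        }
        writes.put(commitTs, startTs);
        lockStartTs = null;
        return true;
    }

    /** Snapshot read: the newest value committed at or before readTs, or null. */
    synchronized String read(long readTs) {
        Map.Entry<Long, Long> latest = writes.floorEntry(readTs);
        return latest == null ? null : data.get(latest.getValue());
    }
}
```

A value staged by prewrite stays invisible to read until commit records it, which is why a reader at an older timestamp still sees the previous version.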
From: saint@gmail.com saint@gmail.com on behalf of Stack st...@duboce.net
Sent: Sunday, July 13, 2014 1:12 PM
To: HBase Dev List
Cc: user@hbase.apache.org
Subject: Re: Themis : implements cross-row/cross-table transaction on HBase.

On Tue, Jul 8, 2014 at 12:34 AM, 崔建伟 cuijian...@xiaomi.com wrote:

Hi everyone,

I want to introduce our open-source project Themis, which implements cross-row/cross-table transactions on HBase. Themis follows Google's Percolator algorithm (http://research.google.com/pubs/pub36726.html), which provides ACID-compliant transactions and snapshot isolation. The cross-row transaction is built on HBase's single-row atomic semantics and doesn't use a central transaction server, so it supports linear scalability.

Themis depends on a timestamp server to provide global, strictly increasing timestamps that define the order of transactions; these are used to resolve write-write and read-write conflicts. The timestamp server is lightweight and can achieve high throughput (500,000+ qps), and Themis batches timestamp requests across transactions into one RPC, so the timestamp server won't become the bottleneck of the system even when processing billions of transactions every day.

Although Themis could be implemented entirely on the client side, we adopt HBase's coprocessor framework to achieve higher performance. Themis includes a client-side library that provides transaction APIs, such as themisPut/themisGet/themisScan/themisDelete, and a coprocessor library loaded on the regionserver. Therefore, Themis can be used without changing the code and logic of HBase.

We have been validating the correctness of Themis for a few months with an AccountTransfer simulation program, which concurrently runs cross-row transactions that transfer money among different accounts (each account is a row in HBase) and verifies that the total money of all accounts doesn't change during the simulation. We have also run Themis in our production environment.

We tested the performance of Themis and got results comparable to Percolator's.
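The idea behind the AccountTransfer check described above can be illustrated with a small self-contained simulation. This is only a sketch under stated assumptions: AccountTransferCheck is a hypothetical class, and a synchronized method stands in for a Themis cross-row transaction, so it shows the invariant being tested rather than how Themis enforces it.

```java
import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

/** Hypothetical simulation of the invariant: transfers must never change the total. */
class AccountTransferCheck {
    private final long[] balances; // one entry per account ("row")

    AccountTransferCheck(int accounts, long initialBalance) {
        balances = new long[accounts];
        Arrays.fill(balances, initialBalance);
    }

    /** Stand-in for one cross-row transaction: both accounts change together or not at all. */
    synchronized boolean transfer(int from, int to, long amount) {
        if (from == to || balances[from] < amount) {
            return false;
        }
        balances[from] -= amount;
        balances[to] += amount;
        return true;
    }

    synchronized long total() {
        long sum = 0;
        for (long balance : balances) {
            sum += balance;
        }
        return sum;
    }

    /** Runs concurrent random transfers and returns the total afterwards. */
    static long run(int accounts, long initialBalance, int threads, int transfersPerThread) {
        AccountTransferCheck bank = new AccountTransferCheck(accounts, initialBalance);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                ThreadLocalRandom rnd = ThreadLocalRandom.current();
                for (int i = 0; i < transfersPerThread; i++) {
                    bank.transfer(rnd.nextInt(accounts), rnd.nextInt(accounts), rnd.nextLong(1, 50));
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return bank.total();
    }
}
```

If transfer were not atomic across the two accounts, concurrent runs could lose or create money and the returned total would drift from accounts * initialBalance.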
The single-column transaction represents the worst case for Themis compared with HBase. The results are:

1) For read, the performance of Themis's Percolator-style transactions is 90% of HBase;
2) For write, the performance of Themis's Percolator-style transactions is 23% of HBase.

Write performance drops a lot because Themis uses a two-phase commit protocol to achieve ACID transactions. For multi-row writes, we improve performance by running all writes of the prewrite phase in parallel. For single-row writes, we are optimizing the two-phase commit protocol to achieve better performance and will update the results when it is ready. Detailed performance results can be found on GitHub.

The repositories and introduction for Themis:

1. Themis GitHub: https://github.com/XiaoMi/themis/. The source code, performance test results and user guide can be found here.
2. Themis JIRA: https://issues.apache.org/jira/browse/HBASE-10999
3. Chronos GitHub: https://github.com/XiaoMi/chronos. Chronos is our open-source, high-availability, high-performance timestamp server that provides global, strictly increasing timestamps for Themis.

If you find Themis interesting, please leave us a comment on the mailing list, JIRA or GitHub.

Best
cuijianwei
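The announcement mentions that timestamp requests from several transactions are batched into one RPC to the timestamp server. A rough sketch of that idea follows, assuming a server that hands out consecutive ranges. TimestampServer and BatchingTimestampClient are hypothetical names and do not reflect the actual Chronos or Themis APIs.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a timestamp server such as Chronos; not its real API. */
class TimestampServer {
    private final AtomicLong next = new AtomicLong(1);
    private final AtomicLong rpcCount = new AtomicLong();

    /** One simulated RPC: reserves count consecutive timestamps and returns the first. */
    long allocate(int count) {
        if (count <= 0) {
            throw new IllegalArgumentException("count must be positive");
        }
        rpcCount.incrementAndGet();
        return next.getAndAdd(count);
    }

    long rpcCount() {
        return rpcCount.get();
    }
}

/** Client-side batching: a single allocate() call serves several waiting transactions. */
class BatchingTimestampClient {
    private final TimestampServer server;

    BatchingTimestampClient(TimestampServer server) {
        this.server = server;
    }

    /** Returns one strictly increasing timestamp per waiting transaction. */
    long[] timestampsFor(int waitingTransactions) {
        long first = server.allocate(waitingTransactions);
        long[] result = new long[waitingTransactions];
        for (int i = 0; i < waitingTransactions; i++) {
            result[i] = first + i;
        }
        return result;
    }
}
```

Because each call reserves a whole range atomically, timestamps stay strictly increasing across batches while the number of round trips grows with the number of batches, not the number of transactions.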
RE: Themis : implements cross-row/cross-table transaction on HBase.
Hi Stack:

Thanks for your concern and work :). I will continue to improve Themis and report new progress.

Best
jianwei

From: saint@gmail.com saint@gmail.com on behalf of Stack st...@duboce.net
Sent: Wednesday, November 12, 2014 12:15 AM
To: HBase Dev List
Cc: user@hbase.apache.org
Subject: Re: Themis : implements cross-row/cross-table transaction on HBase.

Thanks for updating the list with the nice Themis updates Jianwei. I added Themis to the powered by list (and the other missing transactions managers, Tephra and Haeinsa).

St.Ack
RE: Themis : implements cross-row/cross-table transaction on HBase.
Hi Ted:

Thanks for your feedback. I tried to clone the for_hbase_0.98 branch with:

  git clone https://github.com/XiaoMi/themis/tree/for_hbase_0.98

and it returns:

  error: The requested URL returned error: 403 while accessing https://github.com/XiaoMi/themis/tree/for_hbase_0.98/info/refs

I think we can first clone Themis with:

  git clone https://github.com/XiaoMi/themis

Then view the remote for_hbase_0.98 branch with:

  git branch -r

which prints:

  origin/HEAD -> origin/master
  origin/for_hbase_0.98
  origin/master

And then check out the remote origin/for_hbase_0.98 branch locally:

  git checkout -b for_hbase_0.98 origin/for_hbase_0.98

Thanks.

Best
cuijianwei

From: Ted Yu yuzhih...@gmail.com
Sent: Monday, November 10, 2014 11:39 PM
To: d...@hbase.apache.org
Cc: user@hbase.apache.org
Subject: Re: Themis : implements cross-row/cross-table transaction on HBase.

Jianwei:

I used this command to clone your repo:

  git clone https://github.com/XiaoMi/themis/tree/for_hbase_0.98 themis

But I only found 0.94 being referenced in the pom.xml files:

  $ find . -name pom.xml -exec grep '0.94.' {} \; -print
  <hbase.version>0.94.21</hbase.version>
  ./themis-client/pom.xml
  <hbase.version>0.94.21</hbase.version>
  ./themis-coprocessor/pom.xml
  <hbase.version>0.94.21</hbase.version>
  ./themis-index/pom.xml

Did I miss something?

Cheers
Re: Themis : implements cross-row/cross-table transaction on HBase.
Thanks Jianwei for the suggestion.

I checked out for_hbase_0.98 but got the following error when building:

  [ERROR] Failed to execute goal on project themis-protocol: Could not resolve dependencies for project com.xiaomi.infra:themis-protocol:jar:1.0-SNAPSHOT: Could not find artifact org.apache.hbase:hbase-protocol:jar:0.98.5 in central (http://repo.maven.apache.org/maven2) -> [Help 1]

I think this is because 0.98 is released with two profiles: one for hadoop-1 and one for hadoop-2.

Currently each module of Themis defines the HBase version as:

  $ find . -name 'pom.xml' -exec grep '0.98' {} \; -print
  <hbase.version>0.98.5</hbase.version>
  ./themis-client/pom.xml
  <hbase.version>0.98.5</hbase.version>
  ./themis-coprocessor/pom.xml
  <hbase.version>0.98.5</hbase.version>
  ./themis-index/pom.xml
  <hbase.version>0.98.5</hbase.version>
  ./themis-protocol/pom.xml

I tried adding -Dhbase.version=0.98.7-hadoop2 to the Maven command line but still got the same error.

You can add a property to the parent pom.xml that defines the HBase version so that the child pom.xml files can reference it. That way, it is easy to override on the command line.

Cheers
Re: Themis : implements cross-row/cross-table transaction on HBase.
bq. do we have hadoop2 version for hbase 0.94 in central maven repository?

Not that I know of.

W.r.t. the dependency on the hbase-annotations module, maybe you can introduce a new profile that includes this dependency.

Cheers
RE: Themis : implements cross-row/cross-table transaction on HBase.
Hi Ted:

Thanks for your advice. Yes, it is more reasonable to add common properties in the parent pom :). I updated Themis to set the HBase version and Hadoop version in the parent pom, and set the HBase version to 0.98.5-hadoop2 for the for_hbase_0.98 branch. The unit tests pass with these modifications.

When using 0.98.7-hadoop2 in the for_hbase_0.98 branch, there are some compile errors (missing MediumTests class). We can add the following dependency to the pom of themis-client to include the MediumTests class:

  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-annotations</artifactId>
    <version>${hbase.version}</version>
    <classifier>tests</classifier>
    <scope>test</scope>
  </dependency>

There are some common dependencies among the sub-projects; I will polish the poms to move common dependencies into the parent pom.

BTW, do we have a hadoop2 version of HBase 0.94 in the central Maven repository?

Best
cuijianwei
RE: Themis : implements cross-row/corss-table transaction on HBase.
Hi Ted: thanks for your concern. Good catch on ThemisScan, I have updated this comment. It is important to support 0.96+, we are planning to do the work. It is reasonable to support append / increment as part of the transaction. As generic client transaction API of HBase has being proposed in HBASE-11447, we will refer this jira and make themis support more APIs. Best cuijianwei From: Ted Yu [yuzhih...@gmail.com] Sent: Wednesday, July 09, 2014 12:10 AM To: d...@hbase.apache.org Cc: user@hbase.apache.org Subject: Re: Themis : implements cross-row/corss-table transaction on HBase. Jianwei: You may want to update the comment for ThemisScan : //a wrapper class of Put in HBase which not expose timestamp to user public class ThemisScan extends ThemisRead { Is there plan to support append / increment as part of the transaction ? Currently Themis depends on 0.94.11 Is there plan to support 0.96+ releases ? Thanks On Tue, Jul 8, 2014 at 12:34 AM, 崔建伟 cuijian...@xiaomi.com wrote: Hi everyone, I want to introduce our open-source project Themis which implements cross-row/corss-table transaction on HBase. Themis follows google's percolator algorithm( http://research.google.com/pubs/pub36726.html), which provides ACID-compliant transaction and snapshot isolation. The cross-row transaction is based on HBase's single-row atomic semantics and doesn't use a central transaction server, so that supports linear-scalability. Themis depends on a timestamp server to provides global strictly incremental timestamp to define the order of transactions, which will be used to resolve the write-write and read-write conflicts. The timestamp server is lightweight and could achieve hight throughput(500, 000 + qps), and Themis will batch timestamp requests across transactions in one Rpc, so that it won't become the bottleneck of the system even when processing billions of transactions every day. 
Although Themis could be implemented totally in client-side, we adopt coprocessor framework of HBase to achieve higher performance. Themis includes a client-side library to provides transaction APIs, such as themisPut/themisGet/themisScan/themisDelete, and a coprocessor library loaded on regionserver. Therefore, Themis could be used without changing the code and logic of HBase. We have been validating the correctness of Themis for a few months by a AccountTransfer simulation program, which concurrently does cross-row transactions by transferring money among different accounts(each account is a row in HBase) and verifies total money of all accounts doesn't change in the simulation. We have also run Themis on our production environment. We test the performance of Themis and get comparable result as percolator. The single-column transaction represents the worst performance case for Themis compared with HBase, the result is: 1) For read, the performance of percolator is 90% of HBase; 2) For write, the performance of percolator is 23% of HBase. The write performance drops a lot because Themis uses two-phase commit protocol to achieve ACID of transaction. For multi-row write, we improve the performance by paralleling all writes of pre-write phase. For single-row write, we are optimizing two-phase commit protocol to achieve better performance and will update the result when it is ready. The details of performance result could be found in github. The repository and introduction of Themis include: 1. Themis github: https://github.com/XiaoMi/themis/. The source code, performance test result and user guide could be found here. 2. Themis jira : https://issues.apache.org/jira/browse/HBASE-10999 3. Chronos github: https://github.com/XiaoMi/chronos. Chronos is our open-source high-availability, high-performance timestamp server to provide global strictly incremental timestamp for Themis. If you find Themis interesting, please leave us comment in the mail, jira or github. 
Best cuijianwei
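The AccountTransfer check mentioned in the announcement rests on a simple invariant: concurrent transfers may move money around, but the sum over all accounts must stay constant. A minimal in-memory sketch of that idea follows. It does not use HBase or Themis; `AccountTransferSketch` and `runSimulation` are hypothetical names, and a single Java lock stands in for the transactional guarantee that the real program exercises.

```java
import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the AccountTransfer simulation: each array slot plays the role of a row.
class AccountTransferSketch {
    private final long[] balances;

    AccountTransferSketch(int accounts, long initialBalance) {
        balances = new long[accounts];
        Arrays.fill(balances, initialBalance);
    }

    // One "transaction": debit one account and credit another atomically.
    // The synchronized lock replaces the cross-row transaction under test.
    synchronized boolean transfer(int from, int to, long amount) {
        if (from == to || balances[from] < amount) {
            return false; // rejected transfers leave every balance untouched
        }
        balances[from] -= amount;
        balances[to] += amount;
        return true;
    }

    synchronized long total() {
        long sum = 0;
        for (long balance : balances) {
            sum += balance;
        }
        return sum;
    }

    // Runs random transfers from several threads and returns the final total.
    static long runSimulation(int accounts, long initialBalance, int threads, int transfersPerThread) {
        AccountTransferSketch bank = new AccountTransferSketch(accounts, initialBalance);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                ThreadLocalRandom rnd = ThreadLocalRandom.current();
                for (int i = 0; i < transfersPerThread; i++) {
                    bank.transfer(rnd.nextInt(accounts), rnd.nextInt(accounts), rnd.nextLong(1, 50));
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("simulation timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("simulation interrupted", e);
        }
        return bank.total();
    }
}
```

A run passes when the returned total equals `accounts * initialBalance`; any other value would mean a transfer was applied only partially.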
Re: Themis : implements cross-row/corss-table transaction on HBase.
Excellent. Nice work lads!
St.Ack
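The announcement attributes the write slowdown to two-phase commit. As a rough illustration of why a write costs more than a plain put, here is a heavily simplified, single-threaded sketch of Percolator-style prewrite and commit over an in-memory map. It is not Themis code: `PercolatorSketch` and its methods are hypothetical, there is no primary lock or crash recovery, and the prewrite checks all cells before locking any, which is only safe because nothing runs concurrently here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-threaded model of Percolator-style two-phase commit.
class PercolatorSketch {

    static final class Cell {
        final TreeMap<Long, String> data = new TreeMap<>(); // startTs -> uncommitted or committed value
        final TreeMap<Long, Long> write = new TreeMap<>();  // commitTs -> startTs of the committed value
        Long lockStartTs;                                   // null when the cell is unlocked
    }

    private final Map<String, Cell> cells = new HashMap<>();
    private long clock = 0; // stand-in for the timestamp server

    long nextTimestamp() {
        return ++clock;
    }

    private Cell cell(String key) {
        return cells.computeIfAbsent(key, k -> new Cell());
    }

    // Phase 1: fail on a write-write conflict, otherwise store the data and take the locks.
    boolean prewrite(Map<String, String> writes, long startTs) {
        for (String key : writes.keySet()) {
            Cell c = cell(key);
            boolean committedAtOrAfterStart = c.write.ceilingKey(startTs) != null;
            if (c.lockStartTs != null || committedAtOrAfterStart) {
                return false;
            }
        }
        for (Map.Entry<String, String> e : writes.entrySet()) {
            Cell c = cell(e.getKey());
            c.data.put(startTs, e.getValue());
            c.lockStartTs = startTs;
        }
        return true;
    }

    // Phase 2: publish a write record at commitTs and release the lock on each cell.
    void commit(Iterable<String> keys, long startTs, long commitTs) {
        for (String key : keys) {
            Cell c = cell(key);
            if (c.lockStartTs == null || c.lockStartTs != startTs) {
                throw new IllegalStateException("lock lost on " + key);
            }
            c.write.put(commitTs, startTs);
            c.lockStartTs = null;
        }
    }

    // Snapshot read: returns the newest value committed at or before readTs, or null if none.
    String get(String key, long readTs) {
        Cell c = cells.get(key);
        if (c == null) {
            return null;
        }
        if (c.lockStartTs != null && c.lockStartTs <= readTs) {
            throw new IllegalStateException("pending lock on " + key);
        }
        Map.Entry<Long, Long> latest = c.write.floorEntry(readTs);
        return latest == null ? null : c.data.get(latest.getValue());
    }
}
```

Even in this toy form, every write touches each cell twice (once to lock and stage the value, once to publish it) and needs two timestamps, which gives some intuition for the gap between transactional writes and a plain put.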
Re: Themis : implements cross-row/corss-table transaction on HBase.
Jianwei:
w.r.t. client transaction API, I am looking forward to your comment on HBASE-11447. It would be nice if the proposal can fit themis.
Cheers
Re: Themis : implements cross-row/corss-table transaction on HBase.
Jianwei:
You may want to update the comment for ThemisScan :

//a wrapper class of Put in HBase which not expose timestamp to user
public class ThemisScan extends ThemisRead {

Is there plan to support append / increment as part of the transaction ?
Currently Themis depends on 0.94.11
Is there plan to support 0.96+ releases ?

Thanks
Re: Themis : implements cross-row/corss-table transaction on HBase.
Hi Jianwei,
You may also want to take a look at the generic client transaction API being proposed in HBASE-11447: https://issues.apache.org/jira/browse/HBASE-11447
I think it would be useful to have the Themis perspective there, and whether the proposed API meets your needs and requirements.