Hi,


By ops/sec, do you mean rows/sec or partitions/sec (in Cassandra terms)? If
partitions, how many rows are there per op or partition? What are your data
model and host spec?

Is your client remote or on the host?


---- On Wed, 16 Sep 2020 14:11:35 +0430 Sergey Semenoff 
<box4semen...@gmail.com> wrote ----


Hi *! 
 
I think everybody who works with real big data knows that performance is
very important.
 
Unfortunately, our lovely HBase is approximately 2 times slower than
Cassandra when reading huge amounts of data.
 
 
For example, here is a Cassandra performance test run from 2 hosts
(client side):
 
Host1 - Throughput(ops/sec), 231 021 
 
Host2 - Throughput(ops/sec), 224 691 
 
 
 
Total: ~450 000 ops/sec.
 
Under the same conditions, HBase shows only 210 000 ops/sec.
 
 
 
Maybe this is one of the reasons why Cassandra is more popular (see
https://db-engines.com/en/ranking/wide+column+store).
 
I’ve made an improvement that can make HBase 2-3 times faster (it depends
on many factors, and sometimes it is even faster).
 
With the improvement, HBase speeds up to 430 000 ops/sec.
 
See the picture in the attachment.
 
 
 
If you are interested in getting this improvement into a release, you can
help attract some developers' attention here -
https://issues.apache.org/jira/browse/HBASE-23887
 
Add a comment there with your opinion, and vote if you think it could be
useful for your work.
 
I believe a discussion about this approach can make HBase more useful and
popular.
 
 
 
Thanks for your attention)
 
With the best regards, 
 
Pustota
