Re: HBase 2 slower than HBase 1?

2020-06-25 Thread Andrew Purtell
I repeated this test with pe --filterAll and the results were revealing, at least for this case. I also patched in thread local hash map for atomic counters that I could update from code paths in SQM, StoreScanner, HFileReader*, and HFileBlock. Because a RPC is processed by a single handler thread

Re: HBase 2 slower than HBase 1?

2020-06-11 Thread Andrew Purtell
I used PE to generate 10M row tables with one family with either 1, 10, 20, 50, or 100 values per row (unique column-qualifiers). An increase in wall clock time was noticeable, for example: 1.6.0 time ./bin/hbase pe --rows=500 --table=TestTable_f1_c20 --columns=20 --nomapred scan 2 real 1m20.

Re: HBase 2 slower than HBase 1?

2020-06-11 Thread Jan Van Besien
This is promising, thanks a lot. Testing with hbase 2.2.5 shows an improvement, but we're not there yet. As reported earlier, hbase 2.1.0 was about 60% slower than hbase 1.2.0 in a test that simply scans all the regions in parallel without any filter. A test with hbase 2.2.5 shows it to be about 4

Re: HBase 2 slower than HBase 1?

2020-06-11 Thread Anoop John
In another mail thread Zheng Hu brought up an important Jira fix https://issues.apache.org/jira/browse/HBASE-21657 Can you please check with this once? Anoop On Tue, Jun 9, 2020 at 8:08 PM Jan Van Besien wrote: > On Sun, Jun 7, 2020 at 7:49 AM Anoop John wrote: > > As per the above configs, it look

Re: HBase 2 slower than HBase 1?

2020-06-09 Thread Jan Van Besien
On Sun, Jun 7, 2020 at 7:49 AM Anoop John wrote: > As per the above configs, it looks like Bucket Cache is not being used. > Only on heap LRU cache in use. True (but it is large enough to hold everything, so I don't think it matters). > @Jan - Is it possible for you to test with off heap Bucket

Re: HBase 2 slower than HBase 1?

2020-06-06 Thread Anoop John
As per the above configs, it looks like Bucket Cache is not being used. Only on heap LRU cache in use. @Jan - Is it possible for you to test with off heap Bucket Cache? Config bucket cache off heap mode with size ~7.5 GB Do you have any DataBlockEncoding enabled on the CF? Anoop On Fri, Jun 5,

Re: HBase 2 slower than HBase 1?

2020-06-05 Thread Duo Zhang
IIRC on branch-2.x, there are some changes on the bucket cache implementation? Off-heap by default? Not sure if this is the problem. Jan Van Besien wrote on Fri, Jun 5, 2020 at 9:17 PM: > On Fri, Jun 5, 2020 at 2:54 PM 张铎(Duo Zhang) > wrote: > > So the result is for all data in block cache? What is the block

Re: HBase 2 slower than HBase 1?

2020-06-05 Thread Jan Van Besien
On Fri, Jun 5, 2020 at 2:54 PM 张铎(Duo Zhang) wrote: > So the result is for all data in block cache? What is the block cache > settings for these two clusters? yes. Configuration: - hfile.block.cache.size = 0.4 - heap size 18.63 GiB (don't ask me why it was not 20 ;-)) some relevant statistics f
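
As a rough check on the configuration quoted above: hfile.block.cache.size is a fraction of the region server heap, so the on-heap block cache size follows from the two numbers given. The sketch below is only illustrative arithmetic; the function name block_cache_gib is hypothetical and not part of HBase.

```python
def block_cache_gib(heap_gib: float, cache_fraction: float) -> float:
    """Approximate on-heap block cache size as a fraction of the heap.

    Hypothetical helper for illustration; HBase does this sizing internally.
    """
    return heap_gib * cache_fraction


# Values quoted in the message: heap 18.63 GiB, hfile.block.cache.size = 0.4
heap_gib = 18.63
cache_fraction = 0.4

print(f"{block_cache_gib(heap_gib, cache_fraction):.2f} GiB")  # prints "7.45 GiB"
```

This is in line with the "~7.5 GB" off-heap bucket cache size suggested elsewhere in the thread.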

Re: HBase 2 slower than HBase 1?

2020-06-05 Thread Duo Zhang
So the result is for all data in block cache? What is the block cache settings for these two clusters? Jan Van Besien wrote on Fri, Jun 5, 2020 at 20:46: > Bruno and I continued to do some testing (we work on the same > project), this time also on actual (3 node) clusters, using hbase > 1.2.0 and hbase 2.1.0.

Re: HBase 2 slower than HBase 1?

2020-06-05 Thread Jan Van Besien
Bruno and I continued to do some testing (we work on the same project), this time also on actual (3 node) clusters, using hbase 1.2.0 and hbase 2.1.0. We can certainly confirm that 2.1.0 is often faster than 1.2.0, but not always, and tried to isolate scenarios in which 2.1.0 is (much) slower than

Re: HBase 2 slower than HBase 1?

2020-06-04 Thread Andrew Purtell
The same settings, same instances, same jvm for 1.6.0 and 2.2.4. On Thu, Jun 4, 2020 at 3:35 PM Tak-Lon Wu wrote: > hey guys, I got a question on the performance test between 1.6.0 and 2.2.4 > . > > To Andrew, did you turn on the performance tuning on 1.6.0 as well ? or > did you run it without

Re: HBase 2 slower than HBase 1?

2020-06-04 Thread Stephen
Hey guys, I have a question on the performance test between 1.6.0 and 2.2.4. To Andrew, did you turn on the performance tuning on 1.6.0 as well? Or did you run it without any configuration on 1.6.0? GC: -XX:+UseShenandoahGC -Xms31g -Xmx31g -XX:+AlwaysPreTouch -XX:+UseNUMA -XX:-UseBiasedLocki

Re: HBase 2 slower than HBase 1?

2020-05-25 Thread Bruno Dumon
Thanks a lot for doing this test. Its results are encouraging. My non-cluster testing was more focussed on full table scans, which YCSB does not do. The full table scans are only done by batch jobs, so if they are a bit slower it is not much of a problem, but in our case they seemed a lot slower.

Re: HBase 2 slower than HBase 1?

2020-05-22 Thread Andrew Purtell
Thank you Lars. I suppose it is not possible to characterize the problem with anonymous detail enough to provide some clues for follow up, or you would have done it. > On May 22, 2020, at 6:01 AM, Lars Francke wrote: > > I've refrained from commenting here so far because I cannot share much/

Re: HBase 2 slower than HBase 1?

2020-05-22 Thread Lars Francke
I've refrained from commenting here so far because I cannot share much/any data but I can also report that we've seen worse performance with HBase 2 (similar/same settings and same workload, same hardware). This is on a 40+ node cluster. Unfortunately, I wasn't tasked with debugging. The customer d

Re: HBase 2 slower than HBase 1?

2020-05-21 Thread Andrew Purtell
It depends what you are measuring and how. I test every so often with YCSB, which admittedly is not representative of real world workloads but is widely used for apples to apples testing among datastores, and we can apply the same test tool and test methodology to different versions to get meaningf

Re: HBase 2 slower than HBase 1?

2020-05-20 Thread Bruno Dumon
Hi, I think that (idle) background threads would not make much of a difference to the raw speed of iterating over cells of a single region served from the block cache. I started testing this way after noticing slow down on a real installation. I can imagine that there have been various improvement

Re: HBase 2 slower than HBase 1?

2020-05-20 Thread Duo Zhang
Just saw that your tests were on local mode... Local mode is not for production so I do not see any related issues for improving the performance for hbase in local mode. Maybe we just have more threads in HBase 2 by default which makes it slow on a single machine, not sure... Could you please tes

Re: HBase 2 slower than HBase 1?

2020-05-20 Thread Bruno Dumon
For the scan test, there is only minimal rpc involved, I verified through ScanMetrics that there are only 2 rpc calls for the scan. It is essentially testing how fast the region server is able to iterate over the cells. There are no delete cells, and the table is fully compacted (1 storage file), a

Re: HBase 2 slower than HBase 1?

2020-05-20 Thread Debraj Manna
I cross-posted this in the Slack channel as I was also observing something quite similar. This is the suggestion I received. Reposting here for completeness. zhangduo 12:15 PM Does get also have the same performance drop, or only scan? zhangduo 12:18 PM For the rpc layer, hbase2 defaults to netty

HBase 2 slower than HBase 1?

2020-05-18 Thread Bruno Dumon
Hi, We are looking into migrating from HBase 1.2.x to HBase 2.1.x (on Cloudera CDH). It seems like HBase 2 is slower than HBase 1 for both reading and writing. I did a simple test, using HBase 1.6.0 and HBase 2.2.4 (the standard OSS versions), running in local mode (no HDFS) on my computer: *