Akmal,

We have been struggling with this issue for two years now without a good solution. From what I have learned, heavy online HBase puts are not really a good idea. The first thing you will run into is poor performance caused by compactions, no matter how you tune the parameters. Later on you will see job failures because of HBase operation timeouts and/or region server crashes.
Light writes with heavy reads are generally OK. For heavy puts, the best practice is to prepare the tables offline, then turn them on for reads. If heavy online puts are unavoidable, you might get the best out of it by managing compactions and splits yourself. That is, when the number of files per region reaches a certain number, stop writing, perform compactions and splits with large regions, then resume writing.

I hope it helps.

Frank Luo

From: Akmal Abbasov [mailto:akmal.abba...@icloud.com]
Sent: Tuesday, March 08, 2016 10:29 AM
To: user@hbase.apache.org
Subject: HBase poor write performance

Hi,

I'm testing HBase to choose the right hardware configuration for a write-heavy use case. I'm testing with YCSB. The cluster consists of 2 masters and 5 regionservers (4 cores, 14GB RAM, 4x512GB SSD). I've created a new table in HBase and presplit it into 50 regions. I'm running 3 clients, each with 50 threads, to insert data. I'm using the default HBase settings.

After running a few tests, I can see that the cluster is underutilized; in fact, memory usage is around 30%. The main problem I see for now is compactions: compactionQueueLength is growing very fast, and the compaction process is always running. I found that there are hbase.regionserver.thread.compaction.small and hbase.regionserver.thread.compaction.large, but couldn't find information regarding their default values. I am also planning to increase the number of regions and the memstore size to improve cluster utilization and performance.

Which other settings should be tuned to improve both utilization and performance?

Thank you.

I'm using HBase 0.98.7 and the regionserver heap size is 7GB.

Regards,
Akmal
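The stop, compact, resume cycle suggested in Frank's reply can be sketched roughly as below. This is only an illustration of the control flow, not HBase client code: `write_with_manual_compaction`, `put_batch`, `max_store_files`, `compact_and_split`, and the `file_limit` of 10 are all hypothetical names and values chosen for the example. In a real deployment the three callables would wrap HBase client and admin calls, and the compaction hook would probably also have to wait until the compaction has actually finished before returning.

```python
from typing import Callable, Iterable, List


def write_with_manual_compaction(
    batches: Iterable[List[bytes]],
    put_batch: Callable[[List[bytes]], None],
    max_store_files: Callable[[], int],
    compact_and_split: Callable[[], None],
    file_limit: int = 10,  # hypothetical threshold, tune for your cluster
) -> int:
    """Write batches, pausing to compact when store files pile up.

    All callables are placeholders supplied by the caller:
      put_batch          writes one batch of rows
      max_store_files    returns the highest store-file count of any region
      compact_and_split  blocks until compactions/splits are done

    Returns the number of times writing was paused for compaction.
    """
    pauses = 0
    for batch in batches:
        # Check before every batch: if any region has reached the limit,
        # stop writing and compact/split before continuing.
        if max_store_files() >= file_limit:
            compact_and_split()
            pauses += 1
        put_batch(batch)
    return pauses
```

The point of the design is that the writer, not the region server, decides when compaction happens, so puts never compete with a growing compaction queue.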