Re: Compaction monitoring

2017-05-06 Thread Vladimir Rodionov
The major issue with HBase compactions is not excessive CPU or IO usage but excessive creation of temporary (garbage) objects, which results in more frequent GC failures and, in some cases, RS shutdowns due to long GC pauses. That is why it is so important to keep compactions under control: disable

Re: Compaction monitoring

2017-05-05 Thread Alexander Ilyin
Kevin, thanks for your answer. We're using Ambari to manage our cluster. I see an increase in CPU usage and IO, but it's not a big one. This increase tends to occur at the beginning of the off-peak window, although it's difficult to tell for sure since our workload comes in bursts and the picture is

Re: Compaction monitoring

2017-05-05 Thread Kevin O'Dell
Alexander, that is a great series of questions. What are you using for instrumentation of your HBase cluster? Cloudera Manager, Ambari, Ganglia, Cacti, etc.? You are really asking a lot of performance-based metric questions. I don't think you will be able to answer your questions without first

Compaction monitoring

2017-05-05 Thread Alexander Ilyin
Hi, While tuning HBase performance I've found a lot of settings that affect the compaction process (off-peak hours, time between compactions, compaction ratio, region sizes, etc.). They all seem useful, and there are recommendations in the docs saying which values to set. But I found no way to assess