Thanks, everyone. Each HBase machine in the cluster has 192G of RAM; around 100-120G of it is used by user processes and the rest goes to caching. My swap graphs show swapping happening at a rate of 100 million kb/second.
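For anyone who wants to cross-check a swap graph like this against the kernel's own counters, here is a minimal sketch. It assumes a Linux host where `/proc/vmstat` exposes the `pswpin` and `pswpout` page counters and `/proc/sys/vm/swappiness` holds the current swappiness value. The function names and the 5-second sampling interval are hypothetical, chosen only for illustration.

```python
import os
import time


def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_rate_kb_per_s(before, after, interval_s, page_size_bytes=4096):
    """Return (swap-in, swap-out) rates in KB/s between two snapshots.

    pswpin/pswpout count pages, so the deltas are scaled by the page size.
    """
    kb_per_page = page_size_bytes / 1024
    rate_in = (after["pswpin"] - before["pswpin"]) * kb_per_page / interval_s
    rate_out = (after["pswpout"] - before["pswpout"]) * kb_per_page / interval_s
    return rate_in, rate_out


def read_file(path):
    """Read a small text file such as a /proc entry."""
    with open(path) as f:
        return f.read()


def sample_swap_rate(interval_s=5.0):
    """Sample /proc/vmstat twice (Linux only) and print the swap rates."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    print("vm.swappiness =", read_file("/proc/sys/vm/swappiness").strip())
    before = parse_vmstat(read_file("/proc/vmstat"))
    time.sleep(interval_s)
    after = parse_vmstat(read_file("/proc/vmstat"))
    rate_in, rate_out = swap_rate_kb_per_s(before, after, interval_s, page_size)
    print(f"swap in: {rate_in:.1f} KB/s, swap out: {rate_out:.1f} KB/s")
```

Calling `sample_swap_rate()` on a region server while a compaction is running would show whether pages are really moving to and from swap at that moment, independent of how the graphing system scales its units.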
I have disabled swappiness on my machines, but I still have the same issue of latency spikes during swapping.

Thanks,
Girish.

On Mon, Nov 2, 2015 at 8:07 PM, Yu Li <car...@gmail.com> wrote:

> Second Vlad. Swapping is necessary in many situations, but as JVM does not
> behave well under swapping, HBase may run into trouble if swapped. Search
> for "swapping" in hbase book <http://hbase.apache.org/book.html> and you
> could see the same suggestion.
>
> Best Regards,
> Yu
>
> On 3 November 2015 at 11:59, Vladimir Rodionov <vladrodio...@gmail.com>
> wrote:
>
> > >> Do you have any specific suggestions to avoid swapping during hbase
> > >> compactions.
> >
> > You can google "disable swap on linux", make sure that you do
> > not overprovision your system's RAM (too many processes running and
> > consuming all physical memory), monitor swap usage with vmstat.
> >
> > -Vlad
> >
> > On Mon, Nov 2, 2015 at 4:20 PM, Girish Joshi <gjo...@groupon.com.invalid>
> > wrote:
> >
> > > Thanks. Do you have any specific suggestions to avoid swapping during
> > > hbase compactions.
> > >
> > > Thanks,
> > >
> > > Girish.
> > >
> > > On Sun, Nov 1, 2015 at 6:25 PM, Vladimir Rodionov
> > > <vladrodio...@gmail.com> wrote:
> > >
> > > > >>- There is a spike in compaction time avg time metric. At the same
> > > > >>time the swap bytes in and swap bytes out also have higher value.
> > > >
> > > > Swapping is bad. You have to avoid it.
> > > >
> > > > -Vlad
> > > >
> > > > On Sun, Nov 1, 2015 at 10:24 AM, Girish Joshi
> > > > <gjo...@groupon.com.invalid> wrote:
> > > >
> > > > > Hello
> > > > >
> > > > > In my hbase cluster, I observe the following consistently happening
> > > > > over several days:-
> > > > >
> > > > > - There is a spike in compaction time avg time metric. At the same
> > > > > time the swap bytes in and swap bytes out also have higher value.
> > > > >
> > > > > - Around the same time, I see the FS PRead and FS Read latencies and
> > > > > client latencies doing random reads increase.
> > > > >
> > > > > My hbase cluster consisting of 16 nodes and setup with a replication
> > > > > to another cluster of 16 nodes has the following workload:-
> > > > >
> > > > > - There are around 4 tables which have lot of write activity(around
> > > > > 500k per second writes on m1/m15 moving average). 2 of these tables
> > > > > have atomic counter columns keeping track of some analytics data and
> > > > > being incremented with every write.
> > > > >
> > > > > - There are 2 tables which receive bulk uploaded data
> > > > > periodically(around once a day)
> > > > >
> > > > > - We expect reads at around 100k per second mainly from tables which
> > > > > have bulk upload data and the one which has counter columns. The read
> > > > > latencies(p99) spike up to around 1000-5000 ms when the above
> > > > > compaction time avg time metric increases. In other times, they are
> > > > > below 100 ms.
> > > > >
> > > > > I have set the hbase.hregion.majorcompaction to 0 on region servers;
> > > > > I plan to set it to 0 on master nodes too so that I can take out the
> > > > > possibility of time triggered major compactions being the problem.
> > > > > But I suspect there are lot of minor compactions and those leading to
> > > > > major compactions happening at the time of spikes.
> > > > >
> > > > > *Any suggestions on how to avoid this situation of read latency
> > > > > spikes and have better read performance?*
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Girish.