We recently made some changes based on what we learned at Hadoop Summit day 2, and as a result we are seeing different behavior. I would appreciate it if you could help me understand the new behavior.
We used to flush all of our tables every night. I disabled major compactions, removed the cron job that flushed the tables, and replaced it with a cron job that runs a major compaction every night. Since then we have been getting the following error in our logs:

org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=33, maxlogs=32; forcing flush of 59 regions(s)

Whenever this happens, regionserver performance degrades considerably: gets slow down a lot (rpc.metrics.get_avg_time goes up), while writes are not affected. The problem lasts for about 30 to 40 minutes, after which the cluster recovers.

Besides this, we are also seeing that minor compactions have stopped since we disabled major compactions. This wasn't happening before.

Cluster configuration: 9 nodes, 4 GB RAM per regionserver, 66 regions per node, 3 tables.

Why is this happening? Is there a way to avoid it? What can we do to make sure that it doesn't happen again? Do I need to flush before running the major compaction at night? Do I need to cron minor compactions (HBaseAdmin.compact)?

Regards,
Vaibhav Puranik
GumGum
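
P.S. For reference, the two settings involved above can be sketched as an hbase-site.xml fragment. This is only an illustration, assuming the standard property names hbase.hregion.majorcompaction and hbase.regionserver.maxlogs; the values are inferred from the description and the log line, not copied from the actual cluster configuration.

```xml
<!-- Illustrative sketch only; values are assumptions inferred from the message above. -->
<configuration>
  <!-- 0 turns off time-based (automatic) major compactions,
       leaving them to be triggered externally, e.g. from cron. -->
  <property>
    <name>hbase.hregion.majorcompaction</name>
    <value>0</value>
  </property>
  <!-- Number of write-ahead log files a regionserver keeps before it
       forces region flushes; 32 is the maxlogs value reported in the log line. -->
  <property>
    <name>hbase.regionserver.maxlogs</name>
    <value>32</value>
  </property>
</configuration>
```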
