bq. two regions were in transition

Can you pastebin the related server logs for these two regions so that we have more clues?

For #2, please see http://hbase.apache.org/book.html#big.cluster.config

For #3, please see http://hbase.apache.org/book.html#_running_multiple_workloads_on_a_single_cluster

On Wed, Feb 24, 2016 at 3:31 PM, Heng Chen <heng.chen.1...@gmail.com> wrote:

> The story is I run one MR job on my production cluster (0.98.6), it needs
> to scan one table during map procedure.
>
> Because of the heavy load from the job, all my RS crashed due to OOM.
>
> After i restart all RS, i found one problem.
>
> All regions were reopened on one RS, and balancer could not run because of
> two regions were in transition. The cluster got in stuck a long time
> until i restarted master.
>
> 1. why this happened?
>
> 2. If cluster has a lots of regions, after all RS crash, how to restart
> the cluster. If restart RS one by one, it means OOM may happen because one
> RS has to hold all regions and it will cost a long time.
>
> 3. Is it possible to make each table with some requests quotas, it means
> when one table is requested heavily, it has no impact to other tables on
> cluster.
>
>
> Thanks
>