Re: Some problems in one accident on my production cluster

2016-02-24 Thread Heng Chen
Thanks, Stack and Ted, for your help.

After checking the code, I think the cause is that the RS sent a split request
(the parent region plus two daughter regions) to the master and then crashed.

The master put the two daughter regions into the SPLITTING_NEW state and added
them to regionsInTransition, which is held only in the master's memory.

In versions before 0.98.11, serverOffline does not handle a region that is in
the SPLITTING_NEW state, so the two regions stayed in transition until we
restarted the master.

As Ted said, HBASE-12958 fixed this.
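
To make the sequence above concrete, here is a toy model of it in Python. It is not HBase code: the class, its methods, and the `handle_splitting_new` switch are all hypothetical names invented for illustration, and the real fix in HBASE-12958 may work differently in detail. It only shows how an in-memory entry that nobody cleans up can keep the balancer from running.

```python
# Toy model only: NOT HBase source code. All names here are hypothetical.

class ToyMaster:
    def __init__(self, handle_splitting_new):
        # region name -> (state, server); lives only in this object's memory
        self.regions_in_transition = {}
        self.handle_splitting_new = handle_splitting_new

    def on_split_request(self, parent, daughters, server):
        # A region server reports a split: the daughters enter SPLITTING_NEW.
        self.regions_in_transition[parent] = ("SPLITTING", server)
        for daughter in daughters:
            self.regions_in_transition[daughter] = ("SPLITTING_NEW", server)

    def server_offline(self, dead_server):
        # Clean up transitions that belonged to the crashed server.
        for region, (state, server) in list(self.regions_in_transition.items()):
            if server != dead_server:
                continue
            if state == "SPLITTING_NEW" and not self.handle_splitting_new:
                # Modeled "old" behaviour: warn and leave the entry behind.
                print("THIS SHOULD NOT HAPPEN: unexpected", region, state)
                continue
            del self.regions_in_transition[region]

    def can_balance(self):
        # The balancer is skipped while anything is still in transition.
        return not self.regions_in_transition


old = ToyMaster(handle_splitting_new=False)
old.on_split_request("parent", ["daughter_a", "daughter_b"], "rs1")
old.server_offline("rs1")
print(old.can_balance(), sorted(old.regions_in_transition))
```

In this model, only a new `ToyMaster` (the analogue of a master restart) or `handle_splitting_new=True` clears the two leftover entries.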

As for the "set_quota" command, it is only available in 1.1 and later, so I
will upgrade my cluster.

Thanks, guys, for your help.





Re: Some problems in one accident on my production cluster

2016-02-24 Thread Stack
On Wed, Feb 24, 2016 at 3:31 PM, Heng Chen wrote:

> The story is that I ran one MR job on my production cluster (0.98.6); it
> needs to scan one table during the map phase.
>
> Because of the heavy load from the job, all my RS crashed due to OOM.
>
>
Really big rows? If so, can you narrow your scan, ask for partial rows
(IIRC, you can do this in 0.98.x), or move up to HBase 1.1+, where
scanning does 'chunking'?
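
As a rough illustration of why partial rows help with memory, here is a small Python sketch. It does not use the HBase client; `scan_in_batches` and the row layout are hypothetical. It only shows the idea of handing back a wide row in bounded pieces instead of all at once (in the Java client, `Scan.setBatch` is, as far as I know, the setting that caps the number of cells returned per result).

```python
def scan_in_batches(rows, batch):
    """Yield (row_key, cells) pieces holding at most `batch` cells each.

    `rows` is any iterable of (row_key, list_of_cells) pairs; a wide row is
    split across several pieces, so no single piece grows without bound.
    """
    for row_key, cells in rows:
        for start in range(0, len(cells), batch):
            yield row_key, cells[start:start + batch]


# One hypothetical wide row with 10 cells, read 4 cells at a time.
wide_rows = [("row1", ["cell%d" % i for i in range(10)])]
for key, piece in scan_in_batches(wide_rows, batch=4):
    print(key, len(piece))
```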


> After I restarted all RS, I found one problem.
>
> All regions were reopened on one RS,



... the others took a while to check in? That's the usual reason one RS gets a
bunch of regions.



> and the balancer could not run because
> two regions were in transition. The cluster was stuck for a long time
> until I restarted the master.
>
> 1. Why did this happen?
>
Would need logs. I see you posted some later. It would be good to go to the
server that was doing the split and look in its log around the time the split
failed.


> 2. If a cluster has lots of regions and all RS crash, how should I restart
> the cluster? If I restart the RS one by one, OOM may happen again because one
> RS has to hold all the regions, and it will take a long time.
>
>
Best to restart the whole cluster in this case (after figuring out why the
others took a while to check in... look at their logs around startup time to
see why they dally).


> 3. Is it possible to give each table a request quota, so that
> when one table is requested heavily it has no impact on the other tables in
> the cluster?
>
>
Not sure what the state of this is in 0.98. Maybe someone closer to 0.98
knows.

St.Ack



>
> Thanks
>


Re: Some problems in one accident on my production cluster

2016-02-24 Thread Ted Yu
bq. RegionStates: THIS SHOULD NOT HAPPEN: unexpected {
ad283942aff2bba6c0b94ff98a904d1a state=SPLITTING_NEW

Looks like the above wouldn't have happened if you were using 0.98.11+.

See HBASE-12958


Re: Some problems in one accident on my production cluster

2016-02-24 Thread Heng Chen
Thanks @ted, your suggestions about 2 and 3 are what I need!

Re: Some problems in one accident on my production cluster

2016-02-24 Thread Heng Chen
I picked out some entries from master.log about one region,
"ad283942aff2bba6c0b94ff98a904d1a":


2016-02-24 16:24:35,610 INFO  [AM.ZK.Worker-pool2-t3491]
master.RegionStates: Transition null to {ad283942aff2bba6c0b94ff98a904d1a
state=SPLITTING_NEW, ts=1456302275610,
server=dx-common-regionserver1-online,60020,1456302268068}
2016-02-24 16:25:40,472 WARN
 [MASTER_SERVER_OPERATIONS-dx-common-hmaster1-online:6-0]
master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected
{ad283942aff2bba6c0b94ff98a904d1a state=SPLITTING_NEW, ts=1456302275610,
server=dx-common-regionserver1-online,60020,1456302268068}
2016-02-24 16:34:24,769 DEBUG
[dx-common-hmaster1-online,6,1433937470611-BalancerChore]
master.HMaster: Not running balancer because 2 region(s) in transition:
{ad283942aff2bba6c0b94ff98a904d1a={ad283942aff2bba6c0b94ff98a904d1a
state=SPLITTING_NEW, ts=1456302275610,
server=dx-common-regionserver1-online,60020,1456302268068},
ab07d6fbcef39be032ba11ca6ba252ef={ab07d6fbcef39be032ba11ca6ba252ef
state=SPLITTING_NEW...
2016-02-24 16:39:24,768 DEBUG
[dx-common-hmaster1-online,6,1433937470611-BalancerChore]
master.HMaster: Not running balancer because 2 region(s) in transition:
{ad283942aff2bba6c0b94ff98a904d1a={ad283942aff2bba6c0b94ff98a904d1a
state=SPLITTING_NEW, ts=1456302275610,
server=dx-common-regionserver1-online,60020,1456302268068},
ab07d6fbcef39be032ba11ca6ba252ef={ab07d6fbcef39be032ba11ca6ba252ef
state=SPLITTING_NEW...
2016-02-24 16:44:24,768 DEBUG
[dx-common-hmaster1-online,6,1433937470611-BalancerChore]
master.HMaster: Not running balancer because 2 region(s) in transition:
{ad283942aff2bba6c0b94ff98a904d1a={ad283942aff2bba6c0b94ff98a904d1a
state=SPLITTING_NEW, ts=1456302275610,
server=dx-common-regionserver1-online,60020,1456302268068},
ab07d6fbcef39be032ba11ca6ba252ef={ab07d6fbcef39be032ba11ca6ba252ef
state=SPLITTING_NEW...
2016-02-24 16:45:37,749 DEBUG [FifoRpcScheduler.handler1-thread-10]
master.HMaster: Not running balancer because 2 region(s) in transition:
{ad283942aff2bba6c0b94ff98a904d1a={ad283942aff2bba6c0b94ff98a904d1a
state=SPLITTING_NEW, ts=1456302275610,
server=dx-common-regionserver1-online,60020,1456302268068},
ab07d6fbcef39be032ba11ca6ba252ef={ab07d6fbcef39be032ba11ca6ba252ef
state=SPLITTING_NEW...
2016-02-24 16:49:24,769 DEBUG
[dx-common-hmaster1-online,6,1433937470611-BalancerChore]
master.HMaster: Not running balancer because 2 region(s) in transition:
{ad283942aff2bba6c0b94ff98a904d1a={ad283942aff2bba6c0b94ff98a904d1a
state=SPLITTING_NEW, ts=1456302275610,
server=dx-common-regionserver1-online,60020,1456302268068},
ab07d6fbcef39be032ba11ca6ba252ef={ab07d6fbcef39be032ba11ca6ba252ef
state=SPLITTING_NEW...
2016-02-24 16:54:24,768 DEBUG
[dx-common-hmaster1-online,6,1433937470611-BalancerChore]
master.HMaster: Not running balancer because 2 region(s) in transition:
{ad283942aff2bba6c0b94ff98a904d1a={ad283942aff2bba6c0b94ff98a904d1a
state=SPLITTING_NEW, ts=1456302275610,
server=dx-common-regionserver1-online,60020,1456302268068},
ab07d6fbcef39be032ba11ca6ba252ef={ab07d6fbcef39be032ba11ca6ba252ef
state=SPLITTING_NEW...
2016-02-24 16:59:24,768 DEBUG
[dx-common-hmaster1-online,6,1433937470611-BalancerChore]
master.HMaster: Not running balancer because 2 region(s) in transition:
{ad283942aff2bba6c0b94ff98a904d1a={ad283942aff2bba6c0b94ff98a904d1a
state=SPLITTING_NEW, ts=1456302275610,
server=dx-common-regionserver1-online,60020,1456302268068},
ab07d6fbcef39be032ba11ca6ba252ef={ab07d6fbcef39be032ba11ca6ba252ef
state=SPLITTING_NEW...
2016-02-24 17:04:24,769 DEBUG
[dx-common-hmaster1-online,6,1433937470611-BalancerChore]
master.HMaster: Not running balancer because 2 region(s) in transition:
{ad283942aff2bba6c0b94ff98a904d1a={ad283942aff2bba6c0b94ff98a904d1a
state=SPLITTING_NEW, ts=1456302275610,
server=dx-common-regionserver1-online,60020,1456302268068},
ab07d6fbcef39be032ba11ca6ba252ef={ab07d6fbcef39be032ba11ca6ba252ef
state=SPLITTING_NEW...
2016-02-24 17:09:24,768 DEBUG
[dx-common-hmaster1-online,6,1433937470611-BalancerChore]
master.HMaster: Not running balancer because 2 region(s) in transition:
{ad283942aff2bba6c0b94ff98a904d1a={ad283942aff2bba6c0b94ff98a904d1a
state=SPLITTING_NEW, ts=1456302275610,
server=dx-common-regionserver1-online,60020,1456302268068},
ab07d6fbcef39be032ba11ca6ba252ef={ab07d6fbcef39be032ba11ca6ba252ef
state=SPLITTING_NEW...
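
For anyone digging through a similar master.log, a small Python sketch like the following can pull out which regions are reported as in transition and since when. The regular expression is an assumption based only on the entries pasted above (a 32-character encoded region name followed by `state=` and, when not truncated, `ts=`); other HBase versions may format these lines differently. The sample text is one of the entries above, with its mail line-wrapping left in.

```python
import re
from datetime import datetime, timezone

# Assumed format, taken from the pasted entries: "<32 hex chars> state=X, ts=N"
REGION_STATE = re.compile(r"([0-9a-f]{32})\s+state=([A-Z_]+)(?:,\s+ts=(\d+))?")


def regions_in_transition(log_text):
    """Map encoded region name -> (state, ts in ms, or None if truncated)."""
    found = {}
    for name, state, ts in REGION_STATE.findall(log_text):
        known = found.get(name)
        # Keep the first sighting, but upgrade it if a later one has a ts.
        if known is None or (known[1] is None and ts):
            found[name] = (state, int(ts) if ts else None)
    return found


sample = """2016-02-24 17:09:24,768 DEBUG
[dx-common-hmaster1-online,6,1433937470611-BalancerChore]
master.HMaster: Not running balancer because 2 region(s) in transition:
{ad283942aff2bba6c0b94ff98a904d1a={ad283942aff2bba6c0b94ff98a904d1a
state=SPLITTING_NEW, ts=1456302275610,
server=dx-common-regionserver1-online,60020,1456302268068},
ab07d6fbcef39be032ba11ca6ba252ef={ab07d6fbcef39be032ba11ca6ba252ef
state=SPLITTING_NEW...
"""

for name, (state, ts) in regions_in_transition(sample).items():
    since = datetime.fromtimestamp(ts / 1000, tz=timezone.utc) if ts else "unknown"
    print(name, state, since)
```

The `ts` value is in milliseconds since the epoch, so a region whose `ts` never changes across balancer runs has been stuck in the same state the whole time.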






Re: Some problems in one accident on my production cluster

2016-02-24 Thread Ted Yu
bq. two regions were in transition

Can you pastebin the related server logs for these two regions so that we
have more clues?

For #2, please see http://hbase.apache.org/book.html#big.cluster.config

For #3, please see
http://hbase.apache.org/book.html#_running_multiple_workloads_on_a_single_cluster

On Wed, Feb 24, 2016 at 3:31 PM, Heng Chen wrote:

> The story is that I ran one MR job on my production cluster (0.98.6); it
> needs to scan one table during the map phase.
>
> Because of the heavy load from the job, all my RS crashed due to OOM.
>
> After I restarted all RS, I found one problem.
>
> All regions were reopened on one RS, and the balancer could not run because
> two regions were in transition. The cluster was stuck for a long time
> until I restarted the master.
>
> 1. Why did this happen?
>
> 2. If a cluster has lots of regions and all RS crash, how should I restart
> the cluster? If I restart the RS one by one, OOM may happen again because one
> RS has to hold all the regions, and it will take a long time.
>
> 3. Is it possible to give each table a request quota, so that
> when one table is requested heavily it has no impact on the other tables in
> the cluster?
>
>
> Thanks
>