[ 
https://issues.apache.org/jira/browse/HBASE-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13013689#comment-13013689
 ] 

stack commented on HBASE-3071:
------------------------------

Other notes:

Currently I run it like this:

for i in `cat regionserver`; do ./bin/graceful.sh "$i"; done

You should turn off the balancer before you do the above.  The region_mover.rb 
script doesn't care if new regions show up after it has started; it'll just 
start in on the new ones until it's down to zero regions (though there could be 
a race in here if the balancer is running).  The script doesn't turn the 
balancer on/off itself because it would need a trap to turn it back on again 
AND the current API for the balancer is dumb; there is no way to query current 
state... so this is a manual step for now.
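
For illustration only, here is a rough sketch of what wrapping the loop with a 
balancer toggle and a trap might look like.  The names rolling_restart, 
set_balancer, BALANCER_CMD and GRACEFUL_CMD are hypothetical, and it assumes the 
HBase shell in your version has a balance_switch command that can be fed on 
stdin.  Since the balancer state can't be queried, it also assumes the balancer 
was on beforehand and always turns it back on.

```shell
#!/usr/bin/env bash
# Sketch only. BALANCER_CMD and GRACEFUL_CMD are hypothetical hooks so the
# toggle and the per-server drain can be swapped out.
BALANCER_CMD=${BALANCER_CMD:-set_balancer}
GRACEFUL_CMD=${GRACEFUL_CMD:-./bin/graceful.sh}

# Assumption: 'balance_switch true|false' is available in the HBase shell.
set_balancer() {
  echo "balance_switch $1" | ./bin/hbase shell
}

# rolling_restart FILE: FILE lists one regionserver per line.
rolling_restart() {
  local server_file=$1
  (
    # Runs whenever this subshell exits, including after Ctrl-C, so the
    # balancer is switched back on even if the loop is interrupted.
    trap '"$BALANCER_CMD" true' EXIT
    trap 'exit 130' INT TERM
    "$BALANCER_CMD" false || exit 1
    # Read the list on fd 3 so the drain command cannot swallow it via stdin.
    while read -r server <&3; do
      [ -n "$server" ] || continue
      "$GRACEFUL_CMD" "$server" || exit 1
    done 3< "$server_file"
  )
}
```

Usage would be something like: rolling_restart regionserver.  The subshell keeps 
the trap from leaking into the calling shell.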

We can move off ~2 regions per second on an unloaded cluster.  Moving the 
regions back on takes longer for some reason -- about 1 a second.  This means a 
rolling restart could take a while on a big loaded cluster.  Could parallelize 
this script but it would need more work to make sure concurrent 
graceful_restarts all read a common set of restarting servers.
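
To put rough numbers on that, a back-of-envelope estimate using the rates above 
(~2 regions/s off, ~1 region/s back on).  The function name and the example 
cluster size are made up for illustration; the rates were seen on an unloaded 
cluster and the estimate ignores the time the regionserver itself takes to 
restart, so treat the result as a lower bound.

```shell
# estimate_restart_seconds REGIONS_PER_SERVER SERVERS
# Serial rolling restart: each server is unloaded, then reloaded, in turn.
estimate_restart_seconds() {
  local regions=$1
  local servers=$2
  local off_rate=2   # regions moved off per second (from the note above)
  local on_rate=1    # regions moved back on per second (from the note above)
  # Round the unload time up to a whole second.
  local per_server=$(( (regions + off_rate - 1) / off_rate + regions / on_rate ))
  echo $(( per_server * servers ))
}

# Hypothetical cluster: 100 regions on each of 20 servers.
estimate_restart_seconds 100 20   # prints 3000 (seconds, i.e. 50 minutes)
```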



> Graceful decommissioning of a regionserver
> ------------------------------------------
>
>                 Key: HBASE-3071
>                 URL: https://issues.apache.org/jira/browse/HBASE-3071
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>         Attachments: 3071.txt, 3701-v2.txt
>
>
> Currently if you stop a regionserver nicely, it'll put up its stopping flag 
> and then close all hosted regions.  While the stopping flag is in place all 
> region requests are rejected.  If this server was under load, closing could 
> take a while.  Only after all is closed is the master informed and it'll 
> restart assigning (in old master, master would get a report with list of all 
> regions closed, in new master the zk expiry is triggered and we'll run 
> shutdown handler).
> At least in new master, we have means of disabling balancer, and then moving 
> the regions off the server one by one via HBaseAdmin methods -- we should 
> write a script to do this at least for rolling restarts -- but we need 
> something better.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
