[ https://issues.apache.org/jira/browse/HBASE-25594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Stack resolved HBASE-25594.
-----------------------------------
    Resolution: Fixed

I pushed the below to 2.4+

{code}
I pushed this to 2.3+

commit 728d4f5ab12fd2631b1ef0a7c61203e9acfb05f0 (HEAD -> 2.3, origin/branch-2.3)
Author: Javier Akira Luca de Tena <akiral...@gmail.com>
Date:   Fri Mar 19 04:04:54 2021 +0900

    HBOPS-25594 Make easier to use graceful_stop on localhost mode (#3054)

    Co-authored-by: Javier <javier.lucadet...@linecorp.com>

diff --git a/bin/graceful_stop.sh b/bin/graceful_stop.sh
index 89e3dd939c..e565929606 100755
--- a/bin/graceful_stop.sh
+++ b/bin/graceful_stop.sh
@@ -32,7 +32,7 @@ moving regions"
   echo " maxthreads xx Limit the number of threads used by the region mover. Default value is 1."
   echo " movetimeout xx Timeout for moving regions. If regions are not moved by the timeout value,\
 exit with error. Default value is INT_MAX."
-  echo " hostname Hostname of server we are to stop"
+  echo " hostname Hostname to stop; match what HBase uses; pass 'localhost' if local to avoid ssh"
   echo " e|failfast Set -e so exit immediately if any command exits with non-zero status"
   echo " nob| nobalancer Do not manage balancer states. This is only used as optimization in \
 rolling_restart.sh to avoid multiple calls to hbase shell"
@@ -100,6 +100,10 @@ localhostname=`/bin/hostname`
 if [ "$localhostname" == "$hostname" ]; then
   local=true
 fi
+if [ "$localhostname" == "$hostname" ] || [ "$hostname" == "localhost" ]; then
+  local=true
+  hostname=$localhostname
+fi
 if [ "$nob" == "true" ]; then
   log "[ $0 ] skipping disabling balancer -nob argument is used"
{code}

> graceful_stop.sh fails to unload regions when ran at localhost
> --------------------------------------------------------------
>
>                 Key: HBASE-25594
>                 URL: https://issues.apache.org/jira/browse/HBASE-25594
>             Project: HBase
>          Issue Type: Bug
>   Affects Versions: 3.0.0-alpha-1, 1.4.13
>            Reporter: Javier Akira Luca de Tena
>            Assignee: Javier Akira Luca de Tena
>            Priority: Minor
>             Fix For: 3.0.0-alpha-1, 2.5.0, 2.4.3
>
>
> We usually use graceful_stop.sh from the Master to restart RegionServers.
> However, in some scenarios we may not have privileges to restart remote
> RegionServers (it uses ssh).
> But we can still use graceful_stop.sh on the same host we want to restart.
> In order to detect the execution at localhost, graceful_stop.sh uses
> /bin/hostname.
>
> [https://github.com/apache/hbase/blob/cfbae4d3a37e7ac4d795461c3e19406a2786838d/bin/graceful_stop.sh#L106-L110]
> When RegionMover strips the host to not include it in the list of target
> hosts, we filter it out by checking all RegionServer hosts in the cluster:
>
> [https://github.com/apache/hbase/blob/branch-2/hbase-server/src/main/java/org/apache/hadoop/hbase/util/RegionMover.java#L382-L384]
>
> [https://github.com/apache/hbase/blob/cfbae4d3a37e7ac4d795461c3e19406a2786838d/hbase-server/src/main/java/org/apache/hadoop/hbase/util/RegionMover.java#L692]
> But the RegionServer hosts returned by Admin#getRegionServers are FQDNs,
> while the hostname provided from graceful_stop.sh is not an FQDN, making
> the comparison fail.
> Same happens for branch-1 region_mover.rb, which is the place I reproduced in
> my environment:
> [https://github.com/apache/hbase/blob/f9a91488b2c39320bed502619bf7adb765c79de6/bin/region_mover.rb#L305]
> [https://github.com/apache/hbase/blob/f9a91488b2c39320bed502619bf7adb765c79de6/bin/region_mover.rb#L175]
>
> [https://github.com/apache/hbase/blob/f9a91488b2c39320bed502619bf7adb765c79de6/bin/region_mover.rb#L186-L192]
>
> This can be fixed just by using "/bin/hostname -f" in the graceful_stop.sh
> script.
> Will provide patch soon.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
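
For readers following the issue, here is a minimal sketch of the two comparisons involved. It is not code from graceful_stop.sh or RegionMover: the function names resolve_local_target and same_host, and the rs1.example.com hostnames, are hypothetical and used only for illustration. The first function mirrors the condition added by the patch (treat the target as local when it equals the local hostname or is the literal "localhost", and act on the local hostname in that case). The second is a plain string comparison standing in for the host filtering described in the issue, to show why a short hostname does not match an FQDN.

```shell
#!/bin/sh
# Hypothetical helper (not part of graceful_stop.sh).
# $1: hostname reported by the local machine, e.g. the output of /bin/hostname
# $2: hostname argument passed to the script
# Prints the hostname to act on; returns 0 if the target is the local machine.
resolve_local_target() {
  if [ "$1" = "$2" ] || [ "$2" = "localhost" ]; then
    # Same condition as the patch: fall back to the local hostname.
    echo "$1"
    return 0
  fi
  echo "$2"
  return 1
}

# Hypothetical stand-in for the host filtering: an exact string comparison.
same_host() {
  [ "$1" = "$2" ]
}

# 'localhost' is mapped to the local hostname, so no ssh would be needed.
resolve_local_target "rs1.example.com" "localhost"

# A short hostname is not equal to the FQDN, so exact matching fails.
same_host "rs1" "rs1.example.com" || echo "short name does not match FQDN"
```

Under these assumptions, passing localhost works only if the local hostname is already in the form HBase uses for the server, which is consistent with the updated usage text ("match what HBase uses").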