[jira] [Resolved] (HBASE-25594) graceful_stop.sh fails to unload regions when ran at localhost

2021-03-20 Thread Michael Stack (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Stack resolved HBASE-25594.
---
Resolution: Fixed

Pushed addendum on branch-2.4+

> graceful_stop.sh fails to unload regions when ran at localhost
> --
>
> Key: HBASE-25594
> URL: https://issues.apache.org/jira/browse/HBASE-25594
> Project: HBase
> Issue Type: Bug
> Affects Versions: 3.0.0-alpha-1, 1.4.13
> Reporter: Javier Akira Luca de Tena
> Assignee: Javier Akira Luca de Tena
> Priority: Minor
> Fix For: 3.0.0-alpha-1, 2.5.0, 2.4.3
>
>
> We usually run graceful_stop.sh from the Master to restart RegionServers.
> However, in some scenarios we may not have the privileges to restart remote
> RegionServers (the script uses ssh).
> We can still run graceful_stop.sh on the same host we want to restart.
> To detect that it is running on the local host, graceful_stop.sh uses
> /bin/hostname:
> [https://github.com/apache/hbase/blob/cfbae4d3a37e7ac4d795461c3e19406a2786838d/bin/graceful_stop.sh#L106-L110]
> When RegionMover strips the host so that it is not included in the list of
> target hosts, it filters the host out by checking all RegionServer hosts in
> the cluster:
> [https://github.com/apache/hbase/blob/branch-2/hbase-server/src/main/java/org/apache/hadoop/hbase/util/RegionMover.java#L382-L384]
> [https://github.com/apache/hbase/blob/cfbae4d3a37e7ac4d795461c3e19406a2786838d/hbase-server/src/main/java/org/apache/hadoop/hbase/util/RegionMover.java#L692]
> But the RegionServer hosts returned by Admin#getRegionServers are FQDNs,
> while the hostname passed in by graceful_stop.sh is not an FQDN, so the
> comparison fails.
> The same happens with region_mover.rb on branch-1, which is where I
> reproduced the problem in my environment:
> [https://github.com/apache/hbase/blob/f9a91488b2c39320bed502619bf7adb765c79de6/bin/region_mover.rb#L305]
> [https://github.com/apache/hbase/blob/f9a91488b2c39320bed502619bf7adb765c79de6/bin/region_mover.rb#L175]
> [https://github.com/apache/hbase/blob/f9a91488b2c39320bed502619bf7adb765c79de6/bin/region_mover.rb#L186-L192]
> This can be fixed simply by using "/bin/hostname -f" in the graceful_stop.sh
> script.
> I will provide a patch soon.
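
The mismatch described in the issue can be illustrated with a minimal shell sketch. It is not the HBase code: the helper name host_in_list and the example.com host names are hypothetical, and the exact string comparison is only assumed to mirror how the host is filtered out of the target list.

```shell
# Hypothetical helper: succeeds only if the candidate host ($1) appears
# verbatim among the remaining arguments (the cluster's server names).
host_in_list() {
  candidate="$1"
  shift
  for server in "$@"; do
    if [ "$server" = "$candidate" ]; then
      return 0
    fi
  done
  return 1
}

# Hypothetical cluster members, listed as fully qualified names.
# A short name matches no entry, so the host would not be excluded.
if host_in_list "rs1" rs1.example.com rs2.example.com; then
  echo "short name: matched"
else
  echo "short name: no match"
fi

# The fully qualified form of the same host does match.
if host_in_list "rs1.example.com" rs1.example.com rs2.example.com; then
  echo "FQDN: matched"
else
  echo "FQDN: no match"
fi
```

On a host whose resolver is configured with a domain, "/bin/hostname -f" is what would typically supply the fully qualified form used in the second call.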



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-25594) graceful_stop.sh fails to unload regions when ran at localhost

2021-03-19 Thread Peter Somogyi (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Somogyi resolved HBASE-25594.
---
Resolution: Fixed

Reverted the commit and reapplied with the correct commit message.



[jira] [Resolved] (HBASE-25594) graceful_stop.sh fails to unload regions when ran at localhost

2021-03-18 Thread Michael Stack (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Stack resolved HBASE-25594.
---
Resolution: Fixed

I pushed the below to 2.4+
{code}
I pushed this to 2.3+

commit 728d4f5ab12fd2631b1ef0a7c61203e9acfb05f0 (HEAD -> 2.3, origin/branch-2.3)
Author: Javier Akira Luca de Tena 
Date:   Fri Mar 19 04:04:54 2021 +0900

HBOPS-25594 Make easier to use graceful_stop on localhost mode (#3054)

Co-authored-by: Javier 

diff --git a/bin/graceful_stop.sh b/bin/graceful_stop.sh
index 89e3dd939c..e565929606 100755
--- a/bin/graceful_stop.sh
+++ b/bin/graceful_stop.sh
@@ -32,7 +32,7 @@ moving regions"
   echo " maxthreads xx  Limit the number of threads used by the region mover. Default value is 1."
   echo " movetimeout xx Timeout for moving regions. If regions are not moved by the timeout value,\
 exit with error. Default value is INT_MAX."
-  echo " hostname   Hostname of server we are to stop"
+  echo " hostname   Hostname to stop; match what HBase uses; pass 'localhost' if local to avoid ssh"
   echo " e|failfast Set -e so exit immediately if any command exits with non-zero status"
   echo " nob| nobalancer Do not manage balancer states. This is only used as optimization in \
 rolling_restart.sh to avoid multiple calls to hbase shell"
@@ -100,6 +100,10 @@ localhostname=`/bin/hostname`
 if [ "$localhostname" == "$hostname" ]; then
   local=true
 fi
+if [ "$localhostname" == "$hostname" ] || [ "$hostname" == "localhost" ]; then
+  local=true
+  hostname=$localhostname
+fi

 if [ "$nob" == "true"  ]; then
   log "[ $0 ] skipping disabling balancer -nob argument is used"
{code}
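
The check added in the second hunk can be tried in isolation with a small sketch. The function name resolve_target, the variable local_mode, and the example.com names are hypothetical; only the condition and the reassignment of hostname are taken from the diff.

```shell
# Hypothetical wrapper around the condition added by the patch.
# $1: name reported by /bin/hostname on this machine
# $2: hostname argument given by the operator
resolve_target() {
  localhostname="$1"
  hostname="$2"
  local_mode=false
  # Treat the literal "localhost" like the machine's own name, and
  # substitute the real local name so later steps see what HBase uses.
  if [ "$localhostname" = "$hostname" ] || [ "$hostname" = "localhost" ]; then
    local_mode=true
    hostname="$localhostname"
  fi
  echo "local=$local_mode hostname=$hostname"
}

resolve_target "rs1.example.com" "localhost"        # prints: local=true hostname=rs1.example.com
resolve_target "rs1.example.com" "rs2.example.com"  # prints: local=false hostname=rs2.example.com
```

Passing "localhost" therefore selects local mode without ssh, while any other name that differs from the local one is still treated as remote.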



[jira] [Resolved] (HBASE-25594) graceful_stop.sh fails to unload regions when ran at localhost

2021-03-15 Thread Michael Stack (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Stack resolved HBASE-25594.
---
Fix Version/s: 2.4.3, 2.5.0, 3.0.0-alpha-1
Hadoop Flags: Reviewed
Resolution: Fixed

I pushed your PR to 2.4+ [~akiraluca] (Added the doc changes from HBASE-25663 
here when I pushed). Thanks for the PR. Shout if you need it to go back further.



[jira] [Resolved] (HBASE-25594) graceful_stop.sh fails to unload regions when ran at localhost

2021-03-15 Thread Javier Akira Luca de Tena (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Javier Akira Luca de Tena resolved HBASE-25594.
---
Resolution: Duplicate

> graceful_stop.sh fails to unload regions when ran at localhost
> --
>
> Key: HBASE-25594
> URL: https://issues.apache.org/jira/browse/HBASE-25594
> Project: HBase
> Issue Type: Bug
> Affects Versions: 3.0.0-alpha-1, 1.4.13
> Reporter: Javier Akira Luca de Tena
> Assignee: Javier Akira Luca de Tena
> Priority: Minor
>