[ https://issues.apache.org/jira/browse/HDFS-7708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15129769#comment-15129769 ]
Vinayakumar B commented on HDFS-7708:
-------------------------------------

bq. However, this approach isn't going to work here. The bash script execs java. That means that any bash commands after the exec are not relevant, since the bash process is replaced by the Java process. If you want a signal handler or an atexit handler, you need to do it from java.

IMO, there is no need to handle this in the start-{daemon}.sh script or from Java. Each daemon has its own stop script provided along with it. It is better to handle removal of a daemon's pid file there.

For example, for the balancer: whether the balancer is still running or has exited after rebalancing, leave the pid file in place. On error, the stop script can be called, which internally checks for the pid file and whether the corresponding command is running. If some other process is running with the same pid as the one in the pid file, then instead of killing that other process, the script can just remove the pid file.
Whether the process is the same as the command we are trying to stop can be determined by checking for the existence of {{-Dproc_balancer}} in its command line.

This is a general problem for all daemons; I have seen some other Jiras facing the same problem.

> Balancer should delete its pid file when it completes rebalance
> ---------------------------------------------------------------
>
>                 Key: HDFS-7708
>                 URL: https://issues.apache.org/jira/browse/HDFS-7708
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover
>    Affects Versions: 2.6.0
>            Reporter: Akira AJISAKA
>            Assignee: Rakesh R
>         Attachments: HDFS-7708-002.patch, HDFS-7708.patch
>
>
> When balancer completes rebalance and exits, it does not delete its pid file.
> Starting balancer again, then "kill -0 pid" to confirm the balancer process
> is not running.
> The problem is: If another process is running as the same pid as `cat
> pidfile`, balancer fails to start with following message:
> {code}
> balancer is running as process 3443. Stop it first.
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
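
The stop-script check proposed in the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Hadoop stop script: the function names {{pid_matches_daemon}} and {{stop_daemon}} are hypothetical, and reading {{/proc/<pid>/cmdline}} assumes a Linux-style procfs (with {{ps}} as a fallback elsewhere).

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not taken from the Hadoop scripts.

# Succeeds if process $1 is alive AND its command line contains marker $2
# (for example -Dproc_balancer).
pid_matches_daemon() {
  local pid="$1"
  local marker="$2"
  local cmdline

  # Reject empty or non-numeric pid file contents.
  case "$pid" in
    ''|*[!0-9]*) return 1 ;;
  esac

  # Is any process running with this pid?
  kill -0 "$pid" 2>/dev/null || return 1

  # Fetch the full command line (assumption: procfs if present, else ps).
  if [ -r "/proc/$pid/cmdline" ]; then
    cmdline="$(tr '\0' ' ' < "/proc/$pid/cmdline")"
  else
    cmdline="$(ps -p "$pid" -o args= 2>/dev/null)"
  fi

  # Fixed-string match; "--" keeps a marker starting with "-" from being
  # parsed as a grep option.
  printf '%s\n' "$cmdline" | grep -qF -- "$marker"
}

# Stops the daemon recorded in pid file $1 if it really is our daemon;
# otherwise treats the pid file as stale. The pid file is removed either way.
stop_daemon() {
  local pidfile="$1"
  local marker="$2"
  local pid

  if [ ! -f "$pidfile" ]; then
    echo "no pid file at $pidfile"
    return 0
  fi

  pid="$(cat "$pidfile" 2>/dev/null)"
  if pid_matches_daemon "$pid" "$marker"; then
    kill "$pid"
    echo "stopped process $pid"
  else
    # Either nothing runs under this pid, or an unrelated process reused it.
    echo "stale pid file, removing without killing anything"
  fi
  rm -f "$pidfile"
}
```

A call might look like {{stop_daemon /path/to/balancer.pid -Dproc_balancer}}, where the pid file path is a placeholder. The point of the design is that an unrelated process which happens to reuse the recorded pid is never signalled; only the stale pid file is cleaned up.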