[ 
https://issues.apache.org/jira/browse/HDFS-7708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15134449#comment-15134449
 ] 

Allen Wittenauer commented on HDFS-7708:
----------------------------------------

The bash code in trunk allows major parts of it to be replaced by the user. 
As a result, there is no guarantee at the Java level that the pid files being 
touched here are actually the pid files the balancer wants.  There's also the 
problem that if someone launches multiple balancers against different clusters 
on the same machine with the same PID dir (easily accomplished out of the box 
by using the HADOOP_IDENT_STRING component), this will actually remove every 
one of those pid files that it has permission to remove.

So not only will this not work for various edge cases, it's actually pretty 
disastrous.
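
To make the shared PID dir hazard concrete, here is a minimal bash sketch. It 
is not code from the patch or from trunk. The helper remove_own_pidfile, the 
pid values, and the hadoop-<ident>-balancer.pid file names are all assumptions 
made for illustration. It only shows that a cleanup step which checks the 
recorded pid leaves a second balancer's pid file alone.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Hadoop: remove a pid file only if it
# records the pid we expect, so other daemons' files in a shared PID dir
# are left alone.
remove_own_pidfile() {
  local pidfile=$1
  local expected_pid=$2
  local recorded_pid

  # Nothing to do if the file is missing or unreadable.
  [[ -r "${pidfile}" ]] || return 1
  recorded_pid=$(<"${pidfile}")

  # Refuse to touch a file that records some other process.
  [[ "${recorded_pid}" == "${expected_pid}" ]] || return 1
  rm -f -- "${pidfile}"
}

# Demo: two balancers share one PID dir and differ only by ident string.
# The file names and pids below are made up for this example.
demo_dir=$(mktemp -d)
echo 1111 > "${demo_dir}/hadoop-clusterA-balancer.pid"
echo 2222 > "${demo_dir}/hadoop-clusterB-balancer.pid"

# The balancer for cluster A (pid 1111) finishes and cleans up its own file.
remove_own_pidfile "${demo_dir}/hadoop-clusterA-balancer.pid" 1111

# The same caller pointed at cluster B's file is refused: the pid differs.
remove_own_pidfile "${demo_dir}/hadoop-clusterB-balancer.pid" 1111 || true

# Only cluster B's pid file should remain.
ls "${demo_dir}"
```

A cleanup that instead removed every balancer pid file it could find in the 
directory would have deleted cluster B's file as well, which is the failure 
mode described above.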



> Balancer should delete its pid file when it completes rebalance
> ---------------------------------------------------------------
>
>                 Key: HDFS-7708
>                 URL: https://issues.apache.org/jira/browse/HDFS-7708
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover
>    Affects Versions: 2.6.0
>            Reporter: Akira AJISAKA
>            Assignee: Rakesh R
>         Attachments: HDFS-7708-002.patch, HDFS-7708.patch
>
>
> When the balancer completes a rebalance and exits, it does not delete its 
> pid file. When the balancer is started again, the script runs "kill -0 pid" 
> to confirm that the balancer process is not running.
> The problem is: if another process is running with the same pid as the one 
> recorded in the pid file (`cat pidfile`), the balancer fails to start with 
> the following message:
> {code}
>   balancer is running as process 3443. Stop it first.
> {code}
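
The false positive in the description can be reproduced with a simplified 
sketch of a pid-file check. This is an illustration only: the function 
is_running_per_pidfile is hypothetical and the real Hadoop scripts are 
structured differently. The current shell's own pid stands in for a pid that 
was recycled by an unrelated process after the balancer exited.

```shell
#!/usr/bin/env bash
# Simplified, hypothetical pid-file liveness check (not the Hadoop code).
is_running_per_pidfile() {
  local pidfile=$1
  local pid

  [[ -f "${pidfile}" ]] || return 1
  pid=$(<"${pidfile}")

  # kill -0 sends no signal. It only reports whether the pid exists and can
  # be signalled by us; it says nothing about which program owns that pid.
  kill -0 "${pid}" 2>/dev/null
}

pidfile=$(mktemp)

# Simulate a stale pid file whose pid has been reused: $$ is this shell,
# which is alive but is not a balancer.
echo $$ > "${pidfile}"

if is_running_per_pidfile "${pidfile}"; then
  echo "balancer is running as process $(<"${pidfile}"). Stop it first."
fi

rm -f -- "${pidfile}"
```

The check succeeds even though no balancer is running, because the stale pid 
file points at a live but unrelated process.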



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
