Alyssa, I am not trying to revive the dead node. I want to permanently remove a node from the cluster. But after decommissioning it, it shows up as a dead node until I restart the cluster. I am looking for a way to get rid of it from the dfshealth.jsp page without having to restart the cluster.

Bill

On Thu, Jan 29, 2009 at 5:45 PM, Hargraves, Alyssa <aly...@wpi.edu> wrote:

> Bill-
>
> I believe once the node is decommissioned you'll also have to run
> bin/hadoop-daemon.sh start datanode and bin/hadoop-daemon.sh start
> tasktracker (both run on the slave node, not master) to revive the dead
> node. Just removing it from exclude and refreshing doesn't work for me
> either, but with those two additional commands it does.
>
> - Alyssa
> ________________________________________
> From: Bill Au [bill.w...@gmail.com]
> Sent: Thursday, January 29, 2009 5:40 PM
> To: core-user@hadoop.apache.org
> Subject: Re: decommissioned node showing up ad dead node in web based
> interface to namenode (dfshealth.jsp)
>
> Not sure why but this does not work for me. I am running 0.18.2. I ran
> hadoop dfsadmin -refreshNodes after removing the decommissioned node from
> the exclude file. It still shows up as a dead node. I also removed it from
> the slaves file and ran the refresh nodes command again. It still shows up
> as a dead node after that.
>
> I am going to upgrade to 0.19.0 to see if it makes any difference.
>
> Bill
>
> On Tue, Jan 27, 2009 at 7:01 PM, paul <paulg...@gmail.com> wrote:
>
> > Once the nodes are listed as dead, if you still have the host names in
> > your conf/exclude file, remove the entries and then run hadoop dfsadmin
> > -refreshNodes.
> >
> > This works for us on our cluster.
> >
> > -paul
> >
> > On Tue, Jan 27, 2009 at 5:08 PM, Bill Au <bill.w...@gmail.com> wrote:
> >
> > > I was able to decommission a datanode successfully without having to
> > > stop my cluster. But I noticed that after a node has been
> > > decommissioned, it shows up as a dead node in the web-based interface
> > > to the namenode (i.e. dfshealth.jsp). My cluster is relatively small
> > > and losing a datanode will have a performance impact. So I need to
> > > monitor the health of my cluster and take steps to revive any dead
> > > datanode in a timely fashion. So is there any way to altogether "get
> > > rid of" any decommissioned datanode from the web interface of the
> > > namenode? Or is there a better way to monitor the health of the
> > > cluster?
> > >
> > > Bill
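
For reference, the sequence paul describes above (remove the host from the exclude file, then run hadoop dfsadmin -refreshNodes) could be sketched roughly as follows. This is only a sketch under assumptions: the function names remove_from_exclude and refresh_nodes are hypothetical helpers made up for illustration, and the exclude file path is whatever your configuration points to. Note that in this thread the approach is reported to work on paul's cluster but did not clear the dead-node entry for Bill on 0.18.2.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): drop one hostname from an
# exclude file. The path passed in is an assumption; use the file your
# cluster configuration actually references (e.g. conf/exclude).
remove_from_exclude() {
    host="$1"
    exclude_file="$2"
    [ -f "$exclude_file" ] || return 1
    tmp="${exclude_file}.tmp.$$"
    # -F fixed string, -x whole-line match, -v keep the non-matching lines.
    # grep exits 1 when nothing is left to print, hence the "|| true".
    grep -vxF -- "$host" "$exclude_file" > "$tmp" || true
    mv "$tmp" "$exclude_file"
}

# Hypothetical wrapper: ask the namenode to re-read its host lists,
# using the command quoted in the thread. Skips quietly when the
# hadoop launcher is not on PATH.
refresh_nodes() {
    if command -v hadoop >/dev/null 2>&1; then
        hadoop dfsadmin -refreshNodes
    else
        echo "hadoop not found on PATH; skipping refresh" >&2
    fi
}
```

A typical call would be remove_from_exclude followed by refresh_nodes, run on the master with the decommissioned node's hostname and the exclude file path as arguments. Matching whole lines rather than substrings keeps a host such as node1 from also removing node10.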