Re: Data node decommission doesn't seem to be working correctly

2010-05-18 Thread Scott White
dfsadmin -report lists that machine by its hostname and not by its IP. That
machine happens to be the master node, which is why I am trying to
decommission the data node there: I only want the data node running on the
slave nodes. For all the slave nodes, dfsadmin -report lists the IPs.

One question: I believe that the namenode was accidentally restarted during
the 12 hours or so I was waiting for the decommission to complete. Would
this put things into a bad state? I did try running dfsadmin -refreshNodes
after it was restarted.

Scott
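
One way to check the hostname/IP mismatch discussed above is to compare the
entries in the exclude file against the names that the report prints. The
following is a minimal sketch, not a definitive tool: it assumes the report
contains one "Name: host:port" line per datanode, and the function names and
sample hosts (master.example.com, 10.0.0.1, 10.0.0.2) are hypothetical. An
exclude entry that matches no reported name may be worth a closer look.

```python
import re

def datanode_names(report_text):
    """Return the host part of each 'Name: host:port' line (assumed format)."""
    names = []
    for line in report_text.splitlines():
        match = re.match(r"\s*Name:\s*(\S+)", line)
        if match:
            # Drop the trailing ':port' if one is present.
            names.append(match.group(1).rsplit(":", 1)[0])
    return names

def unmatched_excludes(report_text, exclude_entries):
    """Return exclude-file entries that match no name in the report."""
    reported = set(datanode_names(report_text))
    return [e.strip() for e in exclude_entries
            if e.strip() and e.strip() not in reported]

# Hypothetical sample: the master is reported by hostname, a slave by IP.
SAMPLE_REPORT = """\
Name: master.example.com:50010
Decommission Status : Normal

Name: 10.0.0.2:50010
Decommission Status : Normal
"""

if __name__ == "__main__":
    # The exclude file lists the master by IP, so it matches nothing.
    print(unmatched_excludes(SAMPLE_REPORT, ["10.0.0.1"]))
```

In practice the report text would come from saving the output of
hadoop dfsadmin -report to a file, and the exclude entries from reading the
exclude-hosts file line by line.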


On Tue, May 18, 2010 at 5:44 AM, Brian Bockelman wrote:

> Hey Scott,
>
> Hadoop tends to get confused by nodes with multiple hostnames or multiple
> IP addresses.  Is this your case?
>
> I can't remember precisely what our admin does, but I think he puts in the
> IP address which Hadoop listens on in the exclude-hosts file.
>
> Look in the output of
>
> hadoop dfsadmin -report
>
> to determine precisely which IP address your datanode is listening on.
>
> Brian
>
> On May 17, 2010, at 11:32 PM, Scott White wrote:
>
> > I followed the steps mentioned here:
> > http://developer.yahoo.com/hadoop/tutorial/module2.html#decommission to
> > decommission a data node. What I see from the namenode is that the hostname
> > of the machine that I decommissioned shows up in both the list of dead nodes
> > and the list of live nodes, where its admin status is marked as 'In Service'.
> > It's been twelve hours and there is no sign in the namenode logs that the node
> > has been decommissioned. Any suggestions of what might be the problem and
> > what to try to ensure that this node gets safely taken down?
> >
> > thanks in advance,
> > Scott
>
>


Data node decommission doesn't seem to be working correctly

2010-05-17 Thread Scott White
I followed the steps mentioned here:
http://developer.yahoo.com/hadoop/tutorial/module2.html#decommission to
decommission a data node. What I see from the namenode is that the hostname
of the machine that I decommissioned shows up in both the list of dead nodes
and the list of live nodes, where its admin status is marked as 'In Service'.
It's
been twelve hours and there is no sign in the namenode logs that the node
has been decommissioned. Any suggestions of what might be the problem and
what to try to ensure that this node gets safely taken down?

thanks in advance,
Scott