Thanks for the reply. I am using Hadoop-0.20; we installed from Apache, not Cloudera, if that makes a difference.
Currently I really need to know how to get the data that was replicated away during decommissioning back onto my two data nodes. (I've repeated the exact sequence of commands I've been running at the bottom of this mail, in case that helps.)

On Thursday, July 24, 2014, Stanley Shi <s...@gopivotal.com> wrote:

> Which distribution are you using?
>
> Regards,
> *Stanley Shi*
>
> On Thu, Jul 24, 2014 at 4:38 AM, andrew touchet <adt...@latech.edu> wrote:
>
>> I should have added this in my first email, but I do get an error in the
>> data node's log file:
>>
>> '2014-07-12 19:39:58,027 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 0 blocks
>> got processed in 1 msecs'
>>
>> On Wed, Jul 23, 2014 at 3:18 PM, andrew touchet <adt...@latech.edu> wrote:
>>
>>> Hello,
>>>
>>> I am decommissioning data nodes for an OS upgrade on an HPC cluster.
>>> Currently, users can run jobs that use data stored on /hdfs. They are able
>>> to access all datanodes/compute nodes except the one being decommissioned.
>>>
>>> Is this safe to do? Will edited files affect the decommissioning node?
>>>
>>> I've been adding the nodes to /usr/lib/hadoop-0.20/conf/hosts_exclude
>>> and running 'hadoop dfsadmin -refreshNodes' on the name node, then waiting
>>> for the log files to report completion. After the upgrade, I remove the
>>> node from hosts_exclude and start Hadoop again on the data node.
>>>
>>> Also: under the namenode web interface, I just noticed that the node I
>>> previously decommissioned now shows 0 for Configured Capacity, Used, and
>>> Remaining, and is now 100% Used.
>>>
>>> I used the same /etc/sysconfig/hadoop file from before the upgrade,
>>> removed the node from hosts_exclude, and ran '-refreshNodes' afterwards.
>>>
>>> What steps have I missed in the decommissioning process or while
>>> bringing the data node back online?
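For reference, here is roughly the sequence I have been running for each node. The hostname below is just a placeholder, and the hadoop-daemon.sh path assumes the 0.20 scripts live under /usr/lib/hadoop-0.20/bin, so adjust for your layout:

# On the namenode: add the node to the exclude file and refresh
echo "node05.example.edu" >> /usr/lib/hadoop-0.20/conf/hosts_exclude
hadoop dfsadmin -refreshNodes

# Watch the admin report until the node shows "Decommissioned"
hadoop dfsadmin -report

# After the OS upgrade: edit hosts_exclude by hand to remove the node,
# refresh again, then restart the datanode on the upgraded host
hadoop dfsadmin -refreshNodes
/usr/lib/hadoop-0.20/bin/hadoop-daemon.sh start datanode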