HDFS data after nodes become unavailable?

2015-02-25 Thread tesm...@gmail.com
Dear all,

I have transferred the data from local storage to HDFS on my 10-node
Hadoop cluster. The replication factor is 3.

Some nodes, say 3, become unavailable after some time. I can't use those
nodes for computation or for storing data.

What will happen to the HDFS data stored on those nodes?

Do I need to remove all the data from HDFS and copy it again?

Regards,


Re: HDFS data after nodes become unavailable?

2015-02-25 Thread Rajesh Kartha
Do you know why the 3 nodes are down? With replication, the copies of the
data that were hosted on the failed nodes will not be available. However,
the data will still be served by the hosts holding the remaining copies, so
I don't think you need to copy the data again.

The exception is any data whose 3 copies all ended up on these nodes;
that data will not be available.
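
To get a rough feel for how likely that worst case is, here is a minimal
sketch. It assumes each block's replicas are placed on distinct nodes chosen
uniformly at random, and that all failed nodes go down before HDFS has had
time to re-replicate. Real HDFS placement is rack-aware, so treat the number
as an estimate only. The function name p_block_lost is made up for this
example.

```python
from math import comb


def p_block_lost(nodes: int, failed: int, replication: int) -> float:
    """Probability that every replica of one block sits on a failed node.

    Assumption: replicas are placed on `replication` distinct nodes chosen
    uniformly at random (not the real rack-aware HDFS policy).
    """
    if replication > nodes:
        raise ValueError("replication factor cannot exceed the node count")
    # Ways to put all replicas on failed nodes / ways to place them anywhere.
    return comb(failed, replication) / comb(nodes, replication)


# The cluster from this thread: 10 nodes, 3 down, replication factor 3.
p = p_block_lost(nodes=10, failed=3, replication=3)
print(f"{p:.4f}")  # 0.0083, i.e. 1 block in 120
```

Under those assumptions about 1 block in 120 would lose all of its copies.
If fewer nodes fail than the replication factor, the result is 0.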

Maybe you could run 'hadoop fsck /' to confirm whether HDFS is healthy.
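
If you want to script that check, something like the following might work.
It is only a sketch: the helper names run_fsck and summarize_fsck are made
up here, and the parsing assumes the usual fsck summary lines
("Under-replicated blocks:", "Corrupt blocks:", and a final "is HEALTHY" or
"is CORRUPT" status), whose exact layout can differ between Hadoop versions.

```python
import re
import subprocess


def run_fsck(path: str = "/") -> str:
    """Run `hadoop fsck <path>` and return the text report.

    Assumes the `hadoop` command is on PATH.
    """
    result = subprocess.run(
        ["hadoop", "fsck", path],
        capture_output=True,
        text=True,
        check=False,  # do not rely on the exit code; read the report instead
    )
    return result.stdout


def _count(label: str, report: str):
    """Return the integer after `label:` in the report, or None if absent."""
    match = re.search(rf"^\s*{re.escape(label)}:\s*(\d+)", report, re.MULTILINE)
    return int(match.group(1)) if match else None


def summarize_fsck(report: str) -> dict:
    """Pull the overall status and two block counters out of an fsck report."""
    return {
        "healthy": "is HEALTHY" in report,
        "under_replicated": _count("Under-replicated blocks", report),
        "corrupt": _count("Corrupt blocks", report),
    }
```

Calling summarize_fsck(run_fsck("/")) on a cluster node would then give a
small dict to look at. A non-zero under-replicated count right after the
nodes go down is expected, and should shrink as HDFS re-replicates the
affected blocks onto the remaining nodes.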

-Rajesh
