In my Hadoop cluster, I've had several drives fail lately (and they've
been replaced).  Each time a new empty drive is placed in the cluster,
I run the balancer.

I understand that the balancer will redistribute the load of file
blocks across the nodes.

My question is: will the balancer also look at the desired replication
factor of a file and, if the actual replication is lower than desired
(because the file had blocks stored on the lost drive), re-replicate
those lost blocks?

If not, is there another tool that will ensure the desired replication
factor of files is satisfied?

If this functionality doesn't exist, I'm concerned that I'm slowly,
silently losing my files as I replace drives, and I may not even
realize it.

Thoughts?
