Hey all,
I have a problem with invalid blocks which Hadoop doesn't seem to
realize are invalid.
For some reason, a lot of our blocks got truncated to 192KB
(HADOOP-4543). When I try to drain off nodes, Hadoop tries to
re-replicate these blocks using the truncated copies as the source.
The namenode correctly realizes that the destination copy is
incorrectly sized, but it takes no action to remove the invalid
source copy, so the process just repeats indefinitely.
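For what it's worth, the truncated replicas are easy to spot on the
datanodes themselves; something like this (the dfs.data.dir path is
just an example from our setup) turns them up:

    # on a datanode: block files sitting at exactly 192KB
    find /data/hadoop/dfs/data/current -name 'blk_*' ! -name '*.meta' -size 192k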
I have the list of the incorrectly sized blocks, and I certainly want
to fix my cluster before a patch is written, tested, and released.
How do I tell the namenode to forget about the source blocks, or at
least trigger a validation attempt on them?
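In case it matters, I can already map each bad block ID back to its
file and the datanodes holding it with something like this (the block
ID below is just a placeholder):

    hadoop fsck / -files -blocks -locations | grep blk_1234567890

so I have the information in hand; I just don't know how to act on it.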
If I have a datanode that I'm suspicious of, how do I tell Hadoop to
verify *all* blocks on that node? I know block verification happens
slowly in the background, but I don't want to wait weeks to find out
that a block has a problem.
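The only handle I've found so far is the datanode's block scanner
report, which I've been polling with something like this (the
hostname and the default 50075 port are assumptions about our setup):

    curl http://datanode01.example.com:50075/blockScannerReport

but at the default scan period it will take weeks to cover the whole
node.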
Brian