Check out fsck
bin/hadoop fsck <path> -files -blocks -locations
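
To scan the whole namespace and keep only the problem reports, a rough sketch (the CORRUPT/MISSING strings in the grep pattern are an assumption about how your fsck version labels bad blocks; adjust if yours prints them differently):

# full-namespace check, filtered down to lines that flag trouble
bin/hadoop fsck / -files -blocks -locations | grep -iE 'corrupt|missing'

Note that -files -blocks -locations makes the output very large on a big namespace; dropping those flags should still give you the per-file problem messages and the summary.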
Sriram Rao wrote:
By "scrub" I mean, have a tool that reads every block on a given data
node. That way, I'd be able to find corrupted blocks proactively
rather than having an app read the file and find it.
Sriram
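
One way to approximate that kind of scrub from the client side is to force a read of everything, since the HDFS client verifies checksums as it reads and reports bad replicas back to the NameNode. A hedged sketch (the lsr field parsing is an assumption about its output format, and this reads whole files rather than only the replicas held by one particular datanode):

# list every plain file (skip directories), then stream each one to /dev/null
# so the client checksums every block it pulls
bin/hadoop fs -lsr / | awk '$1 !~ /^d/ {print $NF}' | while read f; do
  bin/hadoop fs -cat "$f" > /dev/null
done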
On Wed, Jan 28, 2009 at 5:57 PM, Aaron Kimball <aa...@cloudera.com> wrote:
By "scrub" do you mean delete the blocks from the node?
Read your conf/hadoop-site.xml file to determine where dfs.data.dir points,
then for each directory in that list, just rm the directory. If you want to
ensure that your data is preserved with appropriate replication levels on
the rest of your cluster, you should use Hadoop's DataNode Decommission
feature to up-replicate the data before you blow a copy away.
- Aaron
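
For the decommission step Aaron describes, a hedged sketch (assuming dfs.hosts.exclude is already set in conf/hadoop-site.xml to point at an exclude file; the file path and hostname below are illustrative):

# add the node to the exclude file and tell the NameNode to re-read it
echo "dying-datanode.example.com" >> /path/to/dfs.exclude
bin/hadoop dfsadmin -refreshNodes

# wait until the node is reported as Decommissioned before wiping dfs.data.dir
bin/hadoop dfsadmin -report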
On Wed, Jan 28, 2009 at 2:10 PM, Sriram Rao <srirams...@gmail.com> wrote:
Hi,
Is there a tool that one could run on a datanode to scrub all the
blocks on that node?
Sriram