Check out fsck
bin/hadoop fsck <path> -files -blocks -locations
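
To scan the whole namespace and keep only the problem reports, a rough sketch (the CORRUPT/MISSING strings in the grep pattern are an assumption about how your fsck version labels bad blocks; adjust if yours prints them differently):

# full-namespace check, filtered down to lines that flag trouble
bin/hadoop fsck / -files -blocks -locations | grep -iE 'corrupt|missing'

Note that -files -blocks -locations makes the output very large on a big namespace; dropping those flags should still give you the per-file problem messages and the summary.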
Sriram Rao wrote:
By "scrub" I mean, have a tool that reads every block on a given data
node. That way, I'd be able to find corrupted blocks proactively
rather than having an app read the file and find it.
Sriram
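
One way to approximate that kind of scrub from the client side is to force a read of everything, since the HDFS client verifies checksums as it reads and reports bad replicas back to the NameNode. A hedged sketch (the lsr field parsing is an assumption about its output format, and this reads whole files rather than only the replicas held by one particular datanode):

# list every plain file (skip directories), then stream each one to /dev/null
# so the client checksums every block it pulls
bin/hadoop fs -lsr / | awk '$1 !~ /^d/ {print $NF}' | while read f; do
  bin/hadoop fs -cat "$f" > /dev/null
done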
On Wed, Jan 28, 2009 at 5:57 PM, Aaron Kimball <aa...@cloudera.com> wrote:
By "scrub" do you mean delete the blocks from the node?
Read your conf/hadoop-site.xml file to determine where dfs.data.dir points,
then for each directory in that list, just rm the directory. If you want to
ensure that your data is preserved with appropriate replication levels on
the rest of your cluster, you should use Hadoop's DataNode Decommission
feature to up-replicate the data before you blow a copy away.
- Aaron
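
For the decommission step Aaron describes, a hedged sketch (assuming dfs.hosts.exclude is already set in conf/hadoop-site.xml to point at an exclude file; the file path and hostname below are illustrative):

# add the node to the exclude file and tell the NameNode to re-read it
echo "dying-datanode.example.com" >> /path/to/dfs.exclude
bin/hadoop dfsadmin -refreshNodes

# wait until the node is reported as Decommissioned before wiping dfs.data.dir
bin/hadoop dfsadmin -report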
On Wed, Jan 28, 2009 at 2:10 PM, Sriram Rao <srirams...@gmail.com> wrote:
Hi,
Is there a tool that one could run on a datanode to scrub all the
blocks on that node?
Sriram