Mike, you might want to look at the -move option in fsck.

bash-3.00$ hadoop fsck
Usage: DFSck <path> [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]
        <path>  start checking from this path
        -move   move corrupted files to /lost+found
        -delete delete corrupted files
        -files  print out files being checked
        -openforwrite   print out files opened for write
        -blocks print out block report
        -locations      print out locations for every block
        -racks  print out network topology for data-node locations
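For example (the path here is just a placeholder; point it at your own
data), you can first see what is corrupt and then move the damaged files
aside:

bash-3.00$ hadoop fsck /user/mike/data -files -blocks -locations
bash-3.00$ hadoop fsck /user/mike/data -move

The first command prints per-file status along with the blocks and their
locations, so you can see exactly which files are affected; the second
moves the corrupt files to /lost+found on the DFS.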



I never use it, since I would rather have users' jobs fail than succeed
with incomplete inputs.
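If you just want to know what is corrupt without touching anything, a
quick filter over the fsck report works too (the grep pattern is only a
guess at the report wording, which may vary across versions):

bash-3.00$ hadoop fsck / | grep -i corrupt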

Koji


-----Original Message-----
From: Aaron Kimball [mailto:aa...@cloudera.com] 
Sent: Thursday, March 26, 2009 9:41 AM
To: core-user@hadoop.apache.org
Subject: Re: corrupt unreplicated block in dfs (0.18.3)

Just because a block is corrupt doesn't mean the entire file is corrupt.
Furthermore, the presence or absence of a file in the namespace is a
completely separate issue from the data in the file. I think it would be
a surprising interface change if files suddenly disappeared just because
one out of potentially many blocks was corrupt.

- Aaron

On Thu, Mar 26, 2009 at 1:21 PM, Mike Andrews <m...@xoba.com> wrote:

> I noticed that when a file with no replication (i.e., replication=1)
> develops a corrupt block, Hadoop takes no action aside from the
> datanode throwing an exception to the client trying to read the file.
> I manually corrupted a block in order to observe this.
>
> Obviously, with replication=1 it's impossible to repair the block, but
> I thought perhaps Hadoop would take some other action, such as deleting
> the file outright, moving it to a "corrupt" directory, or marking it or
> keeping track of it somehow to note that there's unfixable corruption
> in the filesystem. As it stands, the current behaviour sweeps the
> corruption under the rug and allows it to persist, aside from notifying
> the specific client doing the read with an exception.
>
> If anyone has any information about this issue or how to work around
> it, please let me know.
>
> On the other hand, I tested that corrupting a block in a replication=3
> file causes Hadoop to re-replicate the block from another existing
> copy, which is good and what I expected.
>
> best,
> mike
>
>
> --
> permanent contact information at http://mikerandrews.com
>
