The failover is fine; we are more interested in finding corrupt blocks
sooner rather than later.  Since the datanode already has a thread that
does this, that is good.
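
If you want to see what that scanner has found on a particular node, I
believe each datanode serves a verification report over its HTTP port
(50075 by default), something like:

http://<datanode-host>:50075/blockScannerReport

and appending ?listblocks should show per-block status.  If I have the
property name right, dfs.datanode.scan.period.hours controls how often
each block gets re-verified.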

The replication factor is 3.

Sriram

On Wed, Jan 28, 2009 at 6:45 PM, Sagar Naik <sn...@attributor.com> wrote:
> In addition to the datanode itself finding corrupted blocks (as Owen
> mentioned), if the client finds a corrupted block, it will go to another
> replica.
>
> What's your replication factor?
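>
> For example (the path below is just a placeholder), you can check a
> file's replication with fsck and raise it with setrep if needed:
>
> bin/hadoop fsck /some/path -files
> bin/hadoop fs -setrep -w 3 /some/path
>
> The default for newly created files comes from dfs.replication in
> hadoop-site.xml.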
>
> -Sagar
>
> Sriram Rao wrote:
>>
>> Does this read every block of every file from all replicas and verify
>> that the checksums are good?
>>
>> Sriram
>>
>> On Wed, Jan 28, 2009 at 6:20 PM, Sagar Naik <sn...@attributor.com> wrote:
>>
>>>
>>> Check out fsck
>>>
>>> bin/hadoop fsck <path> -files -blocks -locations
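>>>
>>> To sweep the whole namespace and just pull out the problem entries, a
>>> rough approach (output wording may vary a bit by version) is:
>>>
>>> bin/hadoop fsck / | grep -iE 'corrupt|missing'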
>>>
>>> Sriram Rao wrote:
>>>
>>>>
>>>> By "scrub" I mean a tool that reads every block on a given datanode.
>>>> That way, I'd be able to find corrupted blocks proactively rather than
>>>> having an app read the file and find it.
>>>>
>>>> Sriram
>>>>
>>>> On Wed, Jan 28, 2009 at 5:57 PM, Aaron Kimball <aa...@cloudera.com>
>>>> wrote:
>>>>
>>>>>
>>>>> By "scrub" do you mean delete the blocks from the node?
>>>>>
>>>>> Read your conf/hadoop-site.xml file to determine where dfs.data.dir
>>>>> points, then for each directory in that list, just rm the directory. If
>>>>> you want to ensure that your data is preserved with appropriate
>>>>> replication levels on the rest of your cluster, you should use Hadoop's
>>>>> DataNode Decommission feature to up-replicate the data before you blow a
>>>>> copy away.
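>>>>>
>>>>> Roughly, decommissioning a node looks like this (the exclude file is
>>>>> whatever path dfs.hosts.exclude points at in your conf):
>>>>>
>>>>> echo datanode-host >> /path/to/exclude-file
>>>>> bin/hadoop dfsadmin -refreshNodes
>>>>>
>>>>> then wait for the node to show up as "Decommissioned" in the namenode
>>>>> web UI or in bin/hadoop dfsadmin -report before removing its data.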
>>>>>
>>>>> - Aaron
>>>>>
>>>>> On Wed, Jan 28, 2009 at 2:10 PM, Sriram Rao <srirams...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Is there a tool that one could run on a datanode to scrub all the
>>>>>> blocks on that node?
>>>>>>
>>>>>> Sriram
>>>>>>
>>>>>>
>
