A retry would need to close the current data/crc streams, open new data/crc
streams against a different replica, and seek back to the current position.
That should do it. We should then add this to the description of HADOOP-855.
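
Very roughly, the retry could look like the sketch below. This is only an
illustration, not the actual DFSClient code: the class, fields, and helper
methods (readVerified, openStreams, seekTo) are all hypothetical names, and a
real implementation would catch a checksum-specific exception rather than
plain IOException.

    import java.io.IOException;
    import java.io.InputStream;
    import java.util.Iterator;

    // Hypothetical sketch of a read that retries on another replica
    // after a checksum failure. None of these names are real DFSClient
    // internals.
    class ReplicaRetryReader {
        private InputStream dataIn;        // current block data stream
        private InputStream crcIn;         // current checksum stream
        private long pos;                  // current offset in the file
        private Iterator<String> replicas; // replicas not yet tried

        int readWithRetry(byte[] buf, int off, int len) throws IOException {
            while (true) {
                try {
                    return readVerified(buf, off, len);
                } catch (IOException checksumError) {
                    if (!replicas.hasNext()) {
                        throw checksumError; // every replica was bad
                    }
                    // The retry itself: close the current data/crc
                    // streams, open new ones against another replica,
                    // and seek back to where we were.
                    dataIn.close();
                    crcIn.close();
                    openStreams(replicas.next());
                    seekTo(pos);
                }
            }
        }

        // Placeholders for the real stream plumbing.
        private int readVerified(byte[] b, int off, int len)
                throws IOException {
            throw new UnsupportedOperationException("sketch only");
        }
        private void openStreams(String replica) throws IOException {
            throw new UnsupportedOperationException("sketch only");
        }
        private void seekTo(long offset) throws IOException {
            throw new UnsupportedOperationException("sketch only");
        }
    }

The catch block is the whole idea; on a checksum failure the bad replica
could also be reported to the namenode for deletion, as HADOOP-855 proposes,
before moving on to the next one.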

Hairong

-----Original Message-----
From: Sameer Paranjpye [mailto:[EMAIL PROTECTED] 
Sent: Thursday, January 04, 2007 2:51 PM
To: [email protected]
Subject: Re: [jira] Resolved: (HADOOP-731) Sometimes when a dfs file is
accessed and one copy has a checksum error the I/O command fails, even if
another copy is alright.

Shouldn't the fix for HADOOP-855 also include a retry on a different
replica? That was my understanding...

Hairong Kuang wrote:
> I feel that HADOOP-731 is not a duplicate of HADOOP-855. The proposal
> for HADOOP-855 is to report the corrupted data block/checksum block to
> the namenode for deletion. That solution helps the next read get the
> correct data, but the current read still throws a checksum error and
> thus fails the cp/get operation that calls read.
> 
> Hairong
> 
> -----Original Message-----
> From: Sameer Paranjpye (JIRA) [mailto:[EMAIL PROTECTED]
> Sent: Thursday, January 04, 2007 1:45 PM
> To: [email protected]
> Subject: [jira] Resolved: (HADOOP-731) Sometimes when a dfs file is 
> accessed and one copy has a checksum error the I/O command fails, even 
> if another copy is alright.
> 
> 
>      [ https://issues.apache.org/jira/browse/HADOOP-731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
> 
> Sameer Paranjpye resolved HADOOP-731.
> -------------------------------------
> 
>     Resolution: Duplicate
> 
> Duplicate of HADOOP-855
> 
>> Sometimes when a dfs file is accessed and one copy has a checksum error
>> the I/O command fails, even if another copy is alright.
>> ------------------------------------------------------------------------
>>
>>                 Key: HADOOP-731
>>                 URL: https://issues.apache.org/jira/browse/HADOOP-731
>>             Project: Hadoop
>>          Issue Type: Bug
>>          Components: dfs
>>    Affects Versions: 0.7.2
>>            Reporter: Dick King
>>         Assigned To: Sameer Paranjpye
>>
>> for a particular file [alas, the file no longer exists -- I had to
>> progress]
>>     $dfs -cp foo bar        
>> and
>>     $dfs -get foo local
>> failed on a checksum error.  The dfs browser's download function
>> retrieved the file, so either that function doesn't check, or more
>> likely the download function got a different copy.
>> When a checksum fails on one copy of a file that is redundantly
>> stored, I would prefer that dfs try a different copy, mark the bad one
>> as not existing [which should induce a fresh copy being made from one
>> of the good copies eventually], and make the call continue to work and
>> deliver bytes.
>> Ideally, if all copies have checksum errors but it's possible to piece
>> together a good copy, I would like that to be done.
>> -dk
> 

