[ https://issues.apache.org/jira/browse/HDFS-17002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721199#comment-17721199 ]

farmmamba edited comment on HDFS-17002 at 5/10/23 5:58 AM:
-----------------------------------------------------------

[~ayushtkn] , Yes, sir. This is not a bug; the type of this Jira is just an 
improvement. And yes, the client won't read parity blocks when all data blocks 
are healthy.

When the DirectoryScanner is not running, we know nothing about the parity 
blocks even if they have been corrupted.

So, I am thinking about whether we should sample-check the correctness of the 
parity blocks with some probability when reading EC files, or use some other 
method to keep the parity blocks from breaking down silently.
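
For example, here is a minimal sketch of what I mean by sampling (the class 
name and the probability value are only illustrative, not existing HDFS code):

{code:java}
// Sketch only: ParityVerificationSampler is an illustrative name, not an
// existing HDFS class, and the probability would come from a (hypothetical)
// client/DataNode configuration key.
import java.util.concurrent.ThreadLocalRandom;

public class ParityVerificationSampler {

  /** Fraction of EC reads that should also verify parity blocks, e.g. 0.01 for ~1%. */
  private final double sampleProbability;

  public ParityVerificationSampler(double sampleProbability) {
    this.sampleProbability = sampleProbability;
  }

  /** Returns true if this EC read should additionally checksum the parity blocks. */
  public boolean shouldVerifyParity() {
    return ThreadLocalRandom.current().nextDouble() < sampleProbability;
  }
}
{code}

When it returns true, the read path could checksum the parity replicas in the 
background and report any corrupt block for reconstruction, instead of waiting 
for the DirectoryScanner to notice.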

 

What is your opinion? Looking forward to your reply.



> Erasure coding: Generate parity blocks in time to prevent file corruption
> ------------------------------------------------------------------------
>
>                 Key: HDFS-17002
>                 URL: https://issues.apache.org/jira/browse/HDFS-17002
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: erasure-coding
>    Affects Versions: 3.4.0
>            Reporter: farmmamba
>            Priority: Major
>
> In the current EC implementation, a corrupted parity block will not be 
> regenerated in time.
> Consider the following scenario with the RS-6-3-1024k EC policy:
> if the three parity blocks p1, p2, p3 are all corrupted or deleted, we are 
> not aware of it.
> If, unfortunately, a data block is also corrupted during this time period, 
> then the file becomes corrupted and can no longer be read by decoding (see 
> the arithmetic sketch below).
>  
> So, we should always regenerate a parity block in time when it becomes 
> unhealthy.
>  
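
For reference, a small arithmetic sketch of the RS-6-3 scenario above (it only 
assumes the standard property that an RS-6-3 block group can tolerate at most 
three missing blocks; the class and variable names are illustrative, not HDFS 
code):

{code:java}
// Arithmetic sketch of the RS-6-3 loss scenario described above.
public class Rs63LossSketch {
  public static void main(String[] args) {
    int dataBlocks = 6;    // RS-6-3: 6 data blocks per block group
    int parityBlocks = 3;  // plus 3 parity blocks
    int missing = 3 + 1;   // p1, p2, p3 lost, then one data block corrupted

    // Reed-Solomon can reconstruct at most `parityBlocks` missing blocks per group.
    boolean recoverable = missing <= parityBlocks;
    System.out.println("group size = " + (dataBlocks + parityBlocks)
        + ", missing = " + missing
        + ", recoverable = " + recoverable);   // prints recoverable = false
  }
}
{code}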



