[ 
https://issues.apache.org/jira/browse/HDFS-17002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721174#comment-17721174
 ] 

farmmamba commented on HDFS-17002:
----------------------------------

Hi [~sodonnell], thanks for your reply. I have run some tests on this 
case, as follows.

Suppose we use the RS-6-3-1024k EC policy, so the file test.txt has the block 
group (d1, d2, d3, d4, d5, d6, r1, r2, r3).

1. echo 0 > r1; echo 0 > r2; echo 0 > r3

2. hdfs dfs -cat test.txt

3. fsck test.txt

I found that the r1, r2, and r3 parity blocks were still unchanged, i.e. they 
had not been regenerated. That is to say, reading an EC file does not promptly 
trigger parity block reconstruction.
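
To make the risk concrete, here is a minimal sketch of the recoverability rule 
for one RS-6-3 block group. It is not HDFS code: the function name 
is_recoverable and the block labels are hypothetical, and it only models the 
general Reed-Solomon property that any 6 healthy blocks out of 9 are enough to 
decode.

```python
# Hypothetical model, not HDFS code: a Reed-Solomon (k data, m parity) block
# group can be decoded as long as at least k of its k + m blocks are healthy.

DATA_UNITS = 6    # RS-6-3: six data blocks d1..d6
PARITY_UNITS = 3  # RS-6-3: three parity blocks r1..r3


def is_recoverable(bad_blocks):
    """Return True if the block group can still be decoded."""
    all_blocks = {f"d{i}" for i in range(1, DATA_UNITS + 1)} | {
        f"r{i}" for i in range(1, PARITY_UNITS + 1)
    }
    healthy = all_blocks - set(bad_blocks)
    return len(healthy) >= DATA_UNITS


if __name__ == "__main__":
    # All three parity blocks overwritten, as in step 1: still readable.
    print(is_recoverable({"r1", "r2", "r3"}))        # True
    # One data block is then lost as well: only 5 healthy blocks remain.
    print(is_recoverable({"r1", "r2", "r3", "d1"}))  # False
```

With all three parity blocks bad, the group has no redundancy left, so a single 
additional data block failure makes the file unreadable. That is why the 
unrepaired parity blocks in the test above matter.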

 

> Erasure coding:Generate parity blocks in time to prevent file corruption
> ------------------------------------------------------------------------
>
>                 Key: HDFS-17002
>                 URL: https://issues.apache.org/jira/browse/HDFS-17002
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: erasure-coding
>    Affects Versions: 3.4.0
>            Reporter: farmmamba
>            Priority: Major
>
> In the current EC implementation, a corrupted parity block is not 
> regenerated in time. 
> Consider the following scenario with the RS-6-3-1024k EC policy:
> If all three parity blocks p1, p2, p3 are corrupted or deleted, we are not 
> aware of it.
> If a data block is also corrupted during this period, the 
> file is corrupted and cannot be recovered by decoding.
>  
> So we should always regenerate a parity block promptly when it is 
> unhealthy.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

