[ https://issues.apache.org/jira/browse/HDFS-17002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721174#comment-17721174 ]
farmmamba commented on HDFS-17002:
----------------------------------

Hi, [~sodonnell], thanks for your reply. I have run some tests on this case, as follows.

Suppose we use the RS-6-3-1024k EC policy, and file test.txt has blocks (d1, d2, d3, d4, d5, d6, r1, r2, r3):

1. echo 0 > r1; echo 0 > r2; echo 0 > r3
2. hdfs dfs -cat test.txt
3. hdfs fsck test.txt

I found that the r1, r2, r3 parity blocks were still the old ones. That is to say, reading an EC file does not trigger parity block reconstruction promptly.

> Erasure coding: Generate parity blocks in time to prevent file corruption
> -------------------------------------------------------------------------
>
>                 Key: HDFS-17002
>                 URL: https://issues.apache.org/jira/browse/HDFS-17002
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: erasure-coding
>    Affects Versions: 3.4.0
>            Reporter: farmmamba
>            Priority: Major
>
> In the current EC implementation, a corrupted parity block is not
> regenerated in time.
> Consider the following scenario with the RS-6-3-1024k EC policy:
> If all three parity blocks p1, p2, p3 are corrupted or deleted, we are not
> aware of it.
> If, unfortunately, a data block is also corrupted in this time period, the
> file becomes corrupted and can no longer be read by decoding.
>
> So we should always regenerate a parity block promptly when it becomes
> unhealthy.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
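The risk described in the issue follows from Reed-Solomon arithmetic: an RS(k, m) stripe is decodable only while at least k of its k+m blocks are intact, so RS-6-3 tolerates any 3 erasures but not 4. A minimal sketch of that recoverability check (plain Python for illustration, not HDFS code; the function name is hypothetical):

```python
# RS(k, m) stripe: k data blocks plus m parity blocks.
# A stripe can be decoded iff at least k of the k+m blocks survive,
# i.e. iff the number of lost blocks is at most m.
def stripe_recoverable(k: int, m: int, lost: int) -> bool:
    """Return True if a stripe with `lost` missing/corrupt blocks is decodable."""
    return (k + m) - lost >= k

# RS-6-3: losing all three parity blocks is still tolerable...
assert stripe_recoverable(6, 3, 3) is True
# ...but one additional data-block failure pushes the stripe past its
# erasure budget, which is why stale parity should be regenerated promptly.
assert stripe_recoverable(6, 3, 4) is False
```

This is why silently corrupt parity is dangerous even while the file still reads fine: the stripe is operating with zero remaining fault tolerance.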