[ https://issues.apache.org/jira/browse/HADOOP-19180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17870318#comment-17870318 ]
ASF GitHub Bot commented on HADOOP-19180:
-----------------------------------------

zhengchenyu opened a new pull request, #6813:
URL: https://github.com/apache/hadoop/pull/6813

### Description of PR

I found that if erasedIndexes is ordered so that a parity index comes before a data index, EC produces wrong results when decoding. [HDFS-15186](https://issues.apache.org/jira/browse/HDFS-15186) already described this problem, but did not fundamentally solve it.

The reason is that the code assumes erasedIndexes lists the data indexes first, followed by the parity indexes. If the caller passes a parity index ahead of a data index, a calculation error occurs.

When we decode a data unit, we multiply the existing (non-erased) data by the decoding matrix. (See the formula in section 1.2 of the [doc](https://zhengchenyu.github.io/2024/05/17/ErasuceCode%E7%AE%97%E6%B3%95%E5%AE%9E%E7%8E%B0/).)

When we decode a parity unit, we multiply the existing data by the decoding matrix to get the data units, then multiply those by the encoding matrix. (See the formulas in sections 1.1 and 1.2 of the [doc](https://zhengchenyu.github.io/2024/05/17/ErasuceCode%E7%AE%97%E6%B3%95%E5%AE%9E%E7%8E%B0/).)

The calculations for parity units and data units are therefore different. Because they are performed in two separate loops, the code requires that the data indexes precede the parity indexes.

### How was this patch tested?

The TestErasureCodingEncodeAndDecode unit test and the erasure_code_test binary were executed on different machines. The test machines include those with isa-l installed and those without isa-l installed.

### For code changes:

- Make erasedIndexes support arbitrary index distribution.
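To make the ordering assumption concrete, here is a minimal sketch of how a "data rows first, then parity rows" decoder can return results in the wrong slots when a parity index precedes a data index. It is not the Hadoop implementation: the class and method names are hypothetical, the code is a toy 2+2 scheme over plain integers rather than GF(2^8), and the surviving units are fixed to d1 and p1 so that the recovery rows can be written by hand.

```java
import java.util.Arrays;

// Toy illustration only; all names here are hypothetical and not taken from Hadoop.
class ErasedIndexOrderSketch {
    static final int NUM_DATA = 2; // unit indexes 0..1 are data, 2..3 are parity

    // Toy encoding matrix over plain integers: p0 = d0 + d1, p1 = d0 + 2*d1.
    static final long[][] ENCODE = {{1, 1}, {1, 2}};

    // Assumption: the survivors are always [d1, p1] (unit indexes 1 and 3).
    // Then d0 = p1 - 2*d1 and d1 = d1, which gives these "decoding matrix" rows.
    static final long[][] DATA_FROM_SURVIVORS = {{-2, 1}, {1, 0}};

    static long dot(long[] row, long[] v) {
        long sum = 0;
        for (int i = 0; i < row.length; i++) {
            sum += row[i] * v[i];
        }
        return sum;
    }

    // Row that rebuilds unit `index` directly from the survivors.
    static long[] recoveryRow(int index) {
        if (index < NUM_DATA) {
            return DATA_FROM_SURVIVORS[index]; // data unit: decoding row only
        }
        // Parity unit: encoding row applied on top of the decoding rows.
        long[] enc = ENCODE[index - NUM_DATA];
        long[] row = new long[2];
        for (int d = 0; d < NUM_DATA; d++) {
            for (int s = 0; s < 2; s++) {
                row[s] += enc[d] * DATA_FROM_SURVIVORS[d][s];
            }
        }
        return row;
    }

    // Order-sensitive variant: every erased data unit is written first, then
    // every erased parity unit, regardless of where they sit in `erased`.
    static long[] decodeTwoLoops(int[] erased, long[] survivors) {
        long[] out = new long[erased.length];
        int pos = 0;
        for (int e : erased) {
            if (e < NUM_DATA) out[pos++] = dot(recoveryRow(e), survivors);
        }
        for (int e : erased) {
            if (e >= NUM_DATA) out[pos++] = dot(recoveryRow(e), survivors);
        }
        return out;
    }

    // Order-preserving variant: out[i] always corresponds to erased[i].
    static long[] decodeInCallerOrder(int[] erased, long[] survivors) {
        long[] out = new long[erased.length];
        for (int i = 0; i < erased.length; i++) {
            out[i] = dot(recoveryRow(erased[i]), survivors);
        }
        return out;
    }

    public static void main(String[] args) {
        long d0 = 5, d1 = 7;
        long p0 = d0 + d1;     // 12 (erased below)
        long p1 = d0 + 2 * d1; // 19
        long[] survivors = {d1, p1};
        int[] erased = {2, 0}; // parity index listed before data index

        // The caller expects [p0, d0] = [12, 5].
        System.out.println(Arrays.toString(decodeTwoLoops(erased, survivors)));      // [5, 12]
        System.out.println(Arrays.toString(decodeInCallerOrder(erased, survivors))); // [12, 5]
    }
}
```

With erasedIndexes given as {0, 2} both variants agree, which is why the problem only shows up for the "special" index order the issue title refers to.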
> EC: Fix calculation errors caused by special index order
> --------------------------------------------------------
>
>                 Key: HADOOP-19180
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19180
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Chenyu Zheng
>            Assignee: Chenyu Zheng
>            Priority: Critical
>              Labels: pull-request-available
>
> I found that if the erasedIndexes distribution is such that the parity index
> is in front of the data index, ec will produce wrong results when decoding.
> In fact, HDFS-15186 has described this problem, but does not fundamentally
> solve it.
> The reason is that the code assumes that erasedIndexes is preceded by the
> data index and followed by parity index. If there is a parity index placed in
> front of the data index, a calculation error will occur.