[jira] [Commented] (HADOOP-19180) EC: Fix calculation errors caused by special index order

ASF GitHub Bot (Jira) Thu, 01 Aug 2024 19:20:03 -0700


    [ 
https://issues.apache.org/jira/browse/HADOOP-19180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17870318#comment-17870318
 ]


ASF GitHub Bot commented on HADOOP-19180:
-----------------------------------------

zhengchenyu opened a new pull request, #6813:
URL: https://github.com/apache/hadoop/pull/6813

   ### Description of PR
   
   I found that when erasedIndexes lists a parity index before a data index, 
EC decoding produces wrong results.
   
   [HDFS-15186](https://issues.apache.org/jira/browse/HDFS-15186) already 
described this problem, but did not fix its root cause.
   
   The reason is that the code assumes erasedIndexes lists all data indexes 
first, followed by all parity indexes. If the caller passes a parity index 
ahead of a data index, a calculation error occurs.
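
To make the problematic ordering concrete, here is a minimal sketch of a check for it. It is not code from the patch: the class and method names are hypothetical, and it assumes the usual layout in which indexes below `numDataUnits` are data units and the remaining indexes are parity units.

```java
// Hypothetical helper, for illustration only; not part of Hadoop.
final class ErasedIndexOrder {
    private ErasedIndexOrder() {}

    /**
     * Returns true if any parity index appears before a data index.
     * Assumption: indexes in [0, numDataUnits) are data units and
     * indexes >= numDataUnits are parity units.
     */
    static boolean hasParityBeforeData(int[] erasedIndexes, int numDataUnits) {
        boolean seenParity = false;
        for (int index : erasedIndexes) {
            if (index >= numDataUnits) {
                seenParity = true;      // remember that a parity index was seen
            } else if (seenParity) {
                return true;            // data index found after a parity index
            }
        }
        return false;
    }
}
```

With a 6+3 layout, for example, `{0, 6}` is in the order the code expects, while `{6, 0}` is the kind of input that triggers the miscalculation described here.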
   
   To decode a data unit, we multiply the surviving units by the decoding 
matrix (see formula 1.2 in the 
[doc](https://zhengchenyu.github.io/2024/05/17/ErasuceCode%E7%AE%97%E6%B3%95%E5%AE%9E%E7%8E%B0/)).
   To decode a parity unit, we multiply the surviving units by the decoding 
matrix to recover the data units, then multiply the result by the encoding 
matrix (see formulas 1.1 and 1.2 in the 
[doc](https://zhengchenyu.github.io/2024/05/17/ErasuceCode%E7%AE%97%E6%B3%95%E5%AE%9E%E7%8E%B0/)).
   So data units and parity units are calculated differently. Because the two 
calculations run in two separate loops, the code requires every data index to 
precede every parity index.
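
One way to keep two separate loops without constraining the caller's order is to record, for each erased index, the position of its output buffer. The sketch below illustrates that idea only; it is not the code in this PR, the names are hypothetical, and it again assumes that indexes below `numDataUnits` are data units.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Hadoop.
final class ErasedIndexPartition {
    /** Positions within erasedIndexes that refer to data units. */
    final int[] dataPositions;
    /** Positions within erasedIndexes that refer to parity units. */
    final int[] parityPositions;

    private ErasedIndexPartition(int[] dataPositions, int[] parityPositions) {
        this.dataPositions = dataPositions;
        this.parityPositions = parityPositions;
    }

    /**
     * Splits the positions of erasedIndexes into data and parity groups,
     * preserving the caller's order inside each group.
     * Assumption: indexes in [0, numDataUnits) are data units.
     */
    static ErasedIndexPartition of(int[] erasedIndexes, int numDataUnits) {
        List<Integer> data = new ArrayList<>();
        List<Integer> parity = new ArrayList<>();
        for (int pos = 0; pos < erasedIndexes.length; pos++) {
            if (erasedIndexes[pos] < numDataUnits) {
                data.add(pos);
            } else {
                parity.add(pos);
            }
        }
        return new ErasedIndexPartition(toArray(data), toArray(parity));
    }

    private static int[] toArray(List<Integer> list) {
        return list.stream().mapToInt(Integer::intValue).toArray();
    }
}
```

A decoder built this way could run the data-unit loop over `dataPositions` and the parity-unit loop over `parityPositions`, writing each result to the output buffer at the recorded position, so the result would not depend on where the parity indexes sit in `erasedIndexes`.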
   
   ### How was this patch tested?
   
   The TestErasureCodingEncodeAndDecode unit test and the erasure_code_test 
binary were run on several machines, both with and without isa-l installed.
   
   ### For code changes:
   
   - Make erasedIndexes support arbitrary index distribution.
   
   




> EC: Fix calculation errors caused by special index order
> --------------------------------------------------------
>
>                 Key: HADOOP-19180
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19180
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Chenyu Zheng
>            Assignee: Chenyu Zheng
>            Priority: Critical
>              Labels: pull-request-available
>
> I found that when erasedIndexes lists a parity index before a data index, 
> EC decoding produces wrong results.
> HDFS-15186 already described this problem, but did not fix its root cause.
> The reason is that the code assumes erasedIndexes lists all data indexes 
> first, followed by all parity indexes. If a parity index is placed ahead of 
> a data index, a calculation error occurs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
