wuyi created SPARK-35275:
----------------------------

             Summary: Add checksum for shuffle blocks
                 Key: SPARK-35275
                 URL: https://issues.apache.org/jira/browse/SPARK-35275
             Project: Spark
          Issue Type: New Feature
          Components: Spark Core
    Affects Versions: 3.2.0
            Reporter: wuyi


Shuffle data corruption is a long-standing issue in Spark. For example, in
SPARK-18105, people continually report corruption issues. However, data
corruption is difficult to reproduce in most cases, and its root cause is even
harder to identify. We don't know whether it's a Spark issue or not. With
checksum support for shuffle blocks, Spark itself can at least distinguish
between disk and network as the cause, which is very important for users.
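
As a rough illustration of how a checksum could separate the two causes, here
is a minimal sketch. It assumes a checksum is recorded when the block is
written, and that on a read failure both the bytes on disk and the bytes
received by the reader can be re-checksummed. The class name, the method names,
the Cause enum and the choice of Adler32 are all hypothetical and are not taken
from this issue.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Adler32;

public class ShuffleChecksumSketch {
    // Hypothetical classification of a corrupted shuffle block.
    enum Cause { DISK_ISSUE, NETWORK_ISSUE, UNKNOWN_ISSUE }

    // Checksum of a whole block; Adler32 is only an illustrative choice.
    static long checksum(byte[] data) {
        Adler32 adler = new Adler32();
        adler.update(data, 0, data.length);
        return adler.getValue();
    }

    // checksumAtWrite: value recorded when the map task wrote the block.
    // bytesOnDisk:     what the serving side currently reads from disk.
    // bytesReceived:   what the reducer actually received.
    static Cause diagnose(long checksumAtWrite, byte[] bytesOnDisk, byte[] bytesReceived) {
        if (checksum(bytesReceived) == checksumAtWrite) {
            // The received bytes match what was written, so the failure
            // comes from somewhere the checksum cannot see.
            return Cause.UNKNOWN_ISSUE;
        }
        if (checksum(bytesOnDisk) != checksumAtWrite) {
            // The data changed after it was written to disk.
            return Cause.DISK_ISSUE;
        }
        // Disk is intact but the reader got different bytes.
        return Cause.NETWORK_ISSUE;
    }

    public static void main(String[] args) {
        byte[] written = "shuffle block".getBytes(StandardCharsets.UTF_8);
        long recorded = checksum(written);

        // Simulate a single bit flipped in transit.
        byte[] received = written.clone();
        received[0] ^= 1;

        System.out.println(diagnose(recorded, written, received));
    }
}
```

The sketch compares the reader's checksum first, so an uncorrupted block is
never blamed on disk or network; only a mismatch triggers the disk check.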



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
