[ https://issues.apache.org/jira/browse/SPARK-43300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Rosen reassigned SPARK-43300:
----------------------------------

    Assignee: Ziqi Liu

> Cascade failure in Guava cache due to fate-sharing
> --------------------------------------------------
>
>                 Key: SPARK-43300
>                 URL: https://issues.apache.org/jira/browse/SPARK-43300
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.4.0
>            Reporter: Ziqi Liu
>            Assignee: Ziqi Liu
>            Priority: Major
>
> Guava cache is widely used in Spark. However, it suffers from fate-sharing
> behavior: if multiple requests try to access the same key in the {{cache}}
> at the same time while the key is not in the cache, Guava cache blocks all
> of them and creates the object only once. If the creation fails, all of the
> requests fail immediately without retry. As a result, a task may fail
> because of an unrelated failure in another query.
> This fate-sharing behavior can lead to unexpected results in some situations.
> We can wrap Guava cache with a KeyLock to synchronize all requests for the
> same key, so that they run one at a time and each one fails independently,
> as if the requests had arrived one after another.
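
The wrapping idea in the description above can be illustrated with a minimal sketch. It is not the change proposed for Spark: a plain ConcurrentHashMap stands in for the Guava cache so that the example has no dependencies, and the names KeyLock.withLock and PerKeyLockedCache are hypothetical and chosen only for illustration. The sketch also assumes the loader never returns null and, for brevity, never removes lock entries.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical per-key lock: callers that use the same key run one at a time. */
final class KeyLock<K> {
    // Simplification for this sketch: lock entries are never removed.
    private final ConcurrentHashMap<K, ReentrantLock> locks = new ConcurrentHashMap<>();

    <T> T withLock(K key, Supplier<T> body) {
        ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
        lock.lock();
        try {
            return body.get();
        } finally {
            lock.unlock(); // released even if body throws
        }
    }
}

/** Hypothetical wrapper: a failed load is seen only by the caller that ran it. */
final class PerKeyLockedCache<K, V> {
    // Stand-in for the Guava cache (assumption, to keep the sketch dependency-free).
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();
    private final KeyLock<K> keyLock = new KeyLock<>();
    private final Function<K, V> loader; // assumed to return non-null values

    PerKeyLockedCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached; // fast path: no locking once the value exists
        }
        return keyLock.withLock(key, () -> {
            // Re-check: an earlier holder of the lock may have loaded the value.
            V again = cache.get(key);
            if (again != null) {
                return again;
            }
            // If this throws, the exception reaches only the current caller.
            // The next waiter acquires the lock and runs the loader itself.
            V loaded = loader.apply(key);
            cache.put(key, loaded);
            return loaded;
        });
    }
}
```

In this sketch, a request that arrives while another load for the same key is in flight waits on the key's lock instead of sharing that load's outcome. If the first load fails, the waiting request retries the load on its own. If it succeeds, the waiting request finds the value on the re-check and returns it without loading again.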