[ 
https://issues.apache.org/jira/browse/SPARK-43300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Rosen reassigned SPARK-43300:
----------------------------------

    Assignee: Ziqi Liu

> Cascade failure in Guava cache due to fate-sharing
> --------------------------------------------------
>
>                 Key: SPARK-43300
>                 URL: https://issues.apache.org/jira/browse/SPARK-43300
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.4.0
>            Reporter: Ziqi Liu
>            Assignee: Ziqi Liu
>            Priority: Major
>
> Guava cache is widely used in Spark. However, it suffers from fate-sharing 
> behavior: if multiple requests try to access the same key in the 
> {{cache}} at the same time and the key is not in the cache, Guava cache 
> blocks all of the requests and creates the object only once. If the creation 
> fails, all of the requests fail immediately without retrying. As a result, a 
> task may fail because of an unrelated failure in another query.
> This fate-sharing behavior might lead to unexpected results in some situations.
> We can wrap the Guava cache with a KeyLock to synchronize all requests 
> for the same key, so that they run individually and fail as if they had 
> arrived one at a time.
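
The wrapping idea described above can be sketched roughly as follows. This is a minimal illustration, not the patch attached to this issue: the class names KeyLock and NonFateSharingCache, the method withLock, and the use of a ConcurrentHashMap as a stand-in for the Guava cache (to keep the example free of external dependencies) are all assumptions made for the sketch. With a real Guava cache, the body run under the lock would presumably be the cache's own get(key, loader) call.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical per-key lock: callers using the same key run one at a time,
// callers using different keys do not block each other.
final class KeyLock<K> {
    // Locks are never removed here, to keep the sketch short.
    private final ConcurrentHashMap<K, ReentrantLock> locks = new ConcurrentHashMap<>();

    <T> T withLock(K key, Callable<T> body) throws Exception {
        ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
        lock.lock();
        try {
            return body.call();
        } finally {
            lock.unlock();
        }
    }
}

// Hypothetical wrapper. The ConcurrentHashMap stands in for the Guava cache.
final class NonFateSharingCache<K, V> {
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();
    private final KeyLock<K> keyLock = new KeyLock<>();

    V get(K key, Callable<V> loader) throws Exception {
        // Fast path: the value is already cached, no lock needed.
        V cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        // Slow path: serialize requests for this key. Each caller that still
        // sees a miss runs its own loader, so a failure in one caller's
        // loader is thrown only to that caller.
        return keyLock.withLock(key, () -> {
            V existing = cache.get(key);
            if (existing != null) {
                return existing;
            }
            V value = loader.call();
            cache.put(key, value);
            return value;
        });
    }
}
```

In this sketch, a loader that throws affects only the request that invoked it; the next request waiting on the same key acquires the lock, finds the key still missing, and runs its own loader, as if the requests had arrived one after another.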



