[ 
https://issues.apache.org/jira/browse/SPARK-48694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Ma updated SPARK-48694:
---------------------------
    Description: 
We use Spark together with a third-party file source cache, which is an 
independent library with its own internal logic for cache entry creation, 
eviction, and removal. Currently we allocate dedicated memory for this cache, 
but that memory cannot be shared with Spark execution/storage memory. Memory 
would be used more effectively if we could account for it in the 
_*UnifiedMemoryManager*_ and include it in the memory spill logic.

We also need memory management for a native RDD cache implementation. The 
existing interfaces in _*MemoryStore*_ are tightly bound to Spark's 
SerializerManager and BlockEvictionHandler, so they are not easy to extend for 
such a customized RDD cache.

To meet both requirements, we can provide a common memory register for 
external memory usage, beyond the current RDD cache in storage memory. 
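
To make the idea of a common memory register more concrete, here is a rough 
sketch of what such a component might look like. It is illustrative only: 
ExternalMemoryRegistry, ExternalConsumer, acquire, release and evict are 
hypothetical names that do not exist in Spark, and the sketch does not touch 
the real UnifiedMemoryManager or MemoryStore. It only shows the general shape: 
an external cache reports its usage to a shared budget and exposes an eviction 
callback, so the budget holder can reclaim memory under pressure.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a shared register for external memory; not a Spark API. */
public class ExternalMemoryRegistry {

    /** Callback implemented by an external cache so the registry can reclaim memory. */
    public interface ExternalConsumer {
        /**
         * Try to free up to the given number of bytes and return how many were freed.
         * Assumption: implementations do not call back into the registry from here.
         */
        long evict(long bytes);
    }

    private final long capacity;   // total budget shared by all external consumers
    private long used = 0;         // bytes currently accounted for
    private final Map<ExternalConsumer, Long> usage = new LinkedHashMap<>();

    public ExternalMemoryRegistry(long capacity) {
        this.capacity = capacity;
    }

    /** Account for bytes on behalf of a consumer, evicting from other consumers if needed. */
    public synchronized boolean acquire(ExternalConsumer consumer, long bytes) {
        long shortfall = used + bytes - capacity;
        for (Map.Entry<ExternalConsumer, Long> entry : usage.entrySet()) {
            if (shortfall <= 0) {
                break;
            }
            if (entry.getKey() == consumer || entry.getValue() == 0) {
                continue;
            }
            long request = Math.min(shortfall, entry.getValue());
            // Clamp the reported value so a misbehaving callback cannot corrupt the books.
            long freed = Math.max(0, Math.min(entry.getKey().evict(request), request));
            entry.setValue(entry.getValue() - freed);
            used -= freed;
            shortfall -= freed;
        }
        if (used + bytes > capacity) {
            return false;          // not enough memory even after eviction
        }
        usage.merge(consumer, bytes, Long::sum);
        used += bytes;
        return true;
    }

    /** Return bytes previously acquired by a consumer to the shared budget. */
    public synchronized void release(ExternalConsumer consumer, long bytes) {
        long held = usage.getOrDefault(consumer, 0L);
        long freed = Math.min(bytes, held);
        if (freed == held) {
            usage.remove(consumer);
        } else {
            usage.put(consumer, held - freed);
        }
        used -= freed;
    }

    public synchronized long used() {
        return used;
    }
}
```

In this sketch, a file source cache and a native RDD cache would each 
implement ExternalConsumer and call acquire before allocating. How such a 
register would actually be wired into the execution/storage pools and the 
spill path is an open design question for this issue.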

> Manage memory used by external cache
> ------------------------------------
>
>                 Key: SPARK-48694
>                 URL: https://issues.apache.org/jira/browse/SPARK-48694
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.3.1, 3.4.0, 3.5.1
>            Reporter: Yan Ma
>            Priority: Critical



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
