[ https://issues.apache.org/jira/browse/SPARK-48694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated SPARK-48694:
-----------------------------------
    Labels: pull-request-available  (was: )

> Manage memory used by external cache
> ------------------------------------
>
>                 Key: SPARK-48694
>                 URL: https://issues.apache.org/jira/browse/SPARK-48694
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.3.1, 3.4.0, 3.5.1
>            Reporter: Yan Ma
>            Priority: Critical
>              Labels: pull-request-available
>
> We have a scenario that uses Spark together with a 3rd party file source
> cache, which is an independent lib and has its own internal logic for cache
> entry creation, eviction and removal. Currently we allocate dedicated memory
> for this cache, but the problem is that this memory can't be shared with
> Spark execution/storage memory. Memory would be used more efficiently if we
> could count it in the _*UnifiedMemoryManager*_ and include it in the memory
> spill logic.
> We also have a requirement for memory management of a native RDD cache
> implementation. The existing interfaces in _*MemoryStore*_ are generally bound
> to the Spark SerializerManager and BlockEvictionHandler, so they are not easy
> to extend for such a customized RDD cache.
> For both of the above requirements, we can provide a common memory register
> for external memory usage, beyond the current RDD cache in storage memory.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)