Hey all,

I had a question about the MemoryStore for the BlockManager with the
unified memory manager vs. the legacy mode.

In unified mode, I would expect the max size of the MemoryStore to be

<total max memory> * <spark.memory.fraction> *
<spark.memory.storageFraction>

 in the same way that when using the StaticMemoryManager it is

<total max memory> * <spark.storage.memoryFraction> *
<spark.storage.safetyFraction>.
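
To make that concrete, here's a quick back-of-the-envelope sketch in Scala
of what I'd expect both caps to be. The helper names are mine, and I'm
assuming the usual defaults (spark.memory.fraction = 0.6,
spark.memory.storageFraction = 0.5, spark.storage.memoryFraction = 0.6,
spark.storage.safetyFraction = 0.9) plus the ~300 MB the unified manager
reserves for the system:

object ExpectedStorageCaps {
  // Unified mode reserves a fixed chunk before applying the fractions.
  val reservedSystemMemory: Long = 300L * 1024 * 1024

  // What I expected the unified-mode MemoryStore cap to be.
  def unifiedExpected(maxHeap: Long,
                      memoryFraction: Double = 0.6,
                      storageFraction: Double = 0.5): Long =
    ((maxHeap - reservedSystemMemory) * memoryFraction * storageFraction).toLong

  // The legacy (StaticMemoryManager) cap.
  def legacyCap(maxHeap: Long,
                storageMemoryFraction: Double = 0.6,
                safetyFraction: Double = 0.9): Long =
    (maxHeap * storageMemoryFraction * safetyFraction).toLong

  def main(args: Array[String]): Unit = {
    val heap = 8L * 1024 * 1024 * 1024  // say, an 8 GB driver heap
    println(s"unified: ${unifiedExpected(heap) / (1024 * 1024)} MB")  // ~2367 MB
    println(s"legacy:  ${legacyCap(heap) / (1024 * 1024)} MB")        // ~4423 MB
  }
}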

Instead it appears to be

(<total max memory> * <spark.memory.fraction>) -
onHeapExecutionMemoryPool.memoryUsed

I would expect onHeapExecutionMemoryPool.memoryUsed to be close to 0 at
initialization time, so this may be overcommitting memory to the
MemoryStore. Does that line up with expectations?
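
For reference, here's a tiny self-contained mock of the behavior I'm
describing. The names mirror what I read in UnifiedMemoryManager, but this
is my own paraphrase, not the real implementation:

class MemoryPool(var memoryUsed: Long)

class UnifiedModel(val maxHeapMemory: Long) {
  val onHeapExecutionMemoryPool = new MemoryPool(0L)

  // The storage cap appears to be everything execution isn't currently
  // using, rather than maxHeapMemory * spark.memory.storageFraction.
  def maxOnHeapStorageMemory: Long =
    maxHeapMemory - onHeapExecutionMemoryPool.memoryUsed
}

object UnifiedModel {
  def main(args: Array[String]): Unit = {
    val m = new UnifiedModel(maxHeapMemory = 2400L * 1024 * 1024)
    // At init, execution usage is ~0, so storage can claim the whole pool:
    println(m.maxOnHeapStorageMemory / (1024 * 1024))  // 2400 MB, not 1200
  }
}

So right after startup the MemoryStore sees the full unified pool, double
what the storageFraction-based formula above would give.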

For context, I have a very large SQL query that performs a significant
number of broadcast joins (~100), so I need my driver to be able to toss
those into the DiskStore without going into GC hell. Since my MemoryStore
does not seem to be bounded by the storage pool, it fills up my heap and
causes my application to OOM. Reducing spark.memory.fraction alleviates the
problem, but I'd love to understand whether that is actually the correct
fix here, rather than simply lowering spark.memory.storageFraction.
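
In case it's useful, this is the workaround I'm running with right now; the
0.4 is just the value that happened to keep my driver heap under control
for this job (and the app name is made up), not a recommendation:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("broadcast-heavy-query")        // hypothetical app name
  .config("spark.memory.fraction", "0.4")  // down from the 0.6 default
  .getOrCreate()

(Equivalently, --conf spark.memory.fraction=0.4 on spark-submit.)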

Thanks!
-Pat
