We are testing Dataset/DataFrame jobs instead of RDD jobs, and one thing we keep running into is containers getting killed by YARN for exceeding memory limits. I understand this is related to off-heap memory usage, and the usual suggestion is to increase spark.yarn.executor.memoryOverhead.
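For reference, here is a minimal sketch of the kind of sizing we are talking about (the app name and values are illustrative; in practice we pass the executor settings at submit time via spark-submit --conf, since YARN fixes the container sizes at launch):

```scala
import org.apache.spark.sql.SparkSession

// Illustrative sizing only -- normally set via spark-submit --conf on YARN,
// because container sizes cannot be changed after the application starts.
val spark = SparkSession.builder()
  .appName("dataset-job")                                // placeholder name
  .config("spark.executor.memory", "4g")                 // executor JVM heap
  .config("spark.yarn.executor.memoryOverhead", "4096")  // off-heap headroom, in MB
  .getOrCreate()
```

The YARN container limit is roughly the sum of the two, which is what makes the overhead growth below so noticeable.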
At times our memoryOverhead ends up as large as the executor memory itself (say 4G and 4G). Why does the Dataset/DataFrame API use so much off-heap memory? We haven't touched spark.memory.offHeap.enabled, which defaults to false. Should we enable it to get a better handle on this?
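In other words, would something like the following (a hypothetical sketch, not what we run today) actually make the off-heap usage more predictable? My understanding is that spark.memory.offHeap.size must also be set to a positive value when the flag is enabled, and that this memory is allocated outside the JVM heap, so it still counts toward the YARN container's physical memory limit:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical: explicitly enabling Spark-managed off-heap memory.
// spark.memory.offHeap.size must be > 0 when the flag is on; the size is
// allocated outside the JVM heap and still has to fit under the YARN
// container limit alongside spark.yarn.executor.memoryOverhead.
val spark = SparkSession.builder()
  .appName("offheap-experiment")                  // placeholder name
  .config("spark.memory.offHeap.enabled", "true")
  .config("spark.memory.offHeap.size", "2g")      // illustrative size
  .getOrCreate()
```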