Ok… gotcha. I wasn't sure whether YARN looked only at the heap size allocation and ignored the off-heap usage.
WRT overall OS memory… this would be one reason why I'd keep a decent amount of swap around. (Maybe even putting it on a fast device like an M.2 or PCIe flash drive….)

> On Sep 22, 2016, at 9:56 AM, Sean Owen <so...@cloudera.com> wrote:
>
> It's looking at the whole process's memory usage, and doesn't care
> whether the memory is used by the heap or not within the JVM. Of
> course, allocating memory off-heap still counts against you at the OS
> level.
>
> On Thu, Sep 22, 2016 at 3:54 PM, Michael Segel
> <msegel_had...@hotmail.com> wrote:
>> Thanks for the response Sean.
>>
>> But how does YARN know about the off-heap memory usage?
>> That's the piece that I'm missing.
>>
>> Thx again,
>>
>> -Mike
>>
>>> On Sep 21, 2016, at 10:09 PM, Sean Owen <so...@cloudera.com> wrote:
>>>
>>> No, Xmx only controls the maximum size of on-heap allocated memory.
>>> The JVM doesn't manage/limit off-heap (how could it? it doesn't know
>>> when it can be released).
>>>
>>> The answer is that YARN will kill the process because it's using more
>>> memory than it asked for. A JVM is always going to use a little
>>> off-heap memory by itself, so setting a max heap size of 2GB means the
>>> JVM process may use a bit more than 2GB of memory. With an off-heap
>>> intensive app like Spark it can be a lot more.
>>>
>>> There's a built-in 10% overhead, so that if you ask for a 3GB executor
>>> it will ask for 3.3GB from YARN. You can increase the overhead.
>>>
>>> On Wed, Sep 21, 2016 at 11:41 PM, Jörn Franke <jornfra...@gmail.com> wrote:
>>>> All off-heap memory is still managed by the JVM process. If you limit the
>>>> memory of this process then you limit the memory. I think the memory of the
>>>> JVM process could be limited via the -Xms/-Xmx parameters of the JVM. This can
>>>> be configured via Spark options for YARN (be aware that they are different
>>>> in cluster and client mode), but I recommend using the Spark options for
>>>> the off-heap maximum.
>>>>
>>>> https://spark.apache.org/docs/latest/running-on-yarn.html
>>>>
>>>>
>>>> On 21 Sep 2016, at 22:02, Michael Segel <msegel_had...@hotmail.com> wrote:
>>>>
>>>> I've asked this question a couple of times of a friend who didn't know
>>>> the answer… so I thought I would try here.
>>>>
>>>> Suppose we launch a job on a cluster (YARN) and we have set up the
>>>> containers to be 3GB in size.
>>>>
>>>> What does that 3GB represent?
>>>>
>>>> I mean, what happens if we end up using 2-3GB of off-heap storage via
>>>> Tungsten?
>>>> What will Spark do?
>>>> Will it try to honor the container's limits and throw an exception, or will
>>>> it allow my job to grab that amount of memory and exceed YARN's
>>>> expectations since it's off heap?
>>>>
>>>> Thx
>>>>
>>>> -Mike
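
For anyone who finds this thread later, here is a rough sketch (Scala) of the knobs being discussed: the executor heap, the YARN memory overhead, and Spark's own off-heap cap. The numbers are made up for illustration, and the option names (spark.yarn.executor.memoryOverhead, spark.memory.offHeap.*) are the ones documented for this era of Spark; check the running-on-yarn and configuration docs linked above for your version. Note that the container YARN is asked for is executor memory plus overhead, so off-heap use (Tungsten or otherwise) has to fit inside the overhead portion or the container gets killed.

    import org.apache.spark.SparkConf
    import org.apache.spark.sql.SparkSession

    // Container request to YARN = spark.executor.memory + memoryOverhead.
    // Off-heap allocations (Tungsten buffers, netty, JVM metaspace, ...)
    // must fit in the overhead portion, or YARN will kill the container.
    val conf = new SparkConf()
      .setAppName("offheap-sizing-sketch")  // example values only
      .set("spark.executor.memory", "3g")                // on-heap (-Xmx) per executor
      .set("spark.yarn.executor.memoryOverhead", "1024") // in MB; default is max(384, 10% of executor memory)
      .set("spark.memory.offHeap.enabled", "true")       // allow Spark/Tungsten to allocate off-heap explicitly
      .set("spark.memory.offHeap.size", "1g")            // cap on Spark-managed off-heap memory

    val spark = SparkSession.builder().config(conf).getOrCreate()

The same settings can also be passed on the command line, e.g. spark-submit --conf spark.yarn.executor.memoryOverhead=1024 …, which keeps the sizing out of the job code.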