I did some instrumentation to trace where DirectByteBuffers are being
created, and it turns out that the following system properties need to be
set in addition to spark.shuffle.io.preferDirectBufs=false in the Spark
config:

io.netty.noUnsafe=true
io.netty.threadLocalDirectBufferSize=0

These settings force Netty to mostly use on heap buffers and thus improve
the stability of Spark jobs that perform a lot of shuffle. I have created
the defect SPARK-18787 to either force these settings whenever
spark.shuffle.io.preferDirectBufs=false is set in the Spark config, or at
least document them.
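
For anyone who wants to try this: one way to apply the properties (a
sketch; the exact mechanics depend on how you submit jobs) is to pass them
as JVM options for both the driver and the executors, e.g. in
spark-defaults.conf:

spark.shuffle.io.preferDirectBufs false
spark.executor.extraJavaOptions -Dio.netty.noUnsafe=true -Dio.netty.threadLocalDirectBufferSize=0
spark.driver.extraJavaOptions -Dio.netty.noUnsafe=true -Dio.netty.threadLocalDirectBufferSize=0

If you already use extraJavaOptions for other flags, append these to the
existing value rather than replacing it.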

Hope it will be helpful for other users as well.

Thanks,
Aniket

On Sat, Nov 26, 2016 at 3:31 PM Koert Kuipers <ko...@tresata.com> wrote:

> I agree that off heap memory usage is unpredictable.
>
> When we used RDDs, memory was mostly on heap and total usage was
> predictable, and we almost never had YARN killing executors.
>
> Now with DataFrames, memory usage is both on and off heap, and we have no
> way of limiting Spark's off heap memory usage. Yet YARN requires a maximum
> total memory usage, and if you go over it, YARN kills the executor.
>
> On Fri, Nov 25, 2016 at 12:14 PM, Aniket Bhatnagar <
> aniket.bhatna...@gmail.com> wrote:
>
> Thanks Rohit, Rodrick and Shreya. I tried
> changing spark.yarn.executor.memoryOverhead to 10 GB and lowering
> executor memory to 30 GB, and neither worked. I finally had to reduce the
> number of cores per executor from 36 to 18, in addition to setting a
> higher spark.yarn.executor.memoryOverhead and a lower executor memory
> size. In short, I had to trade off performance for reliability.
>
> Unfortunately, Spark does a poor job of reporting off heap memory usage.
> From the profiler, it seems that the job's heap usage is fairly static but
> the off heap memory fluctuates quite a lot. It looks like the bulk of the
> off heap memory is held by io.netty.buffer.UnpooledUnsafeDirectByteBuf
> while the shuffle client is reading blocks from the shuffle service. It
> appears that org.apache.spark.network.util.TransportFrameDecoder retains
> these in its buffers field while decoding responses from the shuffle
> service. So far, it's not clear why it needs to hold multiple GBs in those
> buffers. Perhaps increasing the number of partitions would help, since
> each shuffle block would then be smaller.
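>
> As an aside, a quick way to eyeball direct buffer usage from inside the
> JVM is the BufferPoolMXBean (a sketch; it only accounts for buffers
> allocated via ByteBuffer.allocateDirect, so Netty allocations made
> straight through Unsafe won't show up here):
>
>   import java.lang.management.{BufferPoolMXBean, ManagementFactory}
>   import scala.collection.JavaConverters._
>
>   // Print used and total capacity for each JVM buffer pool
>   // ("direct" and "mapped").
>   def logBufferPools(): Unit =
>     ManagementFactory.getPlatformMXBeans(classOf[BufferPoolMXBean]).asScala
>       .foreach { pool =>
>         println(s"${pool.getName}: used=${pool.getMemoryUsed >> 20} MB, " +
>           s"capacity=${pool.getTotalCapacity >> 20} MB, buffers=${pool.getCount}")
>       }
>
> Calling this periodically from a daemon thread on the executors gives a
> rough picture of direct buffer growth over time.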
>
> Thanks,
> Aniket
>
> On Fri, Nov 25, 2016 at 1:09 AM Shreya Agarwal <shrey...@microsoft.com>
> wrote:
>
> I don’t think it’s just memory overhead. It might be better to use an
> executor with less heap space (30 GB?). 46 GB would mean more data loaded
> into memory and more GC, which can cause issues.
>
>
>
> Also, have you tried persisting data in any way? If so, that might be
> causing an issue.
>
>
>
> Lastly, I am not sure whether your data is skewed; that could force a lot
> of data onto one executor node.
>
>
>
>
>
>
> *From: *Rodrick Brown <rodr...@orchardplatform.com>
> *Sent: *Friday, November 25, 2016 12:25 AM
> *To: *Aniket Bhatnagar <aniket.bhatna...@gmail.com>
> *Cc: *user <user@spark.apache.org>
> *Subject: *Re: OS killing Executor due to high (possibly off heap) memory
> usage
>
>
> Try setting spark.yarn.executor.memoryOverhead to 10000.
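>
> For example, on the command line (a sketch; the value is in megabytes):
>
>   spark-submit --conf spark.yarn.executor.memoryOverhead=10000 ...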
>
> On Thu, Nov 24, 2016 at 11:16 AM, Aniket Bhatnagar <
> aniket.bhatna...@gmail.com> wrote:
>
> Hi Spark users
>
> I am running a job that performs a join over a huge dataset (7 TB+), and
> the executors keep crashing randomly, eventually causing the job to fail.
> There are no out of memory exceptions in the logs, and looking at the
> dmesg output, it seems the OS killed the JVM because of high memory usage.
> My suspicion is that the executor's off heap usage is the cause, since I
> limit the executor's on heap usage to 46 GB and each host running an
> executor has 60 GB of RAM. After an executor crashes, I can see the
> external shuffle service
> (org.apache.spark.network.server.TransportRequestHandler) logging a lot of
> closed channel exceptions in the YARN node manager logs. This leads me to
> believe that something runs out of memory during shuffle reads. Is there
> a configuration to completely disable the use of off heap memory? I have
> tried setting spark.shuffle.io.preferDirectBufs=false, but the executor is
> still getting killed in the same way.
>
> Cluster details:
> 10 AWS c4.8xlarge hosts
> RAM on each host - 60 GB
> Number of cores on each host - 36
> Additional hard disk on each host - 8 TB
>
> Spark configuration:
> dynamic allocation enabled
> external shuffle service enabled
> spark.driver.memory 1024M
> spark.executor.memory 47127M
> Spark master yarn-cluster
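>
> In spark-defaults.conf terms, the first two entries correspond to
> (assuming the standard property names):
>
>   spark.dynamicAllocation.enabled true
>   spark.shuffle.service.enabled true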
>
> Sample error in yarn node manager:
> 2016-11-24 10:34:06,507 ERROR
> org.apache.spark.network.server.TransportRequestHandler
> (shuffle-server-50): Error sending result
> ChunkFetchSuccess{streamChunkId=StreamChunkId{streamId=919299554123,
> chunkIndex=0},
> buffer=FileSegmentManagedBuffer{file=/mnt3/yarn/usercache/hadoop/appcache/application_1479898345621_0006/blockmgr-ad5301a9-e1e9-4723-a8c4-9276971b2259/2c/shuffle_3_963_0.data,
> offset=0, length=669014456}} to /10.192.108.170:52782; closing connection
> java.nio.channels.ClosedChannelException
>
> Error in dmesg:
> [799873.309897] Out of memory: Kill process 50001 (java) score 927 or
> sacrifice child
> [799873.314439] Killed process 50001 (java) total-vm:65652448kB,
> anon-rss:57246528kB, file-rss:0kB
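>
> (For scale, a rough back-of-envelope: the anon-rss of 57246528 kB is about
> 54.6 GB and the total-vm of 65652448 kB is about 62.6 GB, on a 60 GB host
> with a 46 GB heap cap, so somewhere around 8-9 GB is being held off heap,
> which is consistent with the suspicion that off heap usage is to blame.)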
>
> Thanks,
> Aniket
>
>
>
>
> --
>
> *Rodrick Brown */ *DevOps*
>
> 9174456839 / rodr...@orchardplatform.com
>
> Orchard Platform
> 101 5th Avenue, 4th Floor, New York, NY
>
