Nope, it's a batch job.

Best Regards
Ankit Khettry
On Sat, 7 Sep, 2019, 6:52 AM Upasana Sharma, <028upasana...@gmail.com> wrote:

> Is it a streaming job?
>
> On Sat, Sep 7, 2019, 5:04 AM Ankit Khettry <justankit2...@gmail.com> wrote:
>
>> I have a Spark job that consists of a large number of Window operations
>> and hence involves large shuffles. I have roughly 900 GiB of data,
>> although I am using a reasonably large cluster (10 * m5.4xlarge instances).
>> I am using the following configurations for the job, although I have tried
>> various other combinations without any success:
>>
>> spark.yarn.driver.memoryOverhead 6g
>> spark.storage.memoryFraction 0.1
>> spark.executor.cores 6
>> spark.executor.memory 36g
>> spark.memory.offHeap.size 8g
>> spark.memory.offHeap.enabled true
>> spark.executor.instances 10
>> spark.driver.memory 14g
>> spark.yarn.executor.memoryOverhead 10g
>>
>> I keep running into the following OOM error:
>>
>> org.apache.spark.memory.SparkOutOfMemoryError: Unable to acquire 16384 bytes of memory, got 0
>>     at org.apache.spark.memory.MemoryConsumer.throwOom(MemoryConsumer.java:157)
>>     at org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:98)
>>     at org.apache.spark.util.collection.unsafe.sort.UnsafeInMemorySorter.<init>(UnsafeInMemorySorter.java:128)
>>     at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.<init>(UnsafeExternalSorter.java:163)
>>
>> I see there are a large number of JIRAs filed for similar issues, and a
>> great many of them are even marked resolved.
>> Can someone guide me on how to approach this problem? I am using
>> Databricks Spark 2.4.1.
>>
>> Best Regards
>> Ankit Khettry
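The settings quoted above can be sanity-checked against the instance size. Below is a minimal arithmetic sketch, assuming one executor per m5.4xlarge node (64 GiB RAM, 16 vCPUs) and that the container must ultimately cover heap, YARN overhead, and off-heap space together; the exact accounting of off-heap memory inside the container request varies by Spark version, so the headroom figure is illustrative only:

```python
# Figures taken from the configuration quoted in the thread.
executor_heap_gib = 36   # spark.executor.memory
yarn_overhead_gib = 10   # spark.yarn.executor.memoryOverhead
off_heap_gib = 8         # spark.memory.offHeap.size

# Total memory one executor may consume if all three pools are fully used.
per_executor_gib = executor_heap_gib + yarn_overhead_gib + off_heap_gib

# m5.4xlarge has 64 GiB of RAM (assumption: one executor per node).
instance_ram_gib = 64
headroom_gib = instance_ram_gib - per_executor_gib

print(per_executor_gib)  # 54
print(headroom_gib)      # 10
```

With 54 GiB potentially claimed per executor, only about 10 GiB remains for the OS and YARN daemons, so the configuration is tight but should fit on paper; the `Unable to acquire ... bytes` error is about exhaustion of Spark's internal execution-memory pool during the sort, not the container limit.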