Nope, it's a batch job.

Best Regards
Ankit Khettry

On Sat, 7 Sep, 2019, 6:52 AM Upasana Sharma, <028upasana...@gmail.com>
wrote:

> Is it a streaming job?
>
> On Sat, Sep 7, 2019, 5:04 AM Ankit Khettry <justankit2...@gmail.com>
> wrote:
>
>> I have a Spark job that consists of a large number of window operations
>> and hence involves large shuffles. I have roughly 900 GiB of data,
>> although the cluster should be large enough (10 x m5.4xlarge instances).
>> I am using the following configuration for the job, though I have tried
>> various other combinations without success.
>>
>> spark.yarn.driver.memoryOverhead 6g
>> spark.storage.memoryFraction 0.1
>> spark.executor.cores 6
>> spark.executor.memory 36g
>> spark.memory.offHeap.size 8g
>> spark.memory.offHeap.enabled true
>> spark.executor.instances 10
>> spark.driver.memory 14g
>> spark.yarn.executor.memoryOverhead 10g
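>>
>> For reference, a rough per-node budget under these settings (assuming
>> YARN places one executor per 64 GiB / 16 vCPU m5.4xlarge, and noting
>> that on Spark 2.4 the off-heap size is not added to the container
>> request automatically):
>>
>> // Sketch of the arithmetic only, runnable in spark-shell
>> val containerGiB = 36 + 10 // executor memory + yarn overhead = 46 GiB
>> val offHeapGiB   = 8       // allocated beyond the heap, so it competes
>>                            // with netty buffers etc. in the overhead
>> val idleCores    = 16 - 6  // vCPUs per node left without tasks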
>>
>> I keep running into the following OOM error:
>>
>> org.apache.spark.memory.SparkOutOfMemoryError: Unable to acquire 16384 bytes of memory, got 0
>>     at org.apache.spark.memory.MemoryConsumer.throwOom(MemoryConsumer.java:157)
>>     at org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:98)
>>     at org.apache.spark.util.collection.unsafe.sort.UnsafeInMemorySorter.<init>(UnsafeInMemorySorter.java:128)
>>     at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.<init>(UnsafeExternalSorter.java:163)
>>
>> I see there are a large number of JIRAs filed for similar issues, and a
>> great many of them are marked resolved.
>> Can someone guide me on how to approach this problem? I am using
>> Databricks Spark 2.4.1.
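>>
>> One thing I plan to try next, though I am not sure it will help: raising
>> spark.sql.shuffle.partitions so that each sort task handles less data,
>> since the trace fails in the sorter that window functions use to sort
>> each partition's rows.
>>
>> // Sketch only, runnable in spark-shell; 4000 is an illustrative value,
>> // not a tuned one. With ~900 GiB shuffled through the default 200
>> // partitions, each task sorts roughly 4.5 GiB; at 4000 partitions that
>> // drops to about 230 MiB per task.
>> spark.conf.set("spark.sql.shuffle.partitions", 4000)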
>>
>> Best Regards
>> Ankit Khettry
>>
>
