Thank you Hemant and Enrico. Much appreciated.
Your input really got me closer to the issue: I realized each task wasn't getting enough memory, and hence tasks with large partitions kept failing. I increased executor memory and, at the same time, increased the number of partitions as well. This made the job go through.
Note that repartitioning helps to increase the number of partitions (and hence to reduce the size of partitions and the required executor memory), but subsequent transformations like join will repartition the data again with the configured number of partitions (spark.sql.shuffle.partitions).
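For illustration, a minimal Scala sketch of that behaviour is below; the input paths, the user_id join key, and the partition count of 2000 are all made-up assumptions, not details from this thread:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("repartition-before-join")
  .getOrCreate()

// Hypothetical inputs; paths and the join key are placeholders.
val events = spark.read.parquet("/data/events")
val users  = spark.read.parquet("/data/users")

// Repartitioning shrinks each partition, so each task needs less memory.
val eventsRepartitioned = events.repartition(2000, events("user_id"))

// A join shuffles again, and the shuffle output uses
// spark.sql.shuffle.partitions (default 200), not the 2000 above,
// so raise that setting too if large shuffle partitions are the problem.
spark.conf.set("spark.sql.shuffle.partitions", "2000")

val joined = eventsRepartitioned.join(users, "user_id")
```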
You can try repartitioning the data; if the data is skewed, you may need to salt the keys for better partitioning (see the sketch below). Are you using coalesce or any other function that brings the data onto fewer nodes? Window functions also incur shuffling, which could be an issue.
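In case it helps, a rough Scala sketch of key salting follows; the table names, the "key" column, and the choice of 16 salt buckets are assumptions for illustration, not details from this job:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, concat_ws, explode, floor, lit, rand, sequence}

val spark = SparkSession.builder().appName("salted-join").getOrCreate()

// Hypothetical tables: "facts" is heavily skewed on "key", "dim" is the other join side.
val facts = spark.read.parquet("/data/facts")
val dim   = spark.read.parquet("/data/dim")

val saltBuckets = 16

// Skewed side: append a random salt so a single hot key is spread
// across 16 shuffle partitions instead of landing in one task.
val saltedFacts = facts
  .withColumn("salt", floor(rand() * saltBuckets))
  .withColumn("salted_key",
    concat_ws("_", col("key").cast("string"), col("salt").cast("string")))

// Other side: replicate each row once per salt value so every salted key still matches.
val saltedDim = dim
  .withColumn("salt", explode(sequence(lit(0), lit(saltBuckets - 1))))
  .withColumn("salted_key",
    concat_ws("_", col("key").cast("string"), col("salt").cast("string")))

val joined = saltedFacts.join(saltedDim, "salted_key")
```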
On Mon, 6 Jan 2020 at 9:49 AM, Rishi Shah wrote:
Thanks Hemant, the underlying data volume increased from 550GB to 690GB, and now the same job doesn't succeed. I tried increasing executor memory to 20G as well, and it still fails. I am running this on Databricks and start the cluster with 20G assigned to the spark.executor.memory property.
Also some more
You can try increasing the executor memory; generally this error comes when there is not enough memory in individual executors. The job is probably getting completed because the failed tasks go through when they are re-scheduled.
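As a rough example, executor memory is normally set when the cluster or application starts, along the lines of the sketch below; the 20g and 2000 values are placeholders loosely echoing this thread, not a recommendation:

```scala
import org.apache.spark.sql.SparkSession

// Executor memory must be set before executors are launched, so on a real
// cluster it belongs in spark-submit / cluster config, e.g. (placeholder values):
//   spark-submit --conf spark.executor.memory=20g --conf spark.executor.cores=4 ...
//
// When the application creates the SparkContext itself, the same settings can
// be passed through the builder (they have no effect on an already running context):
val spark = SparkSession.builder()
  .appName("executor-memory-example")
  .config("spark.executor.memory", "20g")
  .config("spark.sql.shuffle.partitions", "2000")
  .getOrCreate()
```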
Thanks.
On Mon, 6 Jan 2020 at 5:47 AM, Rishi Shah wrote:
Hello All,
One of my jobs keeps getting into a situation where hundreds of tasks fail with the error below, but the job eventually completes.
org.apache.spark.memory.SparkOutOfMemoryError: Unable to acquire 16384 bytes of memory
Could someone advise?
--
Regards,
Rishi Shah