... using Spark 3.2.1.
________________________________
From: Shay Elbaz <[email protected]>
Sent: Tuesday, July 19, 2022 1:26 PM
To: [email protected] <[email protected]>
Cc: Jeffrey O'Donoghue <[email protected]>
Subject: [EXTERNAL] spark.executor.pyspark.memory not added to the executor
resource request on Kubernetes
Hi,
We are trying to tune executor memory on Kubernetes: specifically, 8g for the JVM,
8g for the Python process, and an additional 500m of overhead:
--conf spark.executor.memory=8g
--conf spark.executor.pyspark.memory=8g
--conf spark.executor.memoryOverhead=500m
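For context, a minimal submit sketch with these settings (the master URL, container image, and application file are hypothetical placeholders; only the three memory settings are from our job):

```shell
# Hypothetical spark-submit invocation; placeholders in angle brackets.
spark-submit \
  --master k8s://https://<api-server>:6443 \
  --deploy-mode cluster \
  --conf spark.kubernetes.container.image=<our-pyspark-image> \
  --conf spark.executor.memory=8g \
  --conf spark.executor.pyspark.memory=8g \
  --conf spark.executor.memoryOverhead=500m \
  app.py
```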
According to the docs, the executor pods should request 8 + 8 + 0.5 GiB of memory
(from spark.executor.pyspark.memory: "... When PySpark is run in YARN or Kubernetes,
this memory is added to executor resource requests").
On the Spark UI we can see the correct configuration:
Executor Reqs:
cores: [amount: 1]
offHeap: [amount: 0]
memoryOverhead: [amount: 500]
pyspark.memory: [amount: 8192]
memory: [amount: 8192]
Task Reqs:
cpus: [amount: 1.0]
However, the running pod spec is different:
Limits:
memory: 8692Mi
Requests:
cpu: 1
memory: 8692Mi
It looks like the pyspark.memory value was not added to the resource request.
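To make the discrepancy concrete, here is the arithmetic (a sketch; all values in MiB, taken from the conf and pod spec above):

```python
# Executor pod memory, in MiB, from our configuration.
executor_memory = 8192   # spark.executor.memory=8g
pyspark_memory = 8192    # spark.executor.pyspark.memory=8g
memory_overhead = 500    # spark.executor.memoryOverhead=500m

# Per the docs, the pod request should include all three values.
expected_request = executor_memory + pyspark_memory + memory_overhead
print(expected_request)  # 16884

# The running pod spec shows 8692Mi, which is exactly memory + overhead,
# i.e. the pyspark.memory component is missing from the request.
observed_request = 8692
print(observed_request == executor_memory + memory_overhead)  # True
```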
What are we missing?
Thanks,
Shay