Hi All,

I have enabled dynamic allocation while running Spark on Kubernetes. With this, 
new executors are requested only when pending tasks have been backlogged for 
longer than the duration configured in the property 
"spark.dynamicAllocation.schedulerBacklogTimeout".

My Use Case is:

There are a number of parallel jobs which may or may not run together at a 
particular point in time. For example, only one Spark job may be running at a 
given moment, or two Spark jobs may run at the same time, depending on the need.
I have configured spark.dynamicAllocation.minExecutors as 3 and 
spark.dynamicAllocation.maxExecutors as 8.
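
In SparkConf terms (sketch, continuing the conf above; as far as I know, 
spark.dynamicAllocation.initialExecutors defaults to minExecutors, so the 
application starts with 3 executors):

    // Bounds for dynamic allocation: start at minExecutors (since
    // initialExecutors defaults to it) and never grow beyond maxExecutors.
    conf.set("spark.dynamicAllocation.minExecutors", "3")
    conf.set("spark.dynamicAllocation.maxExecutors", "8")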

Steps:

  1.  The SparkContext is initialized with 3 executors and the first job is 
submitted.
  2.  If a second job is submitted a few minutes later (e.g. after 15 mins), I 
would like to take advantage of dynamic allocation so that the executors scale 
up to handle the second job's tasks.

For this, I think "spark.dynamicAllocation.schedulerBacklogTimeout" needs to be 
set to the duration after which new executors should be requested.
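
Something along these lines is what I have in mind (sketch; the 15s values are 
only placeholders, and I believe "spark.dynamicAllocation.sustainedSchedulerBacklogTimeout" 
is what governs the follow-up requests after the first one):

    // The first round of extra executors is requested once tasks have been
    // pending longer than this...
    conf.set("spark.dynamicAllocation.schedulerBacklogTimeout", "15s")
    // ...then further rounds fire at this interval while the backlog persists,
    // with the request size ramping up exponentially (1, 2, 4, ...) until
    // maxExecutors is reached.
    conf.set("spark.dynamicAllocation.sustainedSchedulerBacklogTimeout", "15s")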

Problem: There is a chance that the second job is never submitted, or that it 
is submitted only after 10 mins or 20 mins. How can I set a constant value for 
the property "spark.dynamicAllocation.schedulerBacklogTimeout" to scale the 
executors, when the task backlog depends on the number of jobs submitted?


Regards
Ranju
