Hi!

For dynamic allocation you do not need to run the Spark jobs in parallel.
Dynamic allocation simply means Spark scales up by requesting more
executors when there are pending tasks (their number is roughly tied to
the number of available partitions) and scales down when an executor is
idle (even within one job the number of partitions, and therefore tasks,
can fluctuate from stage to stage).
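
For reference, a minimal sketch of the settings involved (the values are
illustrative, not recommendations; note that on Kubernetes, where there is
no external shuffle service, shuffle tracking must also be enabled, which
requires Spark 3.0+):

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder()
    .appName("dynamic-allocation-sketch")
    .config("spark.dynamicAllocation.enabled", "true")
    .config("spark.dynamicAllocation.minExecutors", "3")  // illustrative
    .config("spark.dynamicAllocation.maxExecutors", "8")  // illustrative
    // an idle executor is released after this much inactivity
    .config("spark.dynamicAllocation.executorIdleTimeout", "60s")
    // needed on Kubernetes because there is no external shuffle service
    .config("spark.dynamicAllocation.shuffleTracking.enabled", "true")
    .getOrCreate()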

But if you are optimizing for run time, you can start those jobs in
parallel right at the beginning. In that case a higher number of
executors will be in use from the start; see the sketch below.
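
A minimal sketch of submitting two jobs concurrently from one
SparkContext, so both jobs' pending tasks are visible to dynamic
allocation at the same time (the input paths and the FAIR-scheduler
setting are illustrative assumptions, not something from your setup):

  import scala.concurrent.{Await, ExecutionContext, Future}
  import scala.concurrent.duration.Duration
  import org.apache.spark.sql.SparkSession

  object ParallelJobs {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder()
        .appName("parallel-jobs")
        // optional: share executors fairly between concurrent jobs
        .config("spark.scheduler.mode", "FAIR")
        .getOrCreate()

      implicit val ec: ExecutionContext = ExecutionContext.global

      // Two independent actions submitted at the same time; each one
      // becomes its own Spark job with its own pending tasks.
      val job1 = Future(spark.read.parquet("/data/input1").count()) // hypothetical path
      val job2 = Future(spark.read.parquet("/data/input2").count()) // hypothetical path

      val counts = Await.result(Future.sequence(Seq(job1, job2)), Duration.Inf)
      println(counts)
      spark.stop()
    }
  }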

The "spark.dynamicAllocation.schedulerBacklogTimeout" is not for to
schedule/synchronize different Spark jobs but it is about tasks.
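
To make the task-level behaviour concrete, here is a hedged sketch of the
two backlog-related timeouts (the property names are the real Spark ones;
the values shown are just their defaults):

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder()
    .appName("backlog-timeout-sketch")
    .config("spark.dynamicAllocation.enabled", "true")
    // the first executor request fires once tasks have been pending
    // longer than this
    .config("spark.dynamicAllocation.schedulerBacklogTimeout", "1s")
    // while the backlog persists, further requests fire at this
    // interval, doubling the number of executors asked for each
    // round (1, 2, 4, ...) up to maxExecutors
    .config("spark.dynamicAllocation.sustainedSchedulerBacklogTimeout", "1s")
    .getOrCreate()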

Best regards,
Attila

On Tue, Apr 6, 2021 at 1:59 PM Ranju Jain <ranju.j...@ericsson.com.invalid>
wrote:

> Hi All,
>
>
>
> I have dynamic allocation enabled while running Spark on Kubernetes.
> New executors are requested only when pending tasks have been backlogged
> for more than the duration configured in the property
> *“spark.dynamicAllocation.schedulerBacklogTimeout”*.
>
>
>
> My Use Case is:
>
>
>
> There are a number of parallel jobs which might or might not run together
> at a particular point in time. E.g. only one Spark job may run at a point
> in time, or two Spark jobs may run at the same time, depending upon the
> need.
>
> I configured spark.dynamicAllocation.minExecutors as 3 and
> spark.dynamicAllocation.maxExecutors as 8.
>
>
>
> Steps:
>
>    1. SparkContext is initialized with 3 executors and the first job is
>    requested.
>    2. Now, if a second job is requested after a few minutes (e.g. 15
>    mins), I am thinking I can use the benefit of dynamic allocation, and
>    executors should scale up to handle the second job's tasks.
>
> For this I think *“spark.dynamicAllocation.schedulerBacklogTimeout”*
> needs to be set, after which new executors would be requested.
>
> *Problem:* There is a chance that the second job is not requested at all,
> or is requested after 10 mins or after 20 mins. How can I set a constant
> value for the property *“spark.dynamicAllocation.schedulerBacklogTimeout”*
> to scale the executors, when the task backlog depends upon the number of
> jobs requested?
>
>
>
> Regards
>
> Ranju
>
