Hi,
Spark dynamic resource allocation cannot solve my problem, because the
resources of the production environment are limited. Under this constraint, I
hope to reserve resources for each group so that the tasks of jobs in
different groups can be scheduled in time.
Thank you,
Bowen Song
From: Qian SUN
Sent: Wednesday, May 18, 2022 9:32
To: Bowen Song
Cc: user.spark
Subject: Re: A scene with unstable Spark performance
Hi. I think you need Spark dynamic resource allocation. Please refer to
https://spark.apache.org/docs/latest/job-scheduling.html#dynamic-resource-allocation.
And if you use Spark SQL, AQE may help:
https://spark.apache.org/docs/latest/sql-performance-tuning.html#adaptive-query-execution
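For reference, dynamic allocation is usually enabled with settings along these lines (the external shuffle service is required on YARN and standalone deployments; the executor bounds and idle timeout here are illustrative, not recommendations):

```
spark.dynamicAllocation.enabled              true
spark.shuffle.service.enabled                true
spark.dynamicAllocation.minExecutors         2
spark.dynamicAllocation.maxExecutors         20
spark.dynamicAllocation.executorIdleTimeout  60s
```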
Bowen Song <bowen.s...@kyligence.io> wrote on Tuesday, May 17, 2022 at 22:33:
Hi all,
I find that Spark performance is unstable in the following scenario: we
divided our jobs into two groups according to completion time. One group
finishes in under 10s; the other takes between 10s and 300s. The difference is
that the latter group scans more files and therefore has more tasks. When both
groups were submitted to Spark together, resource competition from the slower
jobs made the originally fast jobs take longer to return results, which shows
up as unstable Spark performance. The problem I want to solve is: can we
reserve a certain amount of resources for each of the two groups, so that the
fast jobs are scheduled in time, while the slow jobs are not starved because
all resources are allocated to the fast jobs?
In this context, I need to group Spark jobs so that tasks from different
groups are scheduled using their group's reserved resources. At the beginning
of each scheduling round, tasks in a group would be scheduled first on that
group's resources; only when the group has no pending tasks could its
resources be allocated to other groups, to avoid idle resources.
For reasons of resource utilization and the overhead of managing multiple
clusters, I would like the jobs to share one Spark cluster rather than
creating a private cluster for each group.
I've read the code of the Spark Fair Scheduler, and its implementation doesn't
seem to meet the need to reserve resources for different groups of jobs.
Is there a workaround that can solve this problem through the Spark Fair
Scheduler? If it can't be solved, would you consider adding a mechanism like
capacity scheduling?
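One workaround worth noting: the Fair Scheduler's allocation file supports a `minShare` per pool, which Spark tries to satisfy before redistributing resources fairly, and a pool's unused share spills over to other pools. It is satisfied best-effort rather than enforced per scheduling round, so it only approximates a hard reservation. A sketch of such a `fairscheduler.xml` (pool names, weights, and core counts are illustrative):

```
<?xml version="1.0"?>
<allocations>
  <!-- Pool for the sub-10s jobs: Spark tries to give it minShare cores first -->
  <pool name="fast">
    <schedulingMode>FAIR</schedulingMode>
    <weight>2</weight>
    <minShare>8</minShare>
  </pool>
  <!-- Pool for the 10s-300s jobs -->
  <pool name="slow">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>4</minShare>
  </pool>
</allocations>
```

The file is picked up via `spark.scheduler.allocation.file`, and each job is routed to a pool by calling `sc.setLocalProperty("spark.scheduler.pool", "fast")` in the submitting thread before the job is triggered.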
Thank you,
Bowen Song
--
Best!
Qian SUN