Hi. I think you need Spark dynamic resource allocation. Please refer to https://spark.apache.org/docs/latest/job-scheduling.html#dynamic-resource-allocation . And if you use Spark SQL, AQE may also help: https://spark.apache.org/docs/latest/sql-performance-tuning.html#adaptive-query-execution
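
In case a concrete starting point helps, here is a minimal sketch of enabling dynamic allocation when building the session. The app name and the executor bounds are placeholders to tune for your cluster, and shuffle tracking assumes Spark 3.0+:

    import org.apache.spark.sql.SparkSession

    // Minimal sketch: let Spark grow and shrink the executor pool with
    // the task backlog instead of holding a fixed set of executors.
    val spark = SparkSession.builder()
      .appName("dra-example") // hypothetical app name
      .config("spark.dynamicAllocation.enabled", "true")
      // Since Spark 3.0, shuffle tracking avoids the need for an
      // external shuffle service when dynamic allocation is on.
      .config("spark.dynamicAllocation.shuffleTracking.enabled", "true")
      // Placeholder bounds: a floor keeps some capacity warm for fast
      // jobs, a ceiling stops slow jobs from taking the whole cluster.
      .config("spark.dynamicAllocation.minExecutors", "2")
      .config("spark.dynamicAllocation.maxExecutors", "20")
      .getOrCreate()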
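On the reservation question below: the existing fair scheduler pools may get you part of the way, since each pool can declare a minShare (in CPU cores) that is satisfied before the remaining resources are redistributed. It is a soft guarantee within a single application, not a hard partition, but it matches your shared-cluster constraint. A sketch, assuming a fairscheduler.xml at a hypothetical path that defines two pools named "fast" and "slow":

    import org.apache.spark.sql.SparkSession

    // Sketch only: fairscheduler.xml is assumed to define pools "fast"
    // and "slow", each with a minShare (cores) and a weight. minShare
    // acts as a soft reservation, not a strict capacity limit.
    val spark = SparkSession.builder()
      .appName("fair-pool-example") // hypothetical app name
      .config("spark.scheduler.mode", "FAIR")
      // Hypothetical path; point this at your own allocation file.
      .config("spark.scheduler.allocation.file", "/path/to/fairscheduler.xml")
      .getOrCreate()

    // Jobs submitted from this thread are scheduled in the "fast" pool;
    // use a separate thread with pool "slow" for the long-running group.
    spark.sparkContext.setLocalProperty("spark.scheduler.pool", "fast")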
Bowen Song <bowen.s...@kyligence.io> wrote on Tue, May 17, 2022 at 22:33:
> Hi all,
>
> I find that Spark performance is unstable in this scenario: we divided the
> jobs into two groups according to job completion time. One group of jobs
> completes in under 10s; the other takes from 10s to 300s. The reason for
> the difference is that the latter scans more files, so its number of tasks
> is larger. When the two groups of jobs were submitted to Spark together, I
> found that due to resource competition, the slower jobs made the originally
> fast jobs take longer to return results, which shows up as unstable Spark
> performance. The problem I want to solve is: can we reserve certain
> resources for each of the two groups, so that the fast jobs can be
> scheduled in time, and the slow jobs are not starved because the resources
> are completely allocated to the fast jobs?
>
> In this context, I need to group Spark jobs, and the tasks from different
> groups should be scheduled using their group's reserved resources. At the
> beginning of each round of scheduling, tasks in a group are scheduled
> first; only when there are no tasks left in the group can its resources be
> allocated to other groups, to avoid idle resources.
>
> For the sake of resource utilization and to avoid the overhead of managing
> multiple clusters, I hope the jobs can share one Spark cluster rather than
> creating private clusters for the groups.
>
> I've read the code of the Spark Fair Scheduler, and the implementation
> doesn't seem to meet the need of reserving resources for different groups
> of jobs.
>
> Is there a workaround that can solve this problem through the Spark Fair
> Scheduler? If it can't be solved, would you consider adding a mechanism
> like capacity scheduling?
>
> Thank you,
>
> Bowen Song

--
Best!
Qian SUN