[ https://issues.apache.org/jira/browse/SPARK-31437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17112245#comment-17112245 ]
Thomas Graves commented on SPARK-31437: --------------------------------------- Originally when I thought about this briefly I was thinking of adding something to ResourceProfile similar what you suggest which would be option as to whether to always create new executors using the ExecutorResourceRequests specified in the ResourceProfile or try to reuse executors that exist that fit. I think what you are proposing is along those lines. Now there are a bunch of edge cases here, like what if there aren't any exist that it would fit in. That is a good reason to keep the ExecutorResourceRequirements in the Profile and make the user specify them even if it it might fit another one. Another option is I was wanting to give resource profiles names. You could potentially use that or perhaps gives the ExecutorResourceRequirements names as well and let user specify the names as optional to use if they already exist. >> We don't have to change any existing behaviour for cases like this. Just >> boot up new executors. Note it tooks like I had a typo the 4 cpus should be 4 gpus. I guess that is ok if you are using an exactly matches option for the ExecutorResourceRequests. Otherwise if its just a fits option then it would be harder to control. Let says you have 2 active profiles, one with 8 cores and 4 gpus, one with 8 cores and then you have a third where you want to start something that uses 2 cores. You tell the third to use existing executors but that means it could use the ones with 8 cores and 4 gpus and it would waste the gpus. If you tell it to not reuse executors then it would boot up more and not do what you really want. This might be where something like the names could come in as include/exclude lists, but then that gets more complicated for user as well. So overall if we just do the exactly matches for the ExecutorResourceRequests that could be fairly straightforward. The tracking in the allocation manager could just change from per resource profile to the per ExecutorResourceRequests and I think similarly in the scheduler it could do something like that. I would just want to make sure whatever end user api we come up with would be extensible to the other cases where it will fit. > Try assigning tasks to existing executors by which required resources in > ResourceProfile are satisfied > ------------------------------------------------------------------------------------------------------ > > Key: SPARK-31437 > URL: https://issues.apache.org/jira/browse/SPARK-31437 > Project: Spark > Issue Type: Improvement > Components: Scheduler, Spark Core > Affects Versions: 3.1.0 > Reporter: Hongze Zhang > Priority: Major > > By the change in [PR|https://github.com/apache/spark/pull/27773] of > SPARK-29154, submitted tasks are scheduled onto executors only if resource > profile IDs strictly match. As a result Spark always starts new executors for > customized ResourceProfiles. > This limitation makes working with process-local jobs unfriendly. E.g. Task > cores has been increased from 1 to 4 in a new stage, and executor has 8 > slots, it is expected that 2 new tasks can be run on the existing executor > but Spark starts new executors for new ResourceProfile. The behavior is > unnecessary. -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org