[ 
https://issues.apache.org/jira/browse/SPARK-31437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084552#comment-17084552
 ] 

Hongze Zhang edited comment on SPARK-31437 at 5/20/20, 9:14 AM:
----------------------------------------------------------------

(edited)

Thanks [~tgraves]. I see your point about keeping them tied together.

Actually I was thinking of something like this:

1. Split ResourceProfile into ExecutorResourceProfile and ResourceProfile;
2. ResourceProfile still contains the resource requirements of both executor 
and task;
3. ExecutorResourceProfile only includes the executor's resource requirements;
4. ExecutorResourceProfile is required to allocate new executor instances from 
the scheduler backend;
5. Similar to the current solution, the user specifies a ResourceProfile for an 
RDD, then tasks are scheduled onto executors that are allocated using 
ExecutorResourceProfile;
6. Each time a ResourceProfile comes in, an ExecutorResourceProfile is 
created/selected according to one of several strategies.

Strategy types:

s1. Always create a new ExecutorResourceProfile;
s2. If the executor resource requirement in the ResourceProfile meets an 
existing ExecutorResourceProfile ("meets" may mean exactly matches/equals), 
use the existing one;
s3. ...
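
To make the split and the two strategies concrete, here is a rough sketch in plain Python. It is not Spark code: the class names AlwaysNew and ReuseIfEqual, the fields cores, memory_mb and task_cpus, and the use of a list index as an id are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutorResourceProfile:
    """Hypothetical: executor-side requirements only (step 3)."""
    cores: int
    memory_mb: int


@dataclass(frozen=True)
class ResourceProfile:
    """Still carries both executor and task requirements (step 2)."""
    executor: ExecutorResourceProfile
    task_cpus: int


class AlwaysNew:
    """s1: every incoming ResourceProfile gets a fresh executor profile entry."""

    def __init__(self):
        self.allocated = []

    def select(self, rp):
        self.allocated.append(rp.executor)
        # The list index stands in for a profile id (illustrative only).
        return len(self.allocated) - 1


class ReuseIfEqual:
    """s2: reuse an existing entry when the executor requirements are equal."""

    def __init__(self):
        self.allocated = []

    def select(self, rp):
        if rp.executor in self.allocated:
            return self.allocated.index(rp.executor)
        self.allocated.append(rp.executor)
        return len(self.allocated) - 1


# Two stages with identical executor requirements but different task CPUs.
execs = ExecutorResourceProfile(cores=8, memory_mb=4096)
stage1 = ResourceProfile(execs, task_cpus=1)
stage2 = ResourceProfile(execs, task_cpus=2)

s1, s2 = AlwaysNew(), ReuseIfEqual()
ids_s1 = [s1.select(stage1), s1.select(stage2)]  # s1 creates two entries
ids_s2 = [s2.select(stage1), s2.select(stage2)]  # s2 reuses the first entry
```

Under s1 the two stages end up with separate executor profiles, which mirrors today's behaviour. Under s2 they share one, so the scheduler backend would not need to allocate new executors for the second stage.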

bq. My etl tasks uses 8 cores, my ml tasks use 8 cores and 4 cpus.  How do I 
keep my etl tasks from running on the ML executors without wasting resources?

We don't have to change any existing behaviour for cases like this. Just boot 
up new executors.

For now the major problem is that even when ResourceProfile.executorResources 
is unchanged in a new ResourceProfile (e.g. only task.cpu changed from 1 to 2), 
we still have to shut down the old executors and then start new ones, right? 
This is the thing to be optimized.
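
A simplified model of that case, in plain Python rather than Spark code. The Profile class, the id counter and both matching functions are hypothetical stand-ins; the only behaviour taken from the issue is that tasks are scheduled onto executors only when profile IDs strictly match.

```python
from dataclasses import dataclass
from itertools import count

# Hypothetical id source: every new profile gets the next integer.
_next_id = count()


@dataclass(frozen=True)
class Profile:
    executor_cores: int
    task_cpus: int
    profile_id: int


def make_profile(executor_cores, task_cpus):
    return Profile(executor_cores, task_cpus, next(_next_id))


def strict_id_match(executor_profile, task_profile):
    # Models the behaviour described in the issue: ids must be identical.
    return executor_profile.profile_id == task_profile.profile_id


def executor_requirements_match(executor_profile, task_profile):
    # Models the proposed check: compare only the executor-side requirements.
    return executor_profile.executor_cores == task_profile.executor_cores


old = make_profile(executor_cores=8, task_cpus=1)
new = make_profile(executor_cores=8, task_cpus=2)

needs_new_executors_today = not strict_id_match(old, new)
could_reuse_executors = executor_requirements_match(old, new)
```

Both flags come out true here: the ids differ, so new executors are started, even though the executor-side requirements are the same.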




was (Author: zhztheplayer):
Thanks [~tgraves]. I see your point about keeping them tied together.

Actually I was thinking of something like this:

1. Split ResourceProfile into ExecutorResourceProfile and ResourceProfile;
2. ResourceProfile still contains the resource requirements of both executor 
and task;
3. ExecutorResourceProfile only includes the executor's resource requirements;
4. ExecutorResourceProfile is required to allocate new executor instances from 
the scheduler backend;
5. Similar to the current solution, the user specifies a ResourceProfile for an 
RDD, then tasks are scheduled onto executors that are allocated using 
ExecutorResourceProfile;
6. Each time a ResourceProfile comes in, an ExecutorResourceProfile is 
created/selected according to one of several strategies.

Strategy types:

s1. Always create a new ExecutorResourceProfile;
s2. If the executor resource requirement in the ResourceProfile meets an 
existing ExecutorResourceProfile, use the existing one;
s3. ...

bq. My etl tasks uses 8 cores, my ml tasks use 8 cores and 4 cpus.  How do I 
keep my etl tasks from running on the ML executors without wasting resources?

By just using strategy s1, everything should work as in the current 
implementation.



> Try assigning tasks to existing executors by which required resources in 
> ResourceProfile are satisfied
> ------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-31437
>                 URL: https://issues.apache.org/jira/browse/SPARK-31437
>             Project: Spark
>          Issue Type: Improvement
>          Components: Scheduler, Spark Core
>    Affects Versions: 3.1.0
>            Reporter: Hongze Zhang
>            Priority: Major
>
> By the change in [PR|https://github.com/apache/spark/pull/27773] of 
> SPARK-29154, submitted tasks are scheduled onto executors only if resource 
> profile IDs strictly match. As a result, Spark always starts new executors for 
> customized ResourceProfiles.
> This limitation makes working with process-local jobs unfriendly. E.g. if task 
> cores are increased from 1 to 4 in a new stage and an executor has 8 slots, 
> one would expect 2 new tasks to run on the existing executor, but Spark 
> starts new executors for the new ResourceProfile. This behavior is 
> unnecessary.
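
The arithmetic behind the "2 new tasks" figure in the description can be written out as below. This is only a sketch: concurrent_tasks is a hypothetical helper, and it assumes CPU cores are the only limiting resource on the executor.

```python
def concurrent_tasks(executor_cores: int, task_cpus: int) -> int:
    """Tasks that fit on one executor when only CPU cores are limiting."""
    if task_cpus <= 0:
        raise ValueError("task_cpus must be positive")
    # Integer division: a task cannot run on a fraction of its required cores.
    return executor_cores // task_cpus


# The example from the issue: an executor with 8 slots, task cores 1 then 4.
before = concurrent_tasks(executor_cores=8, task_cpus=1)
after = concurrent_tasks(executor_cores=8, task_cpus=4)
```

With 8 cores per executor, `before` is 8 and `after` is 2, so the existing executor could still host the new stage's tasks without being replaced.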



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

