[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162685#comment-14162685 ]

Sandy Ryza commented on SPARK-3174:
-----------------------------------

bq.  Maybe it makes sense to just call it `spark.dynamicAllocation.*`
That sounds good to me.
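For concreteness, a rough sketch of how an application might opt in under that
namespace; only the prefix is agreed on above, the individual keys and values
below are my assumptions and still up for discussion:

{code}
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical settings under the proposed spark.dynamicAllocation.* prefix;
// the specific key names are illustrative, not settled.
val conf = new SparkConf()
  .setAppName("hive-session")
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.dynamicAllocation.minExecutors", "2")    // small floor while idle
  .set("spark.dynamicAllocation.maxExecutors", "100")  // cap what one app can grab

val sc = new SparkContext(conf)
{code}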

bq. I think in general we should limit the number of things that will affect 
adding/removing executors. Otherwise an application might get/lose many 
executors all of a sudden without a good understanding of why. Also 
anticipating what's needed in a future stage is usually fairly difficult, 
because you don't know a priori how long each stage is running. I don't see a 
good metric to decide how far in the future to anticipate for.

Consider the (common) case of a user keeping a Hive session open and setting a 
low minimum number of executors so as not to sit on cluster resources while 
idle.  Goal number one should be making queries return as fast as possible.  A 
policy that, upon receiving a job, simply requested executors with enough slots 
to handle all the tasks required by the first stage would be a vast latency and 
user-experience improvement over the exponential-increase policy.  Given that 
resource managers like YARN will mediate fairness between users and that Spark 
will be able to give executors back, there's not much advantage to being 
conservative or ramping up slowly in this case.  Accurately anticipating 
resource needs is difficult, but not necessary.
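To make that policy concrete, a minimal sketch of the sizing rule I have in
mind; the names and the cores-per-executor input are placeholders for this
example, not a proposed API:

{code}
// Minimal sketch: size the request off the first stage's task count rather
// than ramping up exponentially. All names here are illustrative.
object FirstStageSizing {
  def executorsToRequest(pendingTasks: Int,
                         coresPerExecutor: Int,
                         minExecutors: Int,
                         maxExecutors: Int): Int = {
    // Enough slots to run every pending task of the first stage at once,
    // clamped to the configured bounds.
    val wanted = math.ceil(pendingTasks.toDouble / coresPerExecutor).toInt
    math.min(maxExecutors, math.max(minExecutors, wanted))
  }

  def main(args: Array[String]): Unit = {
    // e.g. a 400-task first stage with 4 cores per executor -> ask for 100
    println(executorsToRequest(pendingTasks = 400, coresPerExecutor = 4,
                               minExecutors = 2, maxExecutors = 200))
  }
}
{code}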

> Provide elastic scaling within a Spark application
> --------------------------------------------------
>
>                 Key: SPARK-3174
>                 URL: https://issues.apache.org/jira/browse/SPARK-3174
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, YARN
>    Affects Versions: 1.0.2
>            Reporter: Sandy Ryza
>            Assignee: Andrew Or
>         Attachments: SPARK-3174design.pdf, 
> dynamic-scaling-executors-10-6-14.pdf
>
>
> A common complaint with Spark in a multi-tenant environment is that 
> applications have a fixed allocation that doesn't grow and shrink with their 
> resource needs.  We're blocked on YARN-1197 for dynamically changing the 
> resources within executors, but we can still allocate and discard whole 
> executors.
> It would be useful to have some heuristics that
> * Request more executors when many pending tasks are building up
> * Discard executors when they are idle
> See the latest design doc for more information.
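For anyone skimming the thread, a hedged sketch of the two heuristics listed in
the description above; the timeout names and the shape of the functions are my
assumptions, not the design doc's API:

{code}
// Illustrative sketch: request executors when a task backlog persists,
// release executors that have been idle too long. Names are assumptions.
object ScalingHeuristics {
  def shouldRequestMore(pendingTasks: Int,
                        backlogSeconds: Long,
                        backlogTimeoutSeconds: Long): Boolean =
    pendingTasks > 0 && backlogSeconds >= backlogTimeoutSeconds

  def executorsToRelease(idleSecondsByExecutor: Map[String, Long],
                         idleTimeoutSeconds: Long): Seq[String] =
    idleSecondsByExecutor.collect {
      case (executorId, idleSecs) if idleSecs >= idleTimeoutSeconds => executorId
    }.toSeq
}
{code}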


