[ 
https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169856#comment-14169856
 ] 

Andrew Or commented on SPARK-3174:
----------------------------------

[~sandyr]
bq. Consider the (common) case of a user keeping a Hive session open and 
setting a low number of minimum executors in order to not sit on cluster 
resources when idle. Goal number 1 should be making queries return as fast as 
possible. A policy that, upon receiving a job, simply requested executors with 
enough slots to handle all the tasks required by the first stage would be a 
vast latency and user experience improvement over the exponential increase 
policy. Given that resource managers like YARN will mediate fairness between 
users and that Spark will be able to give executors back, there's not much 
advantage to being conservative or ramping up slowly in this case. Accurately 
anticipating resource needs is difficult, but not necessary.

Yes, in this case we may want to acquire many executors quickly, but the 
exponential increase model can already do that, because we expose a config 
that regulates how often executors are added. Slow-start is not actually slow 
if we shorten the interval between successive rounds of adding executors. I 
just think this model gives the application more control and flexibility than 
one where you get either all of the executors or very few of them.
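
To make the point about the interval concrete, here is a rough sketch of how 
ramp-up time could scale with it. This is not Spark's implementation: the 
function and parameter names are hypothetical, the doubling schedule 
(1, 2, 4, ... executors per round) is an assumed reading of "exponential 
increase", and it pretends the resource manager grants every request 
immediately and that each round costs one full interval.

```python
def rounds_to_reach(target_executors, start_executors=0):
    """Count the add-rounds needed to reach the target when each round
    requests twice as many executors as the previous one (1, 2, 4, ...).
    Hypothetical model, not taken from the Spark code base."""
    current = start_executors
    to_add = 1
    rounds = 0
    while current < target_executors:
        # Never request more than what is still missing.
        current += min(to_add, target_executors - current)
        to_add *= 2
        rounds += 1
    return rounds


def ramp_up_seconds(target_executors, add_interval_seconds, start_executors=0):
    """Time to reach the target, assuming one add-round per interval and
    instant grants from the resource manager (both are simplifications)."""
    return rounds_to_reach(target_executors, start_executors) * add_interval_seconds


if __name__ == "__main__":
    # Same policy, two hypothetical settings of the add interval.
    for interval in (60.0, 1.0):
        print(interval, ramp_up_seconds(100, interval))
```

Under these assumptions, reaching 100 executors from zero takes 7 rounds 
regardless of the interval, so the interval alone decides whether ramp-up 
takes minutes or seconds.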

> Provide elastic scaling within a Spark application
> --------------------------------------------------
>
>                 Key: SPARK-3174
>                 URL: https://issues.apache.org/jira/browse/SPARK-3174
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, YARN
>    Affects Versions: 1.0.2
>            Reporter: Sandy Ryza
>            Assignee: Andrew Or
>         Attachments: SPARK-3174design.pdf, 
> dynamic-scaling-executors-10-6-14.pdf
>
>
> A common complaint with Spark in a multi-tenant environment is that 
> applications have a fixed allocation that doesn't grow and shrink with their 
> resource needs.  We're blocked on YARN-1197 for dynamically changing the 
> resources within executors, but we can still allocate and discard whole 
> executors.
> It would be useful to have some heuristics that
> * Request more executors when many pending tasks are building up
> * Discard executors when they are idle
> See the latest design doc for more information.



