[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173951#comment-14173951 ]
Marcelo Vanzin commented on SPARK-3174:
---------------------------------------

bq. Lets say we start the Spark application with just 2 executors. It will double the number of executors and hence goes to 4, 8 and so on.

Well, I'd say it's unusual for applications to start with a low number of executors, especially if the user knows the application will be executing things right away. So if I start it with 32 executors, your code will right away try to make it 64; Andrew's approach would try to make it 33, then 35, then... But I agree that it might be a good idea to make the auto-scaling backend an interface, so that we can easily play with different approaches. That shouldn't be hard at all.

bq. The main point being, It does all these without making any changes in TaskSchedulerImpl/TaskSetManager

Theoretically, I agree that's a good thing. I haven't gone through the code in detail, though, to know whether all the information Andrew is using from the scheduler is available from SparkListener events. If you can derive that info, great: I think it would be worth it to make the auto-scaling code decoupled from the scheduler. If not, then we have the choice of either hooking the auto-scaling backend into the scheduler (like Andrew's change) or exposing more info in the events, which may or may not be a good thing, depending on what that info is.

Anyway, as I've said, both approaches are not irreconcilably different; they're actually more similar than not.
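To make the comparison concrete, the two growth policies being debated could be sketched behind a common interface roughly like this. This is only an illustrative sketch: the names (ScalingPolicy, DoublingPolicy, RampUpPolicy) are hypothetical and not actual Spark APIs; it only models the target-count arithmetic described above (2 -> 4 -> 8 for doubling, versus 32 -> 33 -> 35 -> 39 for Andrew's exponentially growing additive step).

```scala
// Hypothetical interface for a pluggable auto-scaling backend.
// All names here are illustrative, not real Spark classes.
trait ScalingPolicy {
  /** Given the current executor count, return the next target count. */
  def nextTarget(current: Int, max: Int): Int
}

/** Doubling: 2 -> 4 -> 8 -> ... Overshoots quickly from a large start. */
object DoublingPolicy extends ScalingPolicy {
  def nextTarget(current: Int, max: Int): Int = math.min(current * 2, max)
}

/** Additive increase with a growing step: 32 -> 33 -> 35 -> 39 -> ... */
class RampUpPolicy extends ScalingPolicy {
  private var step = 1
  def nextTarget(current: Int, max: Int): Int = {
    val target = math.min(current + step, max)
    step *= 2 // each round adds twice as many executors as the last
    target
  }
}
```

Either policy could then be selected behind the interface without touching the scheduler, which is the point of making the backend pluggable.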
> Provide elastic scaling within a Spark application
> --------------------------------------------------
>
>                 Key: SPARK-3174
>                 URL: https://issues.apache.org/jira/browse/SPARK-3174
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, YARN
>    Affects Versions: 1.0.2
>            Reporter: Sandy Ryza
>            Assignee: Andrew Or
>         Attachments: SPARK-3174design.pdf, SparkElasticScalingDesignB.pdf, dynamic-scaling-executors-10-6-14.pdf
>
>
> A common complaint with Spark in a multi-tenant environment is that applications have a fixed allocation that doesn't grow and shrink with their resource needs. We're blocked on YARN-1197 for dynamically changing the resources within executors, but we can still allocate and discard whole executors.
> It would be useful to have some heuristics that
> * Request more executors when many pending tasks are building up
> * Discard executors when they are idle
> See the latest design doc for more information.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)