[ https://issues.apache.org/jira/browse/SPARK-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Or closed SPARK-8881.
----------------------------
    Resolution: Fixed

> Standalone mode scheduling fails because cores assignment is not atomic
> -----------------------------------------------------------------------
>
>                 Key: SPARK-8881
>                 URL: https://issues.apache.org/jira/browse/SPARK-8881
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy
>    Affects Versions: 1.4.0, 1.5.0
>            Reporter: Nishkam Ravi
>            Assignee: Nishkam Ravi
>            Priority: Critical
>             Fix For: 1.4.2, 1.5.0
>
>
> Current scheduling algorithm (in Master.scala) has two issues:
> 1. cores are allocated one at a time instead of spark.executor.cores at a time
> 2. when spark.cores.max/spark.executor.cores < num_workers, executors are not
>    launched and the app hangs (due to 1)
>
> === Edit by Andrew ===
> Here's an example from the PR. Let's say we have 4 workers with 16 cores
> each. We set `spark.cores.max` to 48 and `spark.executor.cores` to 16.
> Because in spread out mode, the existing code allocates 1 core at a time, we
> end up allocating 12 cores on each worker, and no executors can be launched
> because each one wants at least 16 cores. Instead, we should allocate 16
> cores at a time.
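
The arithmetic in the 4-worker example can be checked with a small simulation. This is only an illustrative sketch, not the code in Master.scala: the function `spread_out` and its parameters are hypothetical names, and the model assumes a plain round-robin over workers that hands out `step` cores per visit until `spark.cores.max` is used up.

```python
def spread_out(free_cores, cores_max, cores_per_executor, step):
    """Toy model of spread-out scheduling (hypothetical, not Spark's code).

    free_cores         -- free cores on each worker
    cores_max          -- stands in for spark.cores.max
    cores_per_executor -- stands in for spark.executor.cores
    step               -- cores handed to a worker per round-robin visit
    Returns (cores assigned per worker, executors launchable per worker).
    """
    assigned = [0] * len(free_cores)
    remaining = min(cores_max, sum(free_cores))
    progress = True
    while remaining >= step and progress:
        progress = False
        for i, free in enumerate(free_cores):
            # Give this worker `step` cores only if both the app budget
            # and the worker's free capacity allow it.
            if remaining >= step and free - assigned[i] >= step:
                assigned[i] += step
                remaining -= step
                progress = True
    # An executor can start only on a worker holding a full
    # cores_per_executor share.
    executors = [a // cores_per_executor for a in assigned]
    return assigned, executors


workers = [16, 16, 16, 16]

# One core at a time: 12 cores land on every worker, no executor fits.
print(spread_out(workers, 48, 16, step=1))
# ([12, 12, 12, 12], [0, 0, 0, 0])

# spark.executor.cores at a time: three workers get one executor each.
print(spread_out(workers, 48, 16, step=16))
# ([16, 16, 16, 0], [1, 1, 1, 0])
```

With `step=1` all 48 cores are assigned, yet no worker holds the 16 cores an executor needs, which matches the hang described in issue 2. With `step=16` the same budget yields three executors.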