[ https://issues.apache.org/jira/browse/SPARK-10790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906912#comment-14906912 ]
Jonathan Kelly commented on SPARK-10790:
----------------------------------------

I can reproduce it with minExecutors=N and initialExecutors unset when the job's first stage requires <= N tasks. I just realized that for some reason I can't reproduce it with minExecutors=N and initialExecutors=N, even though that should be equivalent to minExecutors=N with initialExecutors unset, right? Anyway, please try the following:

spark-submit --conf spark.dynamicAllocation.minExecutors=1 --class org.apache.spark.examples.SparkPi spark-examples-1.5.0-hadoop2.6.0.jar 1

This one hangs for me. This one does not:

spark-submit --conf spark.dynamicAllocation.minExecutors=1 --conf spark.dynamicAllocation.initialExecutors=1 --class org.apache.spark.examples.SparkPi spark-examples-1.5.0-hadoop2.6.0.jar 1

So maybe the problem has more to do with a hang being possible when initialExecutors is left unset?

> Dynamic Allocation does not request any executors if first stage needs less
> than or equal to spark.dynamicAllocation.initialExecutors
> -------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-10790
>                 URL: https://issues.apache.org/jira/browse/SPARK-10790
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 1.5.0
>            Reporter: Jonathan Kelly
>
> If you set spark.dynamicAllocation.initialExecutors > 0 (or
> spark.dynamicAllocation.minExecutors, since
> spark.dynamicAllocation.initialExecutors defaults to
> spark.dynamicAllocation.minExecutors), and the number of tasks in the first
> stage of your job is less than or equal to this min/init number of executors,
> dynamic allocation won't actually request any executors and will just hang
> indefinitely with the warning "Initial job has not accepted any resources;
> check your cluster UI to ensure that workers are registered and have
> sufficient resources".
> The cause appears to be that ExecutorAllocationManager does not request any
> executors while the application is still initializing, but it still sets the
> initial value of numExecutorsTarget to
> spark.dynamicAllocation.initialExecutors. Once the job is running and has
> submitted its first task, if the first task does not need more than
> spark.dynamicAllocation.initialExecutors,
> ExecutorAllocationManager.updateAndSyncNumExecutorsTarget() does not think
> that it needs to request any executors, so it doesn't.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
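The interaction described in the quoted report can be sketched as a simplified model. This is illustrative Python, not Spark's actual Scala code: the class, method, and field names only mirror the ones mentioned above, and the request-suppression logic is an assumption based on the report's description.

```python
# Simplified model of the ExecutorAllocationManager behavior described
# in this report. Hypothetical pseudologic, not Spark's implementation.

class AllocationManagerModel:
    def __init__(self, min_executors, initial_executors=None):
        if initial_executors is None:
            # initialExecutors defaults to minExecutors, per the report.
            initial_executors = min_executors
        self.num_executors_target = initial_executors
        self.initializing = True   # no requests sent while initializing
        self.requested = 0         # executors actually requested so far

    def update_and_sync(self, max_needed):
        """Mimics updateAndSyncNumExecutorsTarget: a request is only
        issued when the computed target differs from the current one."""
        if self.initializing:
            return 0               # nothing is requested during init
        new_target = max(max_needed, 1)
        if new_target == self.num_executors_target:
            return 0               # target unchanged -> no request (bug path)
        self.num_executors_target = new_target
        self.requested = new_target
        return new_target

    def on_stage_submitted(self, num_tasks):
        self.initializing = False
        return self.update_and_sync(num_tasks)

# First stage needs <= initialExecutors tasks: the target never changes,
# so no executors are ever requested and the job hangs.
m = AllocationManagerModel(min_executors=1)
requested = m.on_stage_submitted(num_tasks=1)
```

In this model, `requested` stays 0 because `num_executors_target` was pre-set to the initial value before any real request was made, which matches the hang the reporter observes when the first stage needs no more than the min/initial executor count.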