[ https://issues.apache.org/jira/browse/SPARK-10790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906912#comment-14906912 ]

Jonathan Kelly commented on SPARK-10790:
----------------------------------------

I can reproduce it with minExecutors=N and initialExecutors=unset and having 
the job's first stage require <= N tasks. I just realized that for some reason 
I can't reproduce it with minExecutors=N and initialExecutors=N, even though 
that should be the same as minExecutors=N and initialExecutors=unset, right?

Anyway, please try the following:

spark-submit --conf spark.dynamicAllocation.minExecutors=1 \
  --class org.apache.spark.examples.SparkPi \
  spark-examples-1.5.0-hadoop2.6.0.jar 1

This one hangs for me.

This one does not:

spark-submit --conf spark.dynamicAllocation.minExecutors=1 \
  --conf spark.dynamicAllocation.initialExecutors=1 \
  --class org.apache.spark.examples.SparkPi \
  spark-examples-1.5.0-hadoop2.6.0.jar 1

So maybe the problem is that the hang only becomes possible when 
initialExecutors is left unset?
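For reference, the reported behavior boils down to a simple arithmetic trap 
in the allocation target. Here is a minimal sketch of that logic in Python 
(an illustrative model only, not the actual Scala code in 
ExecutorAllocationManager; the function and parameter names are hypothetical):

```python
# Illustrative model of the bug described in this issue: the target starts
# at initialExecutors (which defaults to minExecutors), and executors are
# only requested when the stage needs MORE than the current target.

def initial_target(min_executors, initial_executors=None):
    # initialExecutors defaults to minExecutors when left unset
    if initial_executors is None:
        initial_executors = min_executors
    return initial_executors

def update_target(num_executors_target, max_needed):
    # max_needed models the executors needed for pending + running tasks.
    # If the first stage needs <= the current target, delta <= 0 and no
    # request is ever sent -- but no executors were launched during
    # initialization either, so the job hangs.
    delta = max_needed - num_executors_target
    if delta > 0:
        return num_executors_target + delta  # would request `delta` executors
    return num_executors_target  # requests nothing

# SparkPi with 1 task and minExecutors=1: target starts at 1,
# delta = 1 - 1 = 0, so nothing is requested and the job stalls.
target = initial_target(min_executors=1)
target = update_target(target, max_needed=1)
```

The sketch shows why a first stage with more tasks than minExecutors does 
not hang: delta becomes positive and a request goes out.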

> Dynamic Allocation does not request any executors if first stage needs less 
> than or equal to spark.dynamicAllocation.initialExecutors
> -------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-10790
>                 URL: https://issues.apache.org/jira/browse/SPARK-10790
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 1.5.0
>            Reporter: Jonathan Kelly
>
> If you set spark.dynamicAllocation.initialExecutors > 0 (or 
> spark.dynamicAllocation.minExecutors, since 
> spark.dynamicAllocation.initialExecutors defaults to 
> spark.dynamicAllocation.minExecutors), and the number of tasks in the first 
> stage of your job is less than or equal to this min/init number of executors, 
> dynamic allocation won't actually request any executors and will just hang 
> indefinitely with the warning "Initial job has not accepted any resources; 
> check your cluster UI to ensure that workers are registered and have 
> sufficient resources".
> The cause appears to be that ExecutorAllocationManager does not request any 
> executors while the application is still initializing, but it still sets the 
> initial value of numExecutorsTarget to 
> spark.dynamicAllocation.initialExecutors. Once the job is running and has 
> submitted its first task, if the first task does not need more than 
> spark.dynamicAllocation.initialExecutors, 
> ExecutorAllocationManager.updateAndSyncNumExecutorsTarget() does not think 
> that it needs to request any executors, so it doesn't.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
