[
https://issues.apache.org/jira/browse/SPARK-10644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933422#comment-14933422
]
Madhusudanan Kandasamy commented on SPARK-10644:
------------------------------------------------
One possible reason could be not setting SPARK_WORKER_MEMORY and
spark.executor.memory.
The default worker memory is total memory minus 1 GB, i.e. 5.7 GB in your
case, and the default executor memory is 1 GB, which would allow 5 executors
per node, for a total of 35 executors (7 * 5), not 63.
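That arithmetic can be sketched as follows; the 6.7 GB node total is an assumption inferred from "total memory minus 1 GB i.e. 5.7 GB", not a value from the report:

```python
# Sketch of the executor-count arithmetic above (assumed values, not measured).
node_total_gb = 6.7                    # assumed physical memory per node
worker_mem_gb = node_total_gb - 1.0    # standalone default: total minus 1 GB
executor_mem_gb = 1.0                  # default spark.executor.memory (1 GB)
nodes = 7

# Each node can host only as many executors as fit in its worker memory.
executors_per_node = int(worker_mem_gb // executor_mem_gb)
total_executors = nodes * executors_per_node
print(executors_per_node, total_executors)  # 5 executors/node, 35 total
```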
Just to make sure we are on the same page, you have 7 nodes with the following
env setup:
SPARK_WORKER_INSTANCES=3
SPARK_WORKER_CORES=3
SPARK_WORKER_MEMORY=??
App configuration:
spark.executor.cores=1
spark.cores.max=10
spark.executor.memory=512m
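To rule this out, both values can be set explicitly rather than relying on the defaults. A minimal sketch (the 2g value is illustrative, not a recommendation; the other values are taken from the setup above):

```shell
# conf/spark-env.sh on each standalone worker node
export SPARK_WORKER_INSTANCES=3
export SPARK_WORKER_CORES=3
export SPARK_WORKER_MEMORY=2g   # set explicitly instead of the "total minus 1 GB" default

# conf/spark-defaults.conf on the application side:
#   spark.executor.cores    1
#   spark.cores.max         10
#   spark.executor.memory   512m
```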
> Applications wait even if free executors are available
> ------------------------------------------------------
>
> Key: SPARK-10644
> URL: https://issues.apache.org/jira/browse/SPARK-10644
> Project: Spark
> Issue Type: Bug
> Components: Scheduler
> Affects Versions: 1.5.0
> Environment: RHEL 6.5 64 bit
> Reporter: Balagopal Nair
> Priority: Minor
>
> Number of workers: 21
> Number of executors: 63
> Steps to reproduce:
> 1. Run 4 jobs each with max cores set to 10
> 2. The first 3 jobs run with 10 each. (30 executors consumed so far)
> 3. The 4th job waits even though there are 33 idle executors.
> The reason is that a job will not get executors unless
> the total number of EXECUTORS in use < the number of WORKERS
> If there are executors available, resources should be allocated to the
> pending job.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)