[ https://issues.apache.org/jira/browse/SPARK-10644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904831#comment-14904831 ]

Balagopal Nair commented on SPARK-10644:
----------------------------------------

I'm overallocating hardware here. 

Each machine has 4 cores, and I'm launching 3 workers per machine with 3 executors each.
I have 7 such machines, which gives:
Number of workers = 7 x 3 = 21
Number of executors (one core each) = 21 x 3 = 63
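
To make that concrete, here is the layout restated as a small sketch (the SPARK_WORKER_INSTANCES / SPARK_WORKER_CORES values in the comments are my assumption of how 3 workers per host would typically be set up in standalone mode, not something copied from the actual cluster):

{code:scala}
// Cluster layout described above, restated as arithmetic.
// Assumed spark-env.sh on each host: SPARK_WORKER_INSTANCES=3, SPARK_WORKER_CORES=3,
// i.e. 3 worker daemons per 4-core machine, each hosting 3 single-core executors
// (deliberately more advertised cores than physical ones).
val hosts = 7
val workersPerHost = 3
val executorsPerWorker = 3

val totalWorkers   = hosts * workersPerHost            // 7 x 3 = 21
val totalExecutors = totalWorkers * executorsPerWorker // 21 x 3 = 63
{code}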

I found out this week that if I change the configuration to launch just 1 worker
process per host with 9 executors, this problem does NOT show up anymore.
So this issue seems to be specific to the case where more than one Worker
process is launched per host. (I've reduced the priority of this bug to Minor
because of this.)
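
Roughly speaking, the working layout just groups the same 9 executor slots per host behind a single worker. A sketch of how I'd express that configuration (the property and variable values are my best guess at the equivalent single-worker setup, so treat them as assumptions):

{code:scala}
// Workaround layout: 1 worker per host with capacity for 9 single-core executors.
// Assumed spark-env.sh on each host: SPARK_WORKER_INSTANCES=1, SPARK_WORKER_CORES=9
import org.apache.spark.SparkConf

// Applications then ask for one-core executors so that up to 9 of them
// fit under the single worker on each host.
val conf = new SparkConf().set("spark.executor.cores", "1")
{code}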


> Applications wait even if free executors are available
> ------------------------------------------------------
>
>                 Key: SPARK-10644
>                 URL: https://issues.apache.org/jira/browse/SPARK-10644
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 1.5.0
>         Environment: RHEL 6.5 64 bit
>            Reporter: Balagopal Nair
>            Priority: Minor
>
> Number of workers: 21
> Number of executors: 63
> Steps to reproduce:
> 1. Run 4 jobs each with max cores set to 10
> 2. The first 3 jobs run with 10 each. (30 executors consumed so far)
> 3. The 4th job waits even though there are 33 idle executors.
> The reason is that a job will not get executors unless
> the total number of EXECUTORS in use < the number of WORKERS.
> If there are executors available, resources should be allocated to the 
> pending job.
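
To tie the quoted steps back to the comment above: with the 21-worker / 63-executor layout, the first three applications hold 30 executors, and since 30 is not < 21 (the worker count), the fourth application waits even though 33 executors are idle. Below is a rough sketch of one of the four applications; the app name and master URL are placeholders, and the only setting that matters for the repro is spark.cores.max:

{code:scala}
import org.apache.spark.{SparkConf, SparkContext}

// One of the four applications from the steps above; run four of these
// concurrently against the standalone master to hit the wait.
val conf = new SparkConf()
  .setAppName("spark-10644-repro")          // placeholder app name
  .setMaster("spark://<master-host>:7077")  // placeholder master URL
  .set("spark.cores.max", "10")             // cap each application at 10 cores

val sc = new SparkContext(conf)
// Without dynamic allocation, executors are held for the lifetime of the
// SparkContext, so each application keeps its 10 cores until sc.stop().
sc.parallelize(1 to 1000000).count()
{code}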


