[ https://issues.apache.org/jira/browse/SPARK-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340195#comment-14340195 ]

Thomas Graves commented on SPARK-6050:
--------------------------------------

Thanks for investigating this more.

This is because CPU scheduling isn't turned on on that Hadoop cluster.  When 
it's not turned on, YARN defaults every container to 1 core.
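
For reference, on a stock Hadoop 2.x CapacityScheduler cluster, CPU scheduling 
is enabled by switching the resource calculator in capacity-scheduler.xml (the 
default DefaultResourceCalculator only considers memory, which is why the 
vcores come back as 1):

    <property>
      <name>yarn.scheduler.capacity.resource-calculator</name>
      <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
    </property>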

@mridulm are you requesting 2 cores in your config?  I didn't see it in your 
example SparkPi command, but going by this log line I assume you are:

15/02/27 06:37:33 INFO YarnAllocator: Will request 1 executor containers, each 
with 2 cores and 32870 MB memory including 2150 MB overhead

If that is the case, just remove the config or change it to 1 core.  Other than 
that, I don't know that there is anything Spark can do, since it doesn't know 
how Hadoop is configured.  The change that was made is that we now use more of 
the Hadoop AMClient, which adds the matching for the container requests.  We 
just weren't checking that the cores matched before, in Spark 1.2.
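
For example, either of the following avoids the mismatch (standard 
spark-submit flags; adjust to however the setting is actually supplied, e.g. 
spark-defaults.conf):

    ./bin/spark-submit --master yarn-cluster --executor-cores 1 ...
    # or omit --executor-cores / spark.executor.cores entirely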

We could theoretically look at the Hadoop configs, but that could get pretty 
hairy quickly: there are different schedulers, and each scheduler handles 
memory/cores differently.  Many clusters also don't have the scheduler configs 
on gateway boxes.  One thing we should do is add better logging as to why the 
requests don't match, but the match routine is in the Hadoop AMClient code, so 
it's mostly us printing exactly what came back in the allocate response.  I 
can also file a Hadoop JIRA to log information when a request doesn't match.  
The other option would be to write our own match routine.
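
A rough sketch of the logging idea (hypothetical, not what YarnAllocator does 
today; getMatchingRequests is the existing AMRMClient API):

    // In the allocate-response handler: warn when an allocated container
    // matches no outstanding request, which is what happens when YARN
    // normalizes the vcores in our ask.
    val anyHost = "*"
    val matching = amClient.getMatchingRequests(
      container.getPriority, anyHost, container.getResource)
    if (matching.isEmpty) {
      logWarning(s"Container ${container.getId} with resource " +
        s"${container.getResource} matched no outstanding request; " +
        "is CPU scheduling enabled on this cluster?")
    }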

Thoughts?


> Spark on YARN does not work when --executor-cores is specified
> --------------------------------------------------------------
>
>                 Key: SPARK-6050
>                 URL: https://issues.apache.org/jira/browse/SPARK-6050
>             Project: Spark
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 1.3.0
>         Environment: 2.5 based YARN cluster.
>            Reporter: Mridul Muralidharan
>            Priority: Blocker
>
> There are multiple issues here (which I will detail as comments), but to 
> reproduce: running the following ALWAYS hangs in our cluster with the 1.3 RC:
>
> ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
>   --master yarn-cluster --executor-cores 8 --num-executors 15 \
>   --driver-memory 4g --executor-memory 2g --queue webmap \
>   lib/spark-examples*.jar 10


