I’ve got a series of applications using a single standalone Spark cluster 
(v1.4.1). The cluster has 1 master and 4 workers (4 CPUs per worker node). I 
am using the start-slave.sh script to launch the worker process on each node, 
and I can see in the SparkUI that the nodes registered successfully. When I 
launch one of my applications, though, regardless of what I set spark.cores.max 
to when instantiating the SparkContext in the driver app, a single worker gets 
assigned to the application and to all of the jobs that run under it. For 
example, if I set spark.cores.max to 16, the SparkUI shows a single worker 
taking the whole load, with "4 (16 Used)" in the Cores column. How do I get my 
jobs to run across multiple nodes in the cluster?
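
For reference, here is a rough sketch of how the driver sets spark.cores.max when 
creating the SparkContext (the master URL below is a placeholder, and the memory 
setting mirrors the 28 GB shown in the UI):

    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch of the driver-side configuration; the master URL is a placeholder.
    val conf = new SparkConf()
      .setAppName("App1")
      .setMaster("spark://<master-host>:7077")
      .set("spark.cores.max", "16")        // total cores requested across the cluster
      .set("spark.executor.memory", "28g") // memory requested per executor

    val sc = new SparkContext(conf)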

Here’s a snippet from the SparkUI (IP addresses removed for privacy):

Workers
Worker Id                        Address    State  Cores        Memory
worker-20150920064814-***-33659  ***:33659  ALIVE  4 (0 Used)   28.0 GB (0.0 B Used)
worker-20151012175609-***-37399  ***:37399  ALIVE  4 (16 Used)  28.0 GB (28.0 GB Used)
worker-20151012181934-***-36573  ***:36573  ALIVE  4 (4 Used)   28.0 GB (28.0 GB Used)
worker-20151030170514-***-45368  ***:45368  ALIVE  4 (0 Used)   28.0 GB (0.0 B Used)
Running Applications
Application ID           Name  Cores  Memory per Node  Submitted Time       User  State    Duration
app-20151102194733-0278  App1  16     28.0 GB          2015/11/02 19:47:33  ***   RUNNING  2 s
app-20151102164156-0274  App2  4      28.0 GB          2015/11/02 16:41:56  ***   RUNNING  3.1 h
Jeff

