Can you check in your RM's web UI how much of each resource YARN
thinks it has available? You can also check that directly in the YARN
configuration.
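
(If it's easier, the YARN CLI can also report per-node capacity.
Something along these lines should work on Hadoop 2.x, though the
exact flags and output format may vary by version:

  yarn node -list
  yarn node -status <node-id from the list output>

The status output should include the total memory and vcores the RM
thinks that node has.)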

Perhaps it's not configured to use all of the available resources. (If
it was set up with Cloudera Manager, CM reserves some room for the
daemons that need to run on each machine, so it won't tell YARN to
make all 32 cores / 64 GB available for applications.)
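
For reference, the properties that control what each NodeManager
advertises to the RM are (names as of Hadoop 2.x):

  yarn.nodemanager.resource.memory-mb       (total MB each node offers to containers)
  yarn.nodemanager.resource.cpu-vcores      (total vcores each node offers)
  yarn.scheduler.maximum-allocation-mb      (memory cap for a single container)
  yarn.scheduler.maximum-allocation-vcores  (vcore cap for a single container)

A single container request can't exceed the maximum-allocation values
either.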

Also remember that Spark needs to start "num executors + 1" containers
when adding up all the needed resources. The extra container (the
application master) generally requires fewer resources than the
executors, but it still has to be allocated by the RM.
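
To put rough numbers on the submit command below (the overhead figure
is approximate; it's controlled by spark.yarn.executor.memoryOverhead
and rounded up to YARN's allocation increment):

  3 executors x 32 vcores          =  96 vcores
  3 executors x (32g + overhead)   ~  3 x ~33 GB
  + 1 AM container on top of that

So each of the three slaves has to be able to grant one 32-vcore /
~33 GB container, and whichever node hosts the AM needs a bit of extra
room on top of that. If the NodeManagers advertise anything less
(because of the CM reservation, for instance), the third executor will
simply never be scheduled, which would match the two registered
executors in the log below.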



On Tue, Nov 18, 2014 at 10:03 AM, Alan Prando <a...@scanboo.com.br> wrote:
> Hi Folks!
>
> I'm running Spark on YARN cluster installed with Cloudera Manager Express.
> The cluster has 1 master and 3 slaves, each machine with 32 cores and 64G
> RAM.
>
> My Spark job is working fine; however, it seems that just 2 of 3 slaves are
> working (htop shows 2 slaves working at 100% on 32 cores, and 1 slave without
> any processing).
>
> I'm using this command:
> ./spark-submit --master yarn --num-executors 3 --executor-cores 32
> --executor-memory 32g feature_extractor.py -r 390
>
> Additionally, Spark's log shows communication with only 2 slaves:
> 14/11/18 17:19:38 INFO YarnClientSchedulerBackend: Registered executor:
> Actor[akka.tcp://sparkExecutor@ip-172-31-13-180.ec2.internal:33177/user/Executor#-113177469]
> with ID 1
> 14/11/18 17:19:38 INFO RackResolver: Resolved ip-172-31-13-180.ec2.internal
> to /default
> 14/11/18 17:19:38 INFO YarnClientSchedulerBackend: Registered executor:
> Actor[akka.tcp://sparkExecutor@ip-172-31-13-179.ec2.internal:51859/user/Executor#-323896724]
> with ID 2
> 14/11/18 17:19:38 INFO RackResolver: Resolved ip-172-31-13-179.ec2.internal
> to /default
> 14/11/18 17:19:38 INFO BlockManagerMasterActor: Registering block manager
> ip-172-31-13-180.ec2.internal:50959 with 16.6 GB RAM
> 14/11/18 17:19:39 INFO BlockManagerMasterActor: Registering block manager
> ip-172-31-13-179.ec2.internal:53557 with 16.6 GB RAM
> 14/11/18 17:19:51 INFO YarnClientSchedulerBackend: SchedulerBackend is ready
> for scheduling beginning after waiting maxRegisteredResourcesWaitingTime:
> 30000(ms)
>
> Is there a configuration to run Spark jobs on the YARN cluster using all
> slaves?
>
> Thanks in advance! =]
>
> ---
> Regards
> Alan Vidotti Prando.
>
>



-- 
Marcelo
