Can you elaborate on the above statement please?

When you start YARN, you start the ResourceManager daemon only on the resource manager node:

  yarn-daemon.sh start resourcemanager

Then you start the NodeManager daemons on all nodes:

  yarn-daemon.sh start nodemanager

A Spark app has to start somewhere, and that is SparkSubmit; that part is deterministic. I start SparkSubmit, which talks to the YARN ResourceManager, which in turn initialises and registers an ApplicationMaster. The crucial point is that the YARN ResourceManager is basically a resource scheduler: it optimizes cluster resource utilization to keep all resources in use all the time. However, the ResourceManager itself runs on the resource manager node. So I always start my Spark app on the same node as the resource manager node and let YARN take care of the rest.
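For concreteness, a minimal submission along these lines might look as follows, run from the resource manager node. This is only a sketch: the application jar (my-app.jar), main class (com.example.MyApp) and the memory/executor figures are illustrative placeholders, not values from this thread.

  # Run from the resource manager node. In cluster deploy mode the driver
  # runs inside the ApplicationMaster container, so YARN (not the host you
  # submit from) decides which node the driver lands on.
  spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --class com.example.MyApp \
    --driver-memory 2G \
    --executor-memory 2G \
    --num-executors 4 \
    my-app.jar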
Thanks

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com


On 7 June 2016 at 12:17, Jacek Laskowski <ja...@japila.pl> wrote:

> Hi,
>
> It's not possible. YARN uses CPU and memory for resource constraints and
> places AM on any node available. Same about executors (unless data locality
> constraints the placement).
>
> Jacek
>
> On 6 Jun 2016 1:54 a.m., "Saiph Kappa" <saiph.ka...@gmail.com> wrote:
>
>> Hi,
>>
>> In yarn-cluster mode, is there any way to specify on which node I want
>> the driver to run?
>>
>> Thanks.