Can you elaborate on the above statement, please?

When you start YARN, you start the ResourceManager daemon only on the
resource manager node:

yarn-daemon.sh start resourcemanager

Then you start the NodeManager daemon on every node:

yarn-daemon.sh start nodemanager
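
As an aside, the stock Hadoop sbin scripts can do both steps in one go,
assuming a standard Hadoop layout with $HADOOP_HOME set and your
NodeManager hosts listed in the slaves (Hadoop 2.x) or workers (3.x) file:

$HADOOP_HOME/sbin/start-yarn.sh

Under the hood that script just runs the same yarn-daemon.sh calls shown
above, starting the ResourceManager locally and ssh-ing to each listed
host to start a NodeManager.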

A Spark app has to start somewhere, and that is SparkSubmit; that much is
deterministic. I start SparkSubmit, which talks to the YARN
ResourceManager, which in turn initialises and registers an
ApplicationMaster. The crucial point is the YARN ResourceManager, which is
basically a resource scheduler: it optimizes for cluster resource
utilization, trying to keep all resources in use all the time. However,
the ResourceManager itself runs on the resource manager node.
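
For concreteness, here is a minimal cluster-mode submission; the class
name, jar, and resource figures are placeholders, not a recommendation:

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 1g \
  --executor-memory 2g \
  --num-executors 4 \
  --class com.example.MyApp \
  myapp.jar

With --deploy-mode cluster, the driver runs inside the ApplicationMaster
container, so its host is chosen by the ResourceManager rather than by
where spark-submit was invoked.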

Now I always start my Spark app on the same node as the ResourceManager
and let YARN take care of the rest.
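
Note that this only pins the driver itself when you submit in client
mode, where the driver runs inside the submitting JVM; a sketch, with the
class and jar again placeholders:

spark-submit --master yarn --deploy-mode client --class com.example.MyApp myapp.jar

In cluster mode the driver lives in the ApplicationMaster container, whose
placement is up to YARN, which is exactly the point Jacek makes below.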

Thanks

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 7 June 2016 at 12:17, Jacek Laskowski <ja...@japila.pl> wrote:

> Hi,
>
> It's not possible. YARN uses CPU and memory for resource constraints and
> places the AM on any node available. Same for executors (unless data
> locality constrains the placement).
>
> Jacek
> On 6 Jun 2016 1:54 a.m., "Saiph Kappa" <saiph.ka...@gmail.com> wrote:
>
>> Hi,
>>
>> In yarn-cluster mode, is there any way to specify on which node I want
>> the driver to run?
>>
>> Thanks.
>>
>
