Re: Specify node where driver should run

2016-06-07 Thread Mich Talebzadeh
Thanks. This is getting a bit confusing. I have these modes for using Spark:
1. Spark local: all on the same host --> --master local[n]. No need to start master and slaves. Uses resources as you submit the job.
2. Spark Standalone: use the simple cluster manager included with Spark
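For illustration, a minimal sketch of the submit command in each of these two modes; the class name, jar and master host below are placeholders, not taken from the thread:

  # Local mode: driver and executors run inside a single JVM on this host, using n threads
  ${SPARK_HOME}/bin/spark-submit --master local[4] --class org.example.MyApp myapp.jar

  # Standalone mode: point the job at the cluster manager started with sbin/start-master.sh
  ${SPARK_HOME}/bin/spark-submit --master spark://master-host:7077 --class org.example.MyApp myapp.jar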

Re: Specify node where driver should run

2016-06-07 Thread Sebastian Piu
If you run that job then the driver will ALWAYS run on the machine from which you are issuing the spark-submit command (e.g. some edge node with the clients installed), no matter where the resource manager is running. If you change yarn-client to yarn-cluster then your driver will start
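A sketch of the two invocations being contrasted here; the application class and jar are placeholders:

  # yarn-client: the driver runs on the machine issuing spark-submit (e.g. an edge node)
  ${SPARK_HOME}/bin/spark-submit --master yarn-client --class org.example.MyApp myapp.jar

  # yarn-cluster: the driver runs inside the YARN application master on one of the node managers
  ${SPARK_HOME}/bin/spark-submit --master yarn-cluster --class org.example.MyApp myapp.jar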

Re: Specify node where driver should run

2016-06-07 Thread Jacek Laskowski
Hi, --master yarn-client is deprecated and you should use --master yarn --deploy-mode client instead. There are two deploy-modes: client (default) and cluster. See http://spark.apache.org/docs/latest/cluster-overview.html. Regards, Jacek Laskowski https://medium.com/@jaceklaskowski/
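For reference, the non-deprecated form of the same two submissions; class and jar names are placeholders:

  # client deploy mode (the default): driver stays on the submitting machine
  ${SPARK_HOME}/bin/spark-submit --master yarn --deploy-mode client --class org.example.MyApp myapp.jar

  # cluster deploy mode: driver runs inside the YARN application master
  ${SPARK_HOME}/bin/spark-submit --master yarn --deploy-mode cluster --class org.example.MyApp myapp.jar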

Re: Specify node where driver should run

2016-06-07 Thread Mich Talebzadeh
OK thanks. So I start SparkSubmit or a similar Spark app on the Yarn resource manager node. What you are stating is that Yarn may decide to start the driver program on another node, as opposed to the resource manager node.
${SPARK_HOME}/bin/spark-submit \
  --driver-memory=4G \
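The command above is cut off; a fuller sketch of such a cluster-mode submit might look like the following, where everything after --driver-memory is an assumed placeholder rather than part of the original post:

  ${SPARK_HOME}/bin/spark-submit \
    --driver-memory=4G \
    --master yarn \
    --deploy-mode cluster \
    --class org.example.MyApp \
    myapp.jar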

Re: Specify node where driver should run

2016-06-07 Thread Sebastian Piu
What you are explaining is right for yarn-client mode, but the question is about yarn-cluster, in which case the Spark driver is also submitted and run on one of the node managers. On Tue, 7 Jun 2016, 13:45 Mich Talebzadeh wrote: > can you elaborate on the above

Re: Specify node where driver should run

2016-06-07 Thread Mich Talebzadeh
Can you elaborate on the above statement please? When you start yarn you start the resource manager daemon only on the resource manager node:
yarn-daemon.sh start resourcemanager
Then you start nodemanager daemons on all nodes:
yarn-daemon.sh start nodemanager
A Spark app has to start
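The start-up sequence being described would look roughly like this; which host each command runs on is noted in the comments, and the app class and jar are placeholders:

  # On the resource manager node
  $HADOOP_HOME/sbin/yarn-daemon.sh start resourcemanager

  # On every worker node
  $HADOOP_HOME/sbin/yarn-daemon.sh start nodemanager

  # Then submit the Spark app from whichever machine has the client installed
  ${SPARK_HOME}/bin/spark-submit --master yarn-cluster --class org.example.MyApp myapp.jar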

Re: Specify node where driver should run

2016-06-07 Thread Jacek Laskowski
Hi, It's not possible. YARN uses CPU and memory for resource constraints and places the AM on any available node. The same applies to executors (unless data locality constrains the placement). Jacek On 6 Jun 2016 1:54 a.m., "Saiph Kappa" wrote: > Hi, > > In yarn-cluster mode, is

Re: Specify node where driver should run

2016-06-07 Thread Mich Talebzadeh
By default the driver will start where you have started sbin/start-master.sh; that is where you start your app with SparkSubmit. The slaves have to have an entry in the slaves file. What is the issue here? Dr Mich Talebzadeh
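A sketch of the standalone setup being referred to, with placeholder worker host names:

  # conf/slaves on the master host lists one worker host per line, e.g.
  #   worker1
  #   worker2

  # Start the master on this host, then the workers listed in conf/slaves
  ${SPARK_HOME}/sbin/start-master.sh
  ${SPARK_HOME}/sbin/start-slaves.sh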

Re: Specify node where driver should run

2016-06-06 Thread Bryan Cutler
I'm not an expert on YARN so anyone please correct me if I'm wrong, but I believe the Resource Manager will schedule the application's AM on any node that has a Node Manager, depending on available resources. So you would normally query the RM via the REST API to determine that.
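A hedged example of that REST call; the host, port and application id are placeholders. The JSON response for an application includes an amHostHttpAddress field, which shows the node hosting the application master (and hence the driver in yarn-cluster mode):

  # Ask the Resource Manager about a specific application
  curl http://resourcemanager-host:8088/ws/v1/cluster/apps/application_1465000000000_0001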

Re: Specify node where driver should run

2016-06-06 Thread Saiph Kappa
How can I specify the node where the application master should run in the yarn conf? I haven't found any useful information regarding that. Thanks. On Mon, Jun 6, 2016 at 4:52 PM, Bryan Cutler wrote: > In that mode, it will run on the application master, whichever node that > is

Re: Specify node where driver should run

2016-06-06 Thread Bryan Cutler
In that mode, it will run on the application master, whichever node that is, as specified in your yarn conf. On Jun 5, 2016 4:54 PM, "Saiph Kappa" wrote: > Hi, > > In yarn-cluster mode, is there any way to specify on which node I want the > driver to run? > > Thanks. >

Specify node where driver should run

2016-06-05 Thread Saiph Kappa
Hi, In yarn-cluster mode, is there any way to specify on which node I want the driver to run? Thanks.