Re: Why are executors on slave never used?
Thank you Hemant and Andrew, I got it working.

On Mon, Sep 21, 2015 at 11:48 PM, Andrew Or wrote:

> Hi Joshua,
>
> What cluster manager are you using, standalone or YARN? (Note that
> standalone here does not mean local mode.)
>
> If standalone, you need to do `setMaster("spark://[CLUSTER_URL]:7077")`,
> where CLUSTER_URL is the machine that started the standalone Master. If
> YARN, you need to do `setMaster("yarn")`, assuming that all the Hadoop
> configuration files such as core-site.xml are already set up properly.
>
> -Andrew
>
> 2015-09-21 8:53 GMT-07:00 Hemant Bhanawat:
>
>> When you specify the master as local[2], it starts the Spark components
>> in a single JVM. You need to specify the master correctly.
>>
>>> I have a default AWS EMR cluster (1 master, 1 slave) with Spark. When
>>> I run a Spark process, it works fine -- but only on the master, as if
>>> it were standalone.
>>>
>>> The web UI and logging code show only 1 executor, the localhost.
>>>
>>> How can I diagnose this?
>>>
>>> (I create SparkConf, in Python, with setMaster('local[2]').)
>>>
>>> (Strangely, though I don't think this causes the problem, there is
>>> almost nothing Spark-related on the slave machine: /usr/lib/spark has
>>> a few jars, but that's it: datanucleus-api-jdo.jar,
>>> datanucleus-core.jar, datanucleus-rdbms.jar, spark-yarn-shuffle.jar.
>>> But this is an AWS EMR cluster as created by create-cluster, so I
>>> would assume that the slave and master are configured OK out of the
>>> box.)
>>>
>>> Joshua
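For later readers, here is a minimal PySpark sketch of the change Andrew
describes, assuming the default EMR setup where Spark runs on YARN (the
app name and the sanity-check job are illustrative, not Joshua's actual
code):

    # Minimal sketch, assuming Spark on YARN (the default on EMR).
    # Replacing the local[2] master with "yarn" is what allows executors
    # to be launched on the slave nodes instead of one local JVM.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName("example").setMaster("yarn")
    sc = SparkContext(conf=conf)

    # Sanity check: run a job that touches every partition. With a
    # correct master, the web UI should list executors on worker nodes.
    print(sc.parallelize(range(100), 4).sum())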
Re: Why are executors on slave never used?
When you specify the master as local[2], it starts the Spark components in
a single JVM. You need to specify the master correctly.

> I have a default AWS EMR cluster (1 master, 1 slave) with Spark. When I
> run a Spark process, it works fine -- but only on the master, as if it
> were standalone.
>
> The web UI and logging code show only 1 executor, the localhost.
>
> How can I diagnose this?
>
> (I create SparkConf, in Python, with setMaster('local[2]').)
>
> (Strangely, though I don't think this causes the problem, there is
> almost nothing Spark-related on the slave machine: /usr/lib/spark has a
> few jars, but that's it: datanucleus-api-jdo.jar, datanucleus-core.jar,
> datanucleus-rdbms.jar, spark-yarn-shuffle.jar. But this is an AWS EMR
> cluster as created by create-cluster, so I would assume that the slave
> and master are configured OK out of the box.)
>
> Joshua
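To make the point concrete, here is a minimal sketch (assuming a plain
PySpark install; the printed checks are illustrative): with a local[N]
master, the driver and N worker threads all live in one local JVM, so no
executors are ever requested from the cluster.

    # Minimal sketch: with local[2], Spark runs entirely inside this one
    # JVM process -- 2 worker threads, no executors on other machines.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName("local-demo").setMaster("local[2]")
    sc = SparkContext(conf=conf)

    # sc.master shows which master URL is actually in effect; printing
    # it is a quick first diagnostic when a job stays on one node.
    print(sc.master)             # -> local[2]
    print(sc.defaultParallelism) # 2 here: one task slot per local thread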
Re: Why are executors on slave never used?
Hi Joshua,

What cluster manager are you using, standalone or YARN? (Note that
standalone here does not mean local mode.)

If standalone, you need to do `setMaster("spark://[CLUSTER_URL]:7077")`,
where CLUSTER_URL is the machine that started the standalone Master. If
YARN, you need to do `setMaster("yarn")`, assuming that all the Hadoop
configuration files such as core-site.xml are already set up properly.

-Andrew

2015-09-21 8:53 GMT-07:00 Hemant Bhanawat:

> When you specify the master as local[2], it starts the Spark components
> in a single JVM. You need to specify the master correctly.
>
>> I have a default AWS EMR cluster (1 master, 1 slave) with Spark. When I
>> run a Spark process, it works fine -- but only on the master, as if it
>> were standalone.
>>
>> The web UI and logging code show only 1 executor, the localhost.
>>
>> How can I diagnose this?
>>
>> (I create SparkConf, in Python, with setMaster('local[2]').)
>>
>> (Strangely, though I don't think this causes the problem, there is
>> almost nothing Spark-related on the slave machine: /usr/lib/spark has a
>> few jars, but that's it: datanucleus-api-jdo.jar, datanucleus-core.jar,
>> datanucleus-rdbms.jar, spark-yarn-shuffle.jar. But this is an AWS EMR
>> cluster as created by create-cluster, so I would assume that the slave
>> and master are configured OK out of the box.)
>>
>> Joshua
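In PySpark, those two options look like the sketch below (a minimal
illustration; `[CLUSTER_URL]` is a placeholder for the host running the
standalone Master, and the app name is arbitrary):

    # Minimal sketch of the two master settings described above.
    from pyspark import SparkConf

    # Standalone cluster manager: point directly at the standalone
    # Master's host and port.
    standalone_conf = (SparkConf()
                       .setAppName("my-app")
                       .setMaster("spark://[CLUSTER_URL]:7077"))

    # YARN: the ResourceManager is located via the Hadoop configuration
    # files (core-site.xml etc.), so no host or port appears in the URL.
    yarn_conf = (SparkConf()
                 .setAppName("my-app")
                 .setMaster("yarn"))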