Re: Why are executors on slave never used?

2015-09-22 Thread Joshua Fox
Thank you Hemant and Andrew, I got it working.

On Mon, Sep 21, 2015 at 11:48 PM, Andrew Or wrote:

> Hi Joshua,
>
> What cluster manager are you using, standalone or YARN? (Note that
> standalone here does not mean local mode).
>
> If standalone, you need to do `setMaster("spark://[CLUSTER_URL]:7077")`,
> where CLUSTER_URL is the machine that started the standalone Master. If
> YARN, you need to do `setMaster("yarn")`, assuming that all the Hadoop
> configuration files such as core-site.xml are already set up properly.
>
> -Andrew
>
>
> 2015-09-21 8:53 GMT-07:00 Hemant Bhanawat:
>
>> When you specify the master as local[2], it starts all the Spark components in a
>> single JVM. You need to specify the master correctly.
>>
>>> I have a default AWS EMR cluster (1 master, 1 slave) with Spark. When I
>>> run a Spark process, it works fine -- but only on the master, as if it
>>> were standalone.
>>>
>>> The web UI and my logging code show only 1 executor, the localhost.
>>>
>>> How can I diagnose this?
>>>
>>> (I create SparkConf, in Python, with setMaster('local[2]').)
>>>
>>> (Strangely, though I don't think that this causes the problem, there is
>>> almost nothing Spark-related on the slave machine: /usr/lib/spark has a
>>> few jars, but that's it: datanucleus-api-jdo.jar, datanucleus-core.jar,
>>> datanucleus-rdbms.jar, spark-yarn-shuffle.jar. But this is an AWS EMR
>>> cluster as created by create-cluster, so I would assume that the slave
>>> and master are configured OK out of the box.)
>>>
>>> Joshua
>>
>
>


Re: Why are executors on slave never used?

2015-09-21 Thread Hemant Bhanawat
When you specify the master as local[2], it starts all the Spark components in a
single JVM. You need to specify the master correctly.
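For illustration only -- a minimal PySpark sketch of that local[2] behaviour (the
app name is a placeholder, not something from this thread):

    from pyspark import SparkConf, SparkContext

    # local[2]: the driver and both executor threads live inside this one JVM,
    # so no slave node is ever contacted.
    conf = SparkConf().setAppName("local-demo").setMaster("local[2]")
    sc = SparkContext(conf=conf)
    print(sc.defaultParallelism)  # 2 -- everything runs in-process
    sc.stop()
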
> I have a default AWS EMR cluster (1 master, 1 slave) with Spark. When I run
> a Spark process, it works fine -- but only on the master, as if it were
> standalone.
>
> The web UI and my logging code show only 1 executor, the localhost.
>
> How can I diagnose this?
>
> (I create SparkConf, in Python, with setMaster('local[2]').)
>
> (Strangely, though I don't think that this causes the problem, there is
> almost nothing Spark-related on the slave machine: /usr/lib/spark has a
> few jars, but that's it: datanucleus-api-jdo.jar, datanucleus-core.jar,
> datanucleus-rdbms.jar, spark-yarn-shuffle.jar. But this is an AWS EMR
> cluster as created by create-cluster, so I would assume that the slave
> and master are configured OK out of the box.)
>
> Joshua


Re: Why are executors on slave never used?

2015-09-21 Thread Andrew Or
Hi Joshua,

What cluster manager are you using, standalone or YARN? (Note that
standalone here does not mean local mode).

If standalone, you need to do `setMaster("spark://[CLUSTER_URL]:7077")`,
where CLUSTER_URL is the machine that started the standalone Master. If
YARN, you need to do `setMaster("yarn")`, assuming that all the Hadoop
configuration files such as core-site.xml are already set up properly.
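
For instance, a minimal PySpark sketch of both options (the host name and app name
below are placeholders, not values from this thread):

    from pyspark import SparkConf, SparkContext

    # Standalone: point at the machine running the standalone Master (port 7077 by default).
    conf = SparkConf().setAppName("example").setMaster("spark://master-hostname:7077")

    # Or YARN, provided the Hadoop config (core-site.xml etc.) is visible to the driver.
    # (The 1.x releases of that era documented "yarn-client" / "yarn-cluster" as the values.)
    # conf = SparkConf().setAppName("example").setMaster("yarn")

    sc = SparkContext(conf=conf)
    print(sc.parallelize(range(1000)).count())  # should now run on the cluster's executors
    sc.stop()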

-Andrew


2015-09-21 8:53 GMT-07:00 Hemant Bhanawat:

> When you specify the master as local[2], it starts all the Spark components in a
> single JVM. You need to specify the master correctly.
>
>> I have a default AWS EMR cluster (1 master, 1 slave) with Spark. When I
>> run a Spark process, it works fine -- but only on the master, as if it
>> were standalone.
>>
>> The web UI and my logging code show only 1 executor, the localhost.
>>
>> How can I diagnose this?
>>
>> (I create SparkConf, in Python, with setMaster('local[2]').)
>>
>> (Strangely, though I don't think that this causes the problem, there is
>> almost nothing Spark-related on the slave machine: /usr/lib/spark has a
>> few jars, but that's it: datanucleus-api-jdo.jar, datanucleus-core.jar,
>> datanucleus-rdbms.jar, spark-yarn-shuffle.jar. But this is an AWS EMR
>> cluster as created by create-cluster, so I would assume that the slave
>> and master are configured OK out of the box.)
>>
>> Joshua
>