>> "The proper way to specify this is through "spark.master" in your config
or the "--master" parameter to spark-submit."

By "this" I mean configuring which master the driver connects to (not which
port and address the standalone Master binds to).
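
For concreteness, a minimal sketch of both forms, reusing the
spark://10.2.1.5:7077 URL from Mikhail's setup below (conf/spark-defaults.conf
is the default properties file that spark-submit reads; the exact path is an
assumption of this sketch, not something stated in the thread):

  # one-off, on the command line:
  ./bin/spark-shell --master spark://10.2.1.5:7077

  # or persistently, in conf/spark-defaults.conf:
  spark.master    spark://10.2.1.5:7077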


2014-07-08 16:43 GMT-07:00 Andrew Or <and...@databricks.com>:

> Hi Mikhail,
>
> It looks like the documentation is a little outdated. Neither statement is
> true anymore. In general, we are shifting away from short options ("-em",
> "-dm", etc.) in favor of more explicit ones ("--executor-memory",
> "--driver-memory"). Those short options, and "--cores", refer to the
> arguments passed to org.apache.spark.deploy.Client, the now-deprecated way
> of launching an application in standalone clusters.
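
For reference, the short-to-explicit option mapping implied by Mikhail's two
invocations below (a sketch; the authoritative list is whatever
./bin/spark-submit --help prints for your build):

  -c  <numCores>  ->  --total-executor-cores <numCores>
  -em <memory>    ->  --executor-memory <memory>
  -dm <memory>    ->  --driver-memory <memory>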
>
> SPARK_MASTER_IP/PORT are only used for binding the Master, not to
> configure which master the driver connects to. The proper way to specify
> this is through "spark.master" in your config or the "--master" parameter
> to spark-submit.
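
To make the contrast concrete, a sketch of the binding side using the values
from this thread (that sbin/start-master.sh reads these variables from
conf/spark-env.sh is standard standalone behavior, assumed here rather than
stated above):

  # conf/spark-env.sh -- controls only where the standalone Master listens
  export SPARK_MASTER_IP='10.2.1.5'
  export SPARK_MASTER_PORT=7077

  # the driver still needs --master (or spark.master) pointing at
  # spark://10.2.1.5:7077 in order to connect to that Master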
>
> We will update the documentation shortly. Thanks for letting us know.
> Andrew
>
>
>
> 2014-07-08 16:29 GMT-07:00 Mikhail Strebkov <streb...@gmail.com>:
>
>> Hi! I've been using a Spark build compiled from the 1.0 branch about 2
>> months ago. The setup is a standalone cluster with 4 worker machines and 1
>> master machine. I used to run the Spark shell like this:
>>
>>   ./bin/spark-shell -c 30 -em 20g -dm 10g
>>
>> Today I finally updated to the Spark 1.0 release. Now I can only run the
>> Spark shell like this:
>>
>>   ./bin/spark-shell --master spark://10.2.1.5:7077 \
>>     --total-executor-cores 30 --executor-memory 20g --driver-memory 10g
>>
>> The documentation at
>> http://spark.apache.org/docs/latest/spark-standalone.html says:
>>
>> "You can also pass an option --cores <numCores> to control the number of
>> cores that spark-shell uses on the cluster."
>> This doesn't work; you need to pass "--total-executor-cores <numCores>"
>> instead.
>>
>> "Note that if you are running spark-shell from one of the spark cluster
>> machines, the bin/spark-shell script will automatically set MASTER from
>> the
>> SPARK_MASTER_IP and SPARK_MASTER_PORT variables in conf/spark-env.sh."
>> This is not working for me too. I run the shell from the master machine,
>> and
>> I do have SPARK_MASTER_IP set up in conf/spark-env.sh like this:
>> export SPARK_MASTER_IP='10.2.1.5'
>> But if I omit "--master spark://10.2.1.5:7077", the shell starts, but I
>> can't see the application in the cluster UI at http://10.2.1.5:8080. When I
>> go to http://10.2.1.5:4040 (the application UI), I see that the app is
>> using only the master machine as a worker.
>>
>> My question is: are those just bugs in the documentation? That is, is there
>> no --cores option, and is SPARK_MASTER_IP no longer used when I run the
>> Spark shell from the master?
>>
>
>
