Yeah, that seems to be the case. Dynamically resizing a standalone Spark cluster looks very simple.
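For reference, what I'm running is roughly the following (a rough sketch: spark://<MASTER_HOST>:7077 stands in for the actual master URL, and I'm assuming Spark is installed under $SPARK_HOME on every box):

    # on the new instance, register a worker with the running master
    $SPARK_HOME/sbin/start-slave.sh spark://<MASTER_HOST>:7077

    # on a worker that should leave the cluster
    $SPARK_HOME/sbin/stop-slave.sh

The new worker shows up on the master UI at <MASTER_HOST>:8080 once it registers, and drops off again after stop-slave.sh.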
Thanks!

On Mon, Mar 28, 2016 at 10:22 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> start-all.sh starts the master and anything else in the slaves file.
> start-master.sh starts the master only.
>
> I use start-slaves.sh for my purposes, with the added nodes in the slaves file.
>
> When you run start-slave.sh <MASTER_IP_ADD> you are creating another worker
> process on the master host. You can check the status on the Spark GUI on
> <HOST>:8080. Depending on the ratio of memory/cores for the worker process,
> the additional worker may or may not be used.
>
> Dr Mich Talebzadeh
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> On 28 March 2016 at 22:58, Sung Hwan Chung <coded...@cs.stanford.edu> wrote:
>
>> It seems that the conf/slaves file is only for consumption by the
>> following scripts:
>>
>> sbin/start-slaves.sh
>> sbin/stop-slaves.sh
>> sbin/start-all.sh
>> sbin/stop-all.sh
>>
>> I.e., the conf/slaves file doesn't affect a running cluster.
>>
>> Is this true?
>>
>> On Mon, Mar 28, 2016 at 9:31 PM, Sung Hwan Chung <coded...@cs.stanford.edu> wrote:
>>
>>> No, I didn't add it to the conf/slaves file.
>>>
>>> What I want to do is leverage auto-scaling on AWS, without needing to
>>> stop all the slaves (e.g. if a lot of slaves are idle, terminate those).
>>>
>>> Also, the bookkeeping is easier if I don't have to deal with a
>>> centralized slaves list that needs to be modified every time a node is
>>> added/removed.
>>>
>>> On Mon, Mar 28, 2016 at 9:20 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>
>>>> Have you added the slave host name to $SPARK_HOME/conf/slaves?
>>>>
>>>> Then you can use start-slaves.sh or stop-slaves.sh for all instances.
>>>>
>>>> The assumption is that the slave boxes have Spark installed in the same
>>>> directory as $SPARK_HOME on the master.
>>>>
>>>> HTH
>>>>
>>>> Dr Mich Talebzadeh
>>>>
>>>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>>
>>>> http://talebzadehmich.wordpress.com
>>>>
>>>> On 28 March 2016 at 22:06, Sung Hwan Chung <coded...@cs.stanford.edu> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I found that I could dynamically add/remove workers to a running
>>>>> standalone Spark cluster by simply triggering:
>>>>>
>>>>> start-slave.sh (SPARK_MASTER_ADDR)
>>>>>
>>>>> and
>>>>>
>>>>> stop-slave.sh
>>>>>
>>>>> E.g., I could instantiate a new AWS instance and just add it to a
>>>>> running cluster without needing to add it to the slaves file and
>>>>> restart the whole cluster. It seems that there's no need for me to
>>>>> stop a running cluster.
>>>>>
>>>>> Is this a valid way of dynamically resizing a Spark cluster (as of
>>>>> now, I'm not concerned about HDFS)? Or will there be certain unforeseen
>>>>> problems if nodes are added/removed this way?