For the host where we run the training, do we have to set the paths for ES_CONF_DIR and HADOOP_CONF_DIR in pio-env.sh even if we use remote ES and Hadoop clusters?
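For concreteness, here is roughly the pio-env.sh fragment I am asking about; the paths and hostnames below are placeholders for illustration, not my actual values:

    # pio-env.sh excerpt on the training host (example paths/hosts only)
    ES_CONF_DIR=/usr/local/elasticsearch/config
    HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
    HBASE_CONF_DIR=/usr/local/hbase/conf
    # storage sources point at the remote clusters in any case:
    PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
    PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=es-node-1
    PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300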
2017-03-30 22:09 GMT+04:00 Marius Rabenarivo <mariusrabenar...@gmail.com>:

> Replace Haddop by Hadoop in the previous mail
>
> 2017-03-30 22:08 GMT+04:00 Marius Rabenarivo <mariusrabenar...@gmail.com>:
>
>> For the host where we run the training, do we have to set the paths for ES_CONF_DIR and HADOOP_CONF_DIR in pio-env.sh even if we use remote ES and Hadoop clusters?
>>
>> 2017-03-30 21:58 GMT+04:00 Pat Ferrel <p...@occamsmachete.com>:
>>
>>> To run locally in the same process as pio, delete those files and do not launch Spark as a daemon; only use PIO commands.
>>>
>>> We do not “re-deploy”; we hot-swap the model that predictions are made from, so the existing deployment works with the new data automatically and without any downtime.
>>>
>>> Re-deploying means stopping the deployed process and restarting it. This is never necessary with the UR unless the engine.json config is changed.
>>>
>>> On Mar 30, 2017, at 12:47 AM, Bruno LEBON <b.le...@redfakir.fr> wrote:
>>>
>>> "Spark local setup is done in the Spark conf, it has nothing to do with PIO setup."
>>>
>>> Hi Pat,
>>>
>>> When you say the above, which files do you refer to? The "masters" and "slaves" files? So I should put localhost in those files instead of the DNS names I configured in /etc/hosts?
>>> Once that is done, will I be able to launch
>>> "nohup pio deploy --ip 0.0.0.0 --port 8001 --event-server-port 7070 --feedback --accesskey 4o4Te0AzGMYsc1m0nCgaGckl0vLHfQfYIALPleFKDXoQxKpUji2RF3LlpDc7rsVd -- --driver-memory 1G > /dev/null 2>&1 &"
>>> with my Spark cluster off?
>>>
>>> Also, I have the feeling that once the training is done, the new model is automatically deployed; is that so? In the E-Commerce Recommendation template, the log explicitly said that the model was being deployed, whereas with the Universal Recommender the log doesn't mention any automatic deploy right after the training finishes.
>>>
>>> 2017-03-29 21:25 GMT+02:00 Pat Ferrel <p...@occamsmachete.com>:
>>>
>>>> The machine running the PredictionServer should not be configured to connect to the Spark cluster.
>>>>
>>>> This is why I explained that we use a machine for training that is a Spark cluster “driver” machine. The driver machine connects to the Spark cluster but the PredictionServer should not.
>>>>
>>>> The PredictionServer should have default config that does not know how to connect to the Spark cluster. In this case it will default to running spark-submit to launch with MASTER=local, which puts Spark in the same process as the PredictionServer, and you will not get the cluster error. Note that the PredictionServer should be configured to know how to connect to Elasticsearch and HBase and optionally HDFS; only Spark needs to be local. Note also that no config in pio-env.sh needs to change: Spark local setup is done in the Spark conf, it has nothing to do with PIO setup.
>>>>
>>>> After running `pio build` and `pio train`, copy the UR directory to *the same location* on the PredictionServer. Then, with Spark set up to be local, run `pio deploy` on the PredictionServer machine. From then on, if you do not change `engine.json`, you will have newly trained models hot-swapped into all PredictionServers running the UR.
>>>>
>>>> On Mar 29, 2017, at 11:57 AM, Marius Rabenarivo <mariusrabenar...@gmail.com> wrote:
>>>>
>>>> Let me be more explicit.
>>>>
>>>> What I want to do is not use the host where the PredictionServer will run as a slave in the Spark cluster.
>>>>
>>>> When I do this I get an "Initial job has not accepted any resources" error message.
>>>>
>>>> 2017-03-29 22:18 GMT+04:00 Pat Ferrel <p...@occamsmachete.com>:
>>>>
>>>>> Yes.
>>>>>
>>>>> My answer below was needlessly verbose.
>>>>>
>>>>> On Mar 28, 2017, at 8:41 AM, Marius Rabenarivo <mariusrabenar...@gmail.com> wrote:
>>>>>
>>>>> But I want to run the driver outside the server where I'll run the PredictionServer, since Spark will only be used for launching there.
>>>>>
>>>>> Is it possible to run the driver outside the host where I'll deploy the engine? I mean, for deploying.
>>>>>
>>>>> I'm reading the Spark documentation right now to get some insight into how I can do it, but I want to know if someone has already tried something similar.
>>>>>
>>>>> 2017-03-28 19:34 GMT+04:00 Pat Ferrel <p...@occamsmachete.com>:
>>>>>
>>>>>> Spark must be installed locally (so spark-submit will work) but Spark is only used to launch the PredictionServer. No job is run on Spark for the UR during query serving.
>>>>>>
>>>>>> We typically train on a Spark driver machine that is effectively part of the Spark cluster, and deploy on a server separate from the Spark cluster. This is so that the cluster can be stopped when not training and no AWS charges are incurred.
>>>>>>
>>>>>> So yes, you can, and there are often good reasons to do so.
>>>>>>
>>>>>> See the Spark overview here: http://actionml.com/docs/intro_to_spark
>>>>>>
>>>>>> On Mar 27, 2017, at 11:48 PM, Marius Rabenarivo <mariusrabenar...@gmail.com> wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> For the pio train command, I understand that I can use another machine with PIO, the Spark driver, master and worker.
>>>>>>
>>>>>> But is it possible to deploy on a machine without Spark installed locally, given that spark-submit is used during deployment and org.apache.predictionio.workflow.CreateServer references a SparkContext?
>>>>>>
>>>>>> I'm using UR v0.4.2 and PredictionIO 0.10.0.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Marius
>>>>>>
>>>>>> P.S. I also posted in the ActionML Google group forum:
>>>>>> https://groups.google.com/forum/#!topic/actionml-user/9yNQgVIODvI
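If I read Pat's point about "Spark local setup is done in the Spark conf" correctly, a minimal sketch of the deploy host's local Spark setup would look like this; the file locations and values are my assumption, not verified:

    # On the PredictionServer host, in $SPARK_HOME/conf/:
    #   - no "masters"/"slaves" files (delete them if present)
    #   - no Spark master/worker daemons started on this host
    #   - spark-defaults.conf either has no spark.master entry at all,
    #     or sets it to local, so the spark-submit launched by `pio deploy`
    #     runs in-process instead of trying to reach the cluster:
    spark.master    local[*]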
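And to check my understanding of the overall flow, here is a rough sketch of the train-on-driver / deploy-elsewhere sequence described above; hostnames, directories, memory sizes and the access key are placeholders:

    # On the training machine (a Spark "driver" host that can reach the cluster):
    cd ~/universal-recommender
    pio build
    pio train -- --master spark://spark-master:7077 --driver-memory 4g --executor-memory 4g

    # Copy the engine directory to the same path on the PredictionServer host:
    rsync -az ~/universal-recommender deploy-host:~/

    # On the PredictionServer (local Spark left in its default config):
    cd ~/universal-recommender
    nohup pio deploy --ip 0.0.0.0 --port 8001 --event-server-port 7070 --feedback --accesskey <ACCESS_KEY> -- --driver-memory 1G > /dev/null 2>&1 &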