In the thread below I answered this. 

"Note that the PredictionServer should be configured to know how to connect to 
Elasticsearch and HBase and optionally HDFS; only Spark needs to be local. Note 
also that no config in pio-env.sh needs to change. Spark local setup is done in 
the Spark conf; it has nothing to do with PIO setup."



On Mar 30, 2017, at 11:14 AM, Marius Rabenarivo <mariusrabenar...@gmail.com> 
wrote:

For the host where we run the training, do we have to put the path to 
ES_CONF_DIR and HADOOP_CONF_DIR in pio-env.sh even if we use remote ES and 
Hadoop clusters?


2017-03-30 21:58 GMT+04:00 Pat Ferrel <p...@occamsmachete.com 
<mailto:p...@occamsmachete.com>>:
To run locally in the same process as pio, delete those files and do not launch 
Spark as a daemon; only use PIO commands.

We do not “re-deploy”; we hot-swap the model that predictions are made from, so 
the existing deployment works with the new data automatically and without any 
down-time.

Re-deploying means stopping the deployed process and restarting it. This is 
never necessary with the UR unless engine.json config is changed.


On Mar 30, 2017, at 12:47 AM, Bruno LEBON <b.le...@redfakir.fr 
<mailto:b.le...@redfakir.fr>> wrote:

"Spark local setup is done in the Spark conf, it has nothing to do with PIO 
setup."

Hi Pat,

So when you say the above, which files do you refer to? The "masters" and 
"slaves" files? Should I put localhost in those files instead of the DNS 
names I configured in /etc/hosts?
Once this is done, will I be able to launch 
"nohup pio deploy --ip 0.0.0.0 --port 8001 --event-server-port 7070 --feedback 
--accesskey 4o4Te0AzGMYsc1m0nCgaGckl0vLHfQfYIALPleFKDXoQxKpUji2RF3LlpDc7rsVd -- 
--driver-memory 1G > /dev/null 2>&1 &"
with my Spark cluster off?

Also, I have the feeling that once training is done, the new model is 
automatically deployed. Is that so? In the E-Commerce Recommendation template, 
the log explicitly said that the model was being deployed, whereas with the 
Universal Recommender the log doesn't mention any automatic deploy right after 
training finishes.

 


2017-03-29 21:25 GMT+02:00 Pat Ferrel <p...@occamsmachete.com 
<mailto:p...@occamsmachete.com>>:
The machine running the PredictionServer should not be configured to connect to 
the Spark cluster.

This is why I explained that we use a machine for training that is a Spark 
cluster “driver” machine. The driver machine connects to the Spark cluster but 
the PredictionServer should not.

The PredictionServer should have default config that does not know how to 
connect to the Spark cluster. In this case it will default to running 
spark-submit with MASTER=local, which puts Spark in the same process as the 
PredictionServer, and you will not get the cluster error. Note that the 
PredictionServer should be configured to know how to connect to Elasticsearch 
and HBase and optionally HDFS; only Spark needs to be local. Note also that no 
config in pio-env.sh needs to change; Spark local setup is done in the Spark 
conf, it has nothing to do with PIO setup.
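As a rough illustration of "Spark local setup is done in the Spark conf", the local Spark install on the PredictionServer might look like this. The file contents are a hedged sketch, not taken from the thread:

```conf
# $SPARK_HOME/conf/spark-defaults.conf on the PredictionServer (sketch)
# Run Spark in-process when spark-submit is invoked by `pio deploy`:
spark.master    local[*]
# Do not set spark.master to spark://<cluster-master>:7077 here;
# pointing at the cluster is what makes the PredictionServer try to join it.
```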

After running `pio build` and `pio train`, copy the UR directory to *the same 
location* on the PredictionServer. Then, with Spark set up to be local, on the 
PredictionServer machine run `pio deploy`. From then on, if you do not change 
`engine.json`, you will have newly trained models hot-swapped into all 
PredictionServers running the UR.
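A command-level sketch of that flow, assuming a hypothetical engine directory /opt/universal-recommender and a hypothetical hostname prediction-server:

```
# On the Spark driver machine, inside the UR engine directory:
pio build
pio train

# Copy the engine directory to *the same path* on the PredictionServer:
rsync -a /opt/universal-recommender/ prediction-server:/opt/universal-recommender/

# On the PredictionServer (with Spark configured for local mode):
cd /opt/universal-recommender && pio deploy
```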


On Mar 29, 2017, at 11:57 AM, Marius Rabenarivo <mariusrabenar...@gmail.com 
<mailto:mariusrabenar...@gmail.com>> wrote:

Let me be more explicit.

What I want is to avoid using the host where the PredictionServer will run as a 
slave in the Spark cluster.

When I do this, I get an "Initial job has not accepted any resources" error message.

2017-03-29 22:18 GMT+04:00 Pat Ferrel <p...@occamsmachete.com 
<mailto:p...@occamsmachete.com>>:
yes

My answer below was needlessly verbose.


On Mar 28, 2017, at 8:41 AM, Marius Rabenarivo <mariusrabenar...@gmail.com 
<mailto:mariusrabenar...@gmail.com>> wrote:

But I want to run the driver outside the server where I'll run the 
PredictionServer, since Spark will only be used for launching there.

Is it possible to run the driver outside the host where I'll deploy the engine? 
I mean for deploying.

I'm reading the Spark documentation right now to get insight into how I can 
do it, but I want to know if someone has tried something similar.

2017-03-28 19:34 GMT+04:00 Pat Ferrel <p...@occamsmachete.com 
<mailto:p...@occamsmachete.com>>:
Spark must be installed locally (so spark-submit will work) but Spark is only 
used to launch the PredictionServer. No job is run on Spark for the UR during 
query serving.

We typically train on a Spark driver machine that is effectively part of the 
Spark cluster, and deploy on a server separate from the Spark cluster. This is 
so the cluster can be stopped when not training and no AWS charges are incurred.

So yes you can and often there are good reasons to do so.

See the Spark overview here: http://actionml.com/docs/intro_to_spark 
<http://actionml.com/docs/intro_to_spark>


On Mar 27, 2017, at 11:48 PM, Marius Rabenarivo <mariusrabenar...@gmail.com 
<mailto:mariusrabenar...@gmail.com>> wrote:

Hello,

For the pio train command, I understand that I can use another machine with 
PIO, Spark Driver, Master and Worker.

But, is it possible to deploy on a machine without Spark locally installed, 
since spark-submit is used during deployment and 
org.apache.predictionio.workflow.CreateServer
references a SparkContext?

I'm using UR v0.4.2 and PredictionIO 0.10.0

Regards,

Marius

P.S. I also posted in the ActionML Google group forum : 
https://groups.google.com/forum/#!topic/actionml-user/9yNQgVIODvI 
<https://groups.google.com/forum/#!topic/actionml-user/9yNQgVIODvI>










