Hi Clifford,

To use a remote Spark cluster, pass passthrough command-line arguments on the CLI, e.g.

    pio train -- --master spark://your_master_url

Anything after a lone -- will be passed to spark-submit verbatim. For more information, try "pio help".

To use a remote Elasticsearch cluster, please refer to the examples in "conf/pio-env.sh", where you will find variables to set the remote host name or IP of your ES cluster.

Regards,
Donald

On Tue, Feb 28, 2017 at 12:57 PM Miller, Clifford <[email protected]> wrote:

> I currently have a Cloudera cluster (Hadoop, Spark, HBase...) set up on AWS.
> I have PredictionIO installed on a different EC2 instance. I've been able
> to successfully configure it to use HDFS for model storage and to store
> events in HBase from the cluster. Spark and Elasticsearch are installed
> locally on the PredictionIO EC2 instance. I have the following questions:
>
> How can I configure PredictionIO to utilize the Spark on the Cloudera
> cluster?
>
> How can I configure PredictionIO to utilize a remote Elasticsearch
> domain? I'd like to use the AWS Elasticsearch service if possible.
>
> Thanks
>
> --
> Clifford Miller
> Mobile | 321.431.9089
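[Editor's sketch] Donald's two suggestions can be combined as below. This is an illustrative fragment only: the exact variable names are documented in the comments of your own conf/pio-env.sh and can differ between PredictionIO versions, and the Elasticsearch hostname and Spark master URL are placeholders, not real endpoints.

```shell
# conf/pio-env.sh -- illustrative fragment; check the comments in your own
# pio-env.sh for the exact variable names your PredictionIO version uses.

# Point the Elasticsearch storage source at a remote cluster instead of
# localhost (hostname and port below are placeholders).
PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=your-es-domain.us-east-1.es.amazonaws.com
PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9200

# Then submit training to the remote Spark master; everything after the
# lone "--" is passed to spark-submit verbatim:
#   pio train -- --master spark://your_master_url
```

Note that the AWS Elasticsearch service exposes HTTPS on port 443 rather than the default 9200, so the port value may need adjusting for that setup.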
