I think it will ship with the upcoming release. We are still deciding whether, and how, to modify the sbt build, so I’d wait if you can. It’s in the feature/es5 branch, but the config is also still somewhat in flux.
On Mar 2, 2017, at 2:15 PM, Miller, Clifford <[email protected]> wrote:

I probably should have asked whether the Elasticsearch 5.x compatible branch is in a state where I could clone and build it. If it is, where can I find it?

On Thu, Mar 2, 2017 at 5:06 PM, Miller, Clifford <[email protected]> wrote:

Actually, AWS has 3 current options: 1.5, 2.3, and 5.1. So a 5.x compatible version should work. When will this 5.x compatible version be available?

On Thu, Mar 2, 2017 at 5:02 PM, Pat Ferrel <[email protected]> wrote:

Yes, PIO uses the TransportClient, and this is being deprecated by ES. PIO has a feature branch that adds support for ES5 using only the REST client. Not sure this will help, though, since I suspect AWS is not on ES5 yet.

On Mar 2, 2017, at 1:10 PM, Miller, Clifford <[email protected]> wrote:

I found some old references to folks having the same issue as me. They indicated that the AWS Elasticsearch Service only supports HTTP and not TCP. If this is true, then AWS Elasticsearch has very limited usefulness. Has anyone else run into this?

On Thu, Mar 2, 2017 at 1:26 PM, Miller, Clifford <[email protected]> wrote:

I'm able to run pio train, although "pio train -- --master spark://your_master_url" did not work. I'm using Spark on YARN, so I was able to get "pio train -- --master yarn://URL" to work after I copied the Elasticsearch configuration from my CDH cluster.

I'm still struggling with integrating this with AWS Elasticsearch. Does anyone have an example of how this should be configured? FYI, the EC2 instance that I'm running PredictionIO on can access it from the command line: "curl -X GET <AWS Elasticsearch endpoint URL>".

On Wed, Mar 1, 2017 at 11:44 AM, Donald Szeto <[email protected]> wrote:

Hi Clifford,

To use a remote Spark cluster, use passthrough command line arguments on the CLI, e.g.

    pio train -- --master spark://your_master_url

Anything after a lone -- will be passed to spark-submit verbatim. For more information try "pio help".

To use a remote Elasticsearch cluster, please refer to the examples in "conf/pio-env.sh", where you will find a variable to set the remote host name or IP of your ES cluster.

Regards,
Donald

On Tue, Feb 28, 2017 at 12:57 PM Miller, Clifford <[email protected]> wrote:

I currently have a Cloudera cluster (Hadoop, Spark, HBase...) set up on AWS. I have PredictionIO installed on a different EC2 instance. I've been able to successfully configure it to use HDFS for model storage and to store events in HBase from the cluster. Spark and Elasticsearch are installed locally on the PredictionIO EC2 instance.

I have the following questions:

How can I configure PredictionIO to utilize the Spark on the Cloudera cluster?

How can I configure PredictionIO to utilize a remote Elasticsearch domain? I'd like to use the AWS Elasticsearch service if possible.

Thanks

--
Clifford Miller
Mobile | 321.431.9089
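
For anyone finding this thread later, here is a rough sketch of what the remote Elasticsearch settings in conf/pio-env.sh might look like for the TransportClient-based releases discussed above. The variable names are as I recall them from the pio-env.sh template of that era, so check them against your own copy. The host name and cluster name are made-up placeholders. As noted earlier in the thread, the AWS Elasticsearch Service exposes HTTP only, so this applies to a self-managed cluster reachable on the transport port, not to the AWS service.

```shell
# Sketch of the Elasticsearch section of conf/pio-env.sh (assumed variable
# names; verify against the template shipped with your PredictionIO version).

# Storage source type for the source named ELASTICSEARCH.
PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch

# Placeholder values: replace with your own cluster name and remote host.
PIO_STORAGE_SOURCES_ELASTICSEARCH_CLUSTERNAME=my_es_cluster
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=es.example.internal

# 9300 is the default Elasticsearch transport port used by the TransportClient.
PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300

# Training against a remote Spark master then uses the passthrough form
# quoted in the thread:
#   pio train -- --master spark://your_master_url
```

The ES5 work in feature/es5 uses the REST client instead, so the settings there may differ once that config settles.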
