Time taken for PIO Training

2017-11-21 Thread Abhimanyu Nagrath
Hi, I am using pio V0.12.0 (HBase 1.2.6, Elasticsearch 5.2.1, Spark 2.6.1). I am using this template: https://github.com/EmergentOrder/template-scala-probabilistic-classifier-batch-lbfgs. I spawned two servers, each configured with 244 GB RAM and 16 cores. On 1 server I uploaded 1 million events wit

Re: pio batchpredict fix & changes

2017-11-21 Thread Mars Hall
Howdy Donald, Indeed this PR trades batch prediction scalability (via Spark) for compatibility (with all models). I'm not convinced this is a good trade-off. I also had a variation of the PR working that simply reuses the model's SparkContext for queries, if it hasn't already been stopped. That's w

unsubscribe

2017-11-21 Thread Ali, Syed
Thank you _ Syed Y Ali Principal Architect | Digital Engineering and Mobile Solutions Walgreen Co. | 1419 Lake Cook Rd, 2nd Floor, MS# L292, Deerfield, IL 60645 Telephone 847 964 8727 | Mobile 224 226 6305 Member of Walgreens Boots Alliance This email messa

Re: "pio app delete" breaks my PIO installation

2017-11-21 Thread Pat Ferrel
I’ve seen this happen on my dev machine (a mac laptop) when I use `pio-start-all`, which I never use in production. My check that everything is running includes `pio status`, but this only tests the metadata service and the main store (Elasticsearch and HBase in my case). It by no means examines all service

Re: "pio app delete" breaks my PIO installation

2017-11-21 Thread Donald Szeto
Hey Noelia, What event storage backend are you using? Is the backend storage stuck? This could happen very often with a local, single-node HBase installation. Regards, Donald On Mon, Nov 13, 2017 at 7:51 AM, Noelia Osés Fernández wrote: > I forgot to mention that *pio status* reports my system

Re: Pio eventserver returning incorrect result

2017-11-21 Thread Donald Szeto
Hi Rasna, Sorry for the late reply. The event server models log data as a series of immutable events and thus does not support updates (except deletes) of entity IDs. The event server itself does not maintain any cache between its REST API and the backend storage (HBase, etc). It is simply an adapter

Re: Hardware Configuration for Binary Classification using PIO

2017-11-21 Thread Donald Szeto
Hi Sachin, 1. I would highly encourage you to adopt the template, and upgrade and maintain it to track future PIO releases if that's something you like to do. Otherwise, you may want to consider following http://predictionio.apache.org/templates/classification/quickstart/ and see if your use case

Re: pio batchpredict fix & changes

2017-11-21 Thread Donald Szeto
Hi Mars, Thanks for the PR! I am still reviewing the code change, but at a high level it will take away the ability to run "batchpredict" remotely on a Spark cluster + HDFS/S3 setup, and require extra steps of downloading input and uploading output files for such a setup. It will unlikely scale t

Re: Log-likelihood based correlation test?

2017-11-21 Thread Pat Ferrel
No, PtP non-zero elements have LLR calculated. The highest scores in the row are kept, or ones above some threshold; the rest are removed as “noise”. These are put into the Elasticsearch model without scores. Elasticsearch compares the similarity of the user history to each item in the model t

Re: Log-likelihood based correlation test?

2017-11-21 Thread Noelia Osés Fernández
Pat, If I understood your explanation correctly, you say that some elements of PtP are removed by the LLR (set to zero, to be precise). But the elements that survive are calculated by matrix multiplication. The final PtP is put into Elasticsearch and when we query for user recommendations ES uses
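
Note on the LLR thread above: Pat Ferrel's reply describes scoring each non-zero PtP cell with a log-likelihood ratio, keeping only the highest scores in a row (or those above a threshold), and storing the surviving items without their scores. The following is a minimal sketch of that idea, assuming the common Dunning G² formulation over a 2x2 contingency table. The function names (`llr`, `sparsify_row`), the parameters `top_k` and `threshold`, and the meaning of the four counts are illustrative assumptions, not code taken from the thread or from any PredictionIO template.

```python
import math


def x_log_x(x):
    # Convention: 0 * log(0) is treated as 0.
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    # Unnormalized entropy term used in the G^2 statistic.
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    # Assumed 2x2 contingency table for items A and B:
    #   k11: users with both A and B     k12: users with A but not B
    #   k21: users with B but not A      k22: users with neither
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp at zero to absorb small negative floating-point error.
    return max(0.0, 2.0 * (row + col - mat))


def sparsify_row(scores, top_k, threshold=0.0):
    # scores: {item_id: llr_score} for one row of the co-occurrence matrix.
    # Keep items scoring above the threshold, then the top_k strongest.
    kept = [(item, s) for item, s in scores.items() if s > threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    # Only item ids are returned: the scores are dropped, mirroring the
    # "put into the model without scores" step described in the reply.
    return [item for item, _ in kept[:top_k]]
```

In this sketch a table whose rows are proportional (no association between the two items) scores approximately zero and would be dropped as noise, while strongly associated pairs score higher and survive the per-row cut.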