Hi Shashidhar,

Our team at PredictionIO is trying to solve the production deployment of models. We built a Spark-powered framework (also certified on Spark by Databricks) that allows a user to build models with everything available from the Spark API, persist the model automatically with versioning, and deploy it as a REST service using simple CLI commands.
Regarding model degeneration and updates: if a downtime of half a second to a couple of seconds is acceptable, with PIO one could simply run "pio train" and "pio deploy" periodically with a cronjob. To achieve virtually zero downtime, a load balancer could be set up in front of 2 "pio deploy" instances.

Porting your current algorithm / model generation to PredictionIO should just be a copy-and-paste procedure. We would be very grateful for any feedback that would improve the deployment process.

We do not support PMML at the moment, but we are definitely interested in your use case.

You may get started with the documentation (http://docs.prediction.io/). You could also visit the engine template gallery (https://templates.prediction.io/) for quick, ready-to-use examples. PredictionIO is open source software under APL2 on https://github.com/PredictionIO/PredictionIO.

Looking forward to hearing your feedback!

Best Regards,
Donald

On Sat, Mar 21, 2015 at 10:40 AM, Shashidhar Rao <raoshashidhar...@gmail.com> wrote:

> Hi,
>
> Apologies for the generic question.
>
> I am developing predictive models for the first time, and the model will
> be deployed in production very soon.
>
> Could somebody help me with model deployment in production? I have read
> quite a bit on model deployment and have read some books on database
> deployment.
>
> My queries relate to how updates to the model happen without any downtime
> when the current model degenerates, how others are deploying on production
> servers, and a few lines on the adoption of PMML currently in production.
>
> Please provide me with some good links or forums so that I can learn, as
> most of the books do not cover this extensively except for 'Mahout in
> Action', where it is explained in some detail. I have also checked
> Stack Overflow but have not found any relevant answers.
>
> What I understand:
> 1. Build the model using the current training set and test the model.
> 2. Deploy the model: put it in some location, load it, and predict when a
>    request comes in for scoring.
> 3. The model degenerates; now build a new model with new data. (Here there
>    is some confusion: is the old data discarded completely, so that training
>    uses purely new data, or is a mix used?)
> 4. Here I am stuck: how to update the model without any downtime during the
>    transition period between the old model and the new model.
>
> My naive solution would be to build the new model, save it in a new
> location, and update the new path in some properties file, or update the
> location in a database, once the saving is done. Is this correct, or are
> some best practices available?
> A database is unlikely in my case.
>
> Thanks in advance.

-- 
Donald Szeto
PredictionIO
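
The "save the new model in a new location, then update the path" idea from the quoted question can be sketched roughly as below. This is only an illustration in Python, not how PredictionIO handles it. The function names `publish_model` and `load_current_model`, the `CURRENT` pointer file, and the use of pickle are all assumptions made for the example. The point it tries to show is that the pointer is swapped with `os.replace`, which is atomic on POSIX when both paths are on the same filesystem, so a reader should see either the old path or the new one and never a half-written file.

```python
import os
import pickle
import tempfile


def publish_model(model, model_dir, version):
    """Save `model` under a new versioned file, then repoint CURRENT at it.

    Hypothetical helper for illustration; names and layout are assumptions.
    """
    os.makedirs(model_dir, exist_ok=True)
    model_path = os.path.join(model_dir, f"model-{version}.pkl")

    # 1. Write the new model completely before any reader can find it.
    with open(model_path, "wb") as f:
        pickle.dump(model, f)

    # 2. Write the pointer to a temp file in the same directory, then swap
    #    it into place. os.replace overwrites the old pointer in one step.
    fd, tmp_path = tempfile.mkstemp(dir=model_dir)
    with os.fdopen(fd, "w") as f:
        f.write(model_path)
    os.replace(tmp_path, os.path.join(model_dir, "CURRENT"))
    return model_path


def load_current_model(model_dir):
    """Load whichever model the CURRENT pointer names at this moment."""
    with open(os.path.join(model_dir, "CURRENT")) as f:
        model_path = f.read().strip()
    with open(model_path, "rb") as f:
        return pickle.load(f)
```

A scoring process could call `load_current_model` on a timer or on a reload signal and keep serving the previously loaded model until the new one is in memory. Older model files stay on disk under this layout, which leaves room for a rollback by repointing `CURRENT`.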