Nick,

Check out MLeap: https://github.com/TrueCar/mleap. It's not Python, but we use it in production to serve a random forest model trained by a Spark ML pipeline.

Thanks,
Michael

> On Aug 10, 2016, at 7:50 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
>
> Are there any existing JIRAs covering the possibility of serving up Spark ML
> models via, for example, a regular Python web app?
>
> The story goes like this: You train your model with Spark on several TB of
> data, and now you want to use it in a prediction service that you’re
> building, say with Flask <http://flask.pocoo.org/>. In principle, you don’t
> need Spark anymore since you’re just passing individual data points to your
> model and looking for it to spit some prediction back.
>
> I assume this is something people do today, right? I presume Spark needs to
> run in their web service to serve up the model. (Sorry, I’m new to the ML
> side of Spark. 😅)
>
> Are there any JIRAs discussing potential improvements to this story? I did a
> search, but I’m not sure what exactly to look for. SPARK-4587
> <https://issues.apache.org/jira/browse/SPARK-4587> (model import/export)
> looks relevant, but doesn’t address the story directly.
>
> Nick