This is exactly what my project is addressing. Check it out
and send me feedback or create issues at that GitHub location.

> On Aug 11, 2016, at 7:42 AM, Nicholas Chammas <> 
> wrote:
> Thanks Michael for the reference, and thanks Nick for the comprehensive 
> overview of existing JIRA discussions about this. I've added myself as a 
> watcher on the various tasks.
>> On Thu, Aug 11, 2016 at 3:02 AM Nick Pentreath <> 
>> wrote:
>> Currently there is no direct way in Spark to serve models without bringing 
>> in all of Spark as a dependency.
>> For Spark ML, there is actually no way to do it independently of DataFrames 
>> either (which for single-instance prediction makes things sub-optimal). That 
>> is covered here:
>> So, your options are (in Scala) things like MLeap, PredictionIO, or "roll 
>> your own". Or you can try to export to some other format such as PMML or 
>> PFA. Some MLlib models support PMML export, but for ML it is still missing 
>> (see
>> There is an external project for PMML too (note licensing) - 
>> - which is by now actually quite 
>> comprehensive. It shows that PMML can represent a pretty large subset of 
>> typical ML pipeline functionality.
>> On the Python side sadly there is even less - I would say your options are 
>> pretty much "roll your own" currently, or export in PMML or PFA.
>> Finally, part of the "mllib-local" idea was around enabling this local 
>> model-serving (for some initial discussion about the future see 
>> N
>>> On Thu, 11 Aug 2016 at 06:28 Michael Allman <> wrote:
>>> Nick,
>>> Check out MLeap: It's not Python, but we 
>>> use it in production to serve a random forest model trained by a Spark ML 
>>> pipeline.
>>> Thanks,
>>> Michael
>>>> On Aug 10, 2016, at 7:50 PM, Nicholas Chammas <> 
>>>> wrote:
>>>> Are there any existing JIRAs covering the possibility of serving up Spark 
>>>> ML models via, for example, a regular Python web app?
>>>> The story goes like this: You train your model with Spark on several TB of 
>>>> data, and now you want to use it in a prediction service that you’re 
>>>> building, say with Flask. In principle, you don’t need Spark anymore since 
>>>> you’re just passing individual data points to your model and looking for 
>>>> it to spit some prediction back.
>>>> I assume this is something people do today, right? I presume Spark needs 
>>>> to run in their web service to serve up the model. (Sorry, I’m new to the 
>>>> ML side of Spark. 😅)
>>>> Are there any JIRAs discussing potential improvements to this story? I did 
>>>> a search, but I’m not sure what exactly to look for. SPARK-4587 (model 
>>>> import/export) looks relevant, but doesn’t address the story directly.
>>>> Nick
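
To make the "roll your own" option from the thread concrete, here is a minimal sketch in plain Python of scoring a single data point without Spark. It assumes the trained model is a logistic regression whose coefficients and intercept have already been copied out into JSON. The JSON layout, the example weights, and the names load_model and predict_proba are all hypothetical and are not a Spark export format or API.

```python
import json
import math

# Hypothetical export: weights copied out of a trained linear model.
# This JSON layout is an assumption for illustration, not a Spark format.
MODEL_JSON = '{"coefficients": [0.8, -1.2, 0.05], "intercept": -0.3}'


def load_model(text):
    """Parse the exported parameters into plain Python values."""
    params = json.loads(text)
    coefficients = [float(c) for c in params["coefficients"]]
    return coefficients, float(params["intercept"])


def predict_proba(coefficients, intercept, features):
    """Logistic regression score for one data point: sigmoid(w . x + b)."""
    if len(features) != len(coefficients):
        raise ValueError(
            "expected %d features, got %d" % (len(coefficients), len(features))
        )
    margin = intercept + sum(w * x for w, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-margin))


# Score one data point using only the standard library.
coefficients, intercept = load_model(MODEL_JSON)
probability = predict_proba(coefficients, intercept, [1.0, 0.5, 2.0])
print(probability)
```

A function like predict_proba could be called from a Flask request handler. The sketch covers only a single linear model: any feature transformations from the training pipeline would have to be reimplemented by hand, which is the gap that projects such as MLeap and formats such as PMML or PFA aim to close.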
