About deployment/serving

SPIP
https://issues.apache.org/jira/browse/SPARK-26247


________________________________
From: Riccardo Ferrari <ferra...@gmail.com>
Sent: Tuesday, January 22, 2019 8:07 AM
To: User
Subject: I have trained a ML model, now what?

Hi list!

I am writing here to hear about your experience putting Spark ML models into 
production at scale.

I know it is a very broad topic with many different facets depending on the 
use case, the requirements, the user base and whatever else is involved in the 
task. Still, I'd like to open a thread about it, because it is as important as 
properly training a model and I feel it is often neglected.

The task is serving web users with predictions and the main challenge I see is 
making it agile and swift.

I think there are mainly 3 general categories of such deployments, which can 
be described as:

  *   Offline/Batch: Load a model, perform the inference, store the results in 
some datastore (DB, indexes,...)
  *   Spark in the loop: Have a long-running Spark context exposed in some 
way. This includes streaming as well as custom applications that wrap the 
context.
  *   Use a different technology to load the Spark MLlib model and run the 
inference pipeline. I have read about MLeap and other PMML-based solutions.
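
To make the first category concrete, here is a minimal PySpark sketch of what 
I mean by Offline/Batch. It assumes the model was saved as a PipelineModel and 
that both input and output are Parquet; the function name batch_score and its 
three path arguments are placeholders of mine, not part of any existing setup.

```python
def batch_score(model_path, input_path, output_path):
    """Load a saved pipeline, score a dataset, and persist the predictions.

    All three paths are hypothetical; Parquet is assumed only for illustration.
    """
    # Imported lazily so this module can be loaded without a Spark install.
    from pyspark.sql import SparkSession
    from pyspark.ml import PipelineModel

    spark = SparkSession.builder.appName("batch-scoring").getOrCreate()
    try:
        # Load the fitted pipeline (feature stages plus the trained model).
        model = PipelineModel.load(model_path)
        # Read the records to score.
        features = spark.read.parquet(input_path)
        # Run inference; transform() appends the prediction columns.
        predictions = model.transform(features)
        # Store the results where the web tier can look them up later.
        predictions.write.mode("overwrite").parquet(output_path)
    finally:
        spark.stop()
```

A scheduler would call this periodically, and the web application would only 
ever read the stored predictions, so Spark stays out of the request path.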

I would love to hear about open-source solutions, ideally ones that do not 
require cloud-provider-specific frameworks or components.

Again, I am aware that each of the previous categories has benefits and 
drawbacks, so what would you pick? Why? And how?

Thanks!
