I'm fairly new to Spark and MLlib, but I'm doing some research into
multi-tenancy for an MLlib-based app. The idea is to train models on demand
under certain constraints (executor size) and then serve predictions from
those models via a REST layer.

So far from my research I've gathered the following:

1) It's fairly easy to schedule training jobs and define the executor size
for each job with something like Spark Job Server or from the command line.
I'd imagine you need separate contexts here anyway, because one big context
shared among different tenants won't, for the most part, allow training
different models in parallel. So the solution here seems to be a context per
tenant, with training submitted via Spark Job Server (sketch below).
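
To make that concrete, here is roughly the kind of training job I have in
mind. This is only a sketch against the spark-jobserver SparkJob trait as I
understand it from its docs; TrainModelJob and the input.path / model.path
config keys are my own placeholders:

import com.typesafe.config.Config
import org.apache.spark.SparkContext
import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
import org.apache.spark.mllib.util.MLUtils
import spark.jobserver.{SparkJob, SparkJobValid, SparkJobValidation}

object TrainModelJob extends SparkJob {
  // a real job would check the config keys here and return SparkJobInvalid
  override def validate(sc: SparkContext, config: Config): SparkJobValidation =
    SparkJobValid

  override def runJob(sc: SparkContext, config: Config): Any = {
    val inputPath = config.getString("input.path")   // placeholder key
    val modelPath = config.getString("model.path")   // placeholder key
    val data = MLUtils.loadLibSVMFile(sc, inputPath) // RDD[LabeledPoint]
    val model = new LogisticRegressionWithLBFGS().setNumClasses(2).run(data)
    model.save(sc, modelPath)                        // persist for the REST layer
    modelPath
  }
}

The per-tenant executor constraints would then live on the context itself;
if I read the spark-jobserver docs right, you can pass settings like
num-cpu-cores and memory-per-node when creating a context through its REST
API.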

2) The second part seems a bit trickier, as it has to expose the trained
models to the outside world via some form of API. So far I've been able to
create a new context inside a simple Spring REST application, load the
persisted model, call predict on it and return the results.
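
For reference, the serving side boils down to something like this (the
model path is a placeholder and the Spring wiring is left out; as far as I
can tell, a single-vector predict on an mllib linear model is plain local
math over the weights, so no Spark job is submitted per request):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.classification.LogisticRegressionModel
import org.apache.spark.mllib.linalg.Vectors

// one small local context inside the REST app's JVM, used to load the model
val conf = new SparkConf().setAppName("model-serving").setMaster("local[*]")
val sc = new SparkContext(conf)

// placeholder path: wherever the training job persisted the model
val model = LogisticRegressionModel.load(sc, "/models/tenant-a/model-1")

// a single-vector predict runs on the driver thread; the cluster is not involved
val score = model.predict(Vectors.dense(0.5, 1.2, -0.3))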

My main problem with this approach is that I now need to load a whole Spark
context for every single model instance, and a single tenant can potentially
have a bunch of them, which also means at least one JVM per tenant; this is
quite wasteful. The actual prediction step seems fairly simple, and I was
wondering whether there is a way to serve multiple models from the same
context. Would that allow parallel predictions (i.e. model B doesn't have to
wait for model A's prediction to complete before it can return)?
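
What I'm picturing instead is a single JVM holding many loaded models, along
the lines of the sketch below. ModelRegistry and its API are entirely
hypothetical; the idea leans on single-vector predict being local, so calls
for different models shouldn't have to queue behind one another:

import java.util.concurrent.ConcurrentHashMap

import scala.concurrent.{ExecutionContext, Future}

import org.apache.spark.mllib.classification.LogisticRegressionModel
import org.apache.spark.mllib.linalg.Vector

// hypothetical registry: model id -> loaded model, shared across tenants
object ModelRegistry {
  private val models = new ConcurrentHashMap[String, LogisticRegressionModel]()

  def register(id: String, model: LogisticRegressionModel): Unit =
    models.put(id, model)

  // each prediction is local math over the model's weights, so requests for
  // model A and model B can run concurrently on the execution context's threads
  def predict(id: String, features: Vector)
             (implicit ec: ExecutionContext): Future[Double] =
    Future {
      Option(models.get(id)) match {
        case Some(m) => m.predict(features)
        case None    => sys.error(s"unknown model: $id")
      }
    }
}

The registry would sit behind the REST layer with one entry per persisted
model, instead of one context (and JVM) per model.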

Given this simple scenario, do you see a better way to architect it? Maybe
I'm missing certain features of Spark that would make this cleaner and more
efficient.

Thanks!


