I'm fairly new to Spark and MLlib, but I'm doing some research into multi-tenancy for an MLlib-based application. The idea is to provide the ability to train models on demand with certain constraints (executor size) and then serve predictions from those models via a REST layer.
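For the on-demand training side, the per-job executor constraints can be set at submission time. A minimal sketch with spark-submit against a standalone master (the class name, jar path, master URL, and sizes below are hypothetical placeholders, not from this thread; on YARN you would use --num-executors instead of --total-executor-cores):

```shell
# Submit one tenant's training job with its own executor budget.
# com.example.TenantTrainer and tenant-trainer.jar are placeholder names.
spark-submit \
  --class com.example.TenantTrainer \
  --master spark://master:7077 \
  --executor-memory 4g \
  --executor-cores 2 \
  --total-executor-cores 8 \
  tenant-trainer.jar --tenant-id 42
```

Spark Job Server exposes the same kind of per-context sizing through its REST API, which fits the context-per-tenant idea.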
So far my research has led me to the following:

1) It's fairly easy to schedule training jobs and define the executor size for each job with something like Spark Job Server or via the command line. I'd imagine you need separate contexts here anyway, because a single big context shared among different tenants would mostly prevent training different models in parallel. So the solution here seems to be a context per tenant, with training submitted via Spark Job Server.

2) The second part seems trickier, as it must expose the results of the trained models to the outside world via some form of API. So far I've been able to create a new context inside a simple Spring REST application, load the persisted model, call predict, and return results. My main problem with this approach is that I now need to load a whole Spark context for each single model instance, and a single tenant can potentially have many, which also means at least one JVM per tenant; this is quite wasteful. The actual prediction step seems fairly simple, and I was wondering whether there is a way to serve predictions from multiple models on the same context. Would that allow parallel predictions (i.e., model B doesn't have to wait for a prediction from model A to complete before returning)?

Given this simple scenario, do you see a better way to architect this? Maybe I'm missing certain Spark features that would facilitate it in a cleaner and more efficient manner. Thanks!

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Multi-tenancy-REST-and-MLlib-tp25979.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
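On point 2, one pattern worth sketching: for many mllib models (e.g. a loaded LogisticRegressionModel), predict on a single feature vector is a local computation that doesn't launch a Spark job at all, so many models can live in one JVM behind a thread-safe registry and serve concurrent REST calls without queuing behind each other. Below is a minimal sketch of that idea under stated assumptions: Servable, ModelRegistry, and LinearModel are hypothetical names standing in for whatever wrapper sits around the loaded models, not Spark APIs.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical stand-in for a loaded model whose single-row predict is
// purely local (e.g. a wrapper around mllib's model.predict(Vector)).
trait Servable {
  def predict(features: Array[Double]): Double
}

// One registry per JVM: models from many tenants share a single process,
// keyed by a tenant/model id. ConcurrentHashMap lets REST worker threads
// look up and invoke different models in parallel.
object ModelRegistry {
  private val models = new ConcurrentHashMap[String, Servable]()

  def register(modelId: String, model: Servable): Unit =
    models.put(modelId, model)

  def predict(modelId: String, features: Array[Double]): Option[Double] =
    Option(models.get(modelId)).map(_.predict(features))
}

// Toy model: a linear scorer; the dot product is local, so concurrent
// calls on different models never block on a shared Spark scheduler.
final class LinearModel(weights: Array[Double]) extends Servable {
  def predict(features: Array[Double]): Double =
    weights.zip(features).map { case (w, x) => w * x }.sum
}
```

A Spring (or any other) REST handler would then call ModelRegistry.predict(modelId, features) per request. Only work that genuinely needs the cluster (training, batch scoring of RDDs) would go through a shared context's scheduler, where setting spark.scheduler.mode to FAIR helps keep one tenant's jobs from starving another's.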