Re: Velox Model Server

2015-07-13 Thread Nick Pentreath
Honestly, I don't believe this kind of functionality belongs within spark-jobserver. For serving factor-type models, you are typically in the realm of recommendation or ad-serving scenarios - i.e. needing to score a user / context against many possible items and return a top-k list of those.

Re: Velox Model Server

2015-06-24 Thread Debasish Das
Model sizes are in the 10m x rank, 100k x rank range. For recommendation/topic modeling I can run batch recommendAll and then keep serving the model from a distributed cache, but then I can't incorporate per-user re-prediction when user feedback makes the current top-k stale. I have to wait for the next

Re: Velox Model Server

2015-06-24 Thread Nick Pentreath
Ok. My view is that with only 100k items, you are better off serving item vectors in-memory: store all item vectors in memory, and compute the user * item score on demand. In most applications only a small proportion of users are active, so you really don't need all 10m user vectors in memory.
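The scheme Nick describes can be sketched in a few lines of Scala. This is a minimal, illustrative sketch, not code from any of the projects discussed: item factor vectors sit in an in-memory map, and each request computes a dot product per item and keeps the top k. All names (`TopKScorer`, `topK`) are assumptions.

```scala
// Minimal sketch of in-memory top-k scoring for a factorization model:
// hold all item factor vectors in memory and score user * item on demand.
object TopKScorer {
  type Vec = Array[Double]

  // Plain dot product over two factor vectors of equal rank.
  def dot(a: Vec, b: Vec): Double = {
    var s = 0.0
    var i = 0
    while (i < a.length) { s += a(i) * b(i); i += 1 }
    s
  }

  // Score one user vector against every item vector and return the
  // k highest-scoring (itemId, score) pairs.
  def topK(user: Vec, items: Map[Int, Vec], k: Int): Seq[(Int, Double)] =
    items.toSeq
      .map { case (id, v) => (id, dot(user, v)) }
      .sortBy(-_._2)
      .take(k)
}
```

With 100k items this is a brute-force linear scan per request; at that scale it is typically fast enough, which is the point Nick is making.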

Re: Velox Model Server

2015-06-24 Thread Sean Owen
On Wed, Jun 24, 2015 at 12:02 PM, Nick Pentreath nick.pentre...@gmail.com wrote: Oryx does almost the same, but Oryx1 kept all user and item vectors in memory (though I am not sure whether Oryx2 still stores all user and item vectors in memory or partitions them in some way). (Yes, this is a

Re: Velox Model Server

2015-06-24 Thread Debasish Das
Thanks Nick, Sean for the great suggestions. Since you have already hit these issues before, I think it will be great if we can add the learnings to Spark Job Server and enhance it for the community. Nick, do you see any major issues in using Spray over Scalatra? Looks like the Model Server API

Re: Velox Model Server

2015-06-23 Thread Sean Owen
Yes, and typical needs are 100ms. Now imagine even 10 concurrent requests. My experience has been that this approach won't nearly scale. The best you could probably do is async mini-batch near-real-time scoring, pushing results to some store for retrieval, which could be entirely suitable for
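The async mini-batch approach Sean sketches can be illustrated as follows. This is a hypothetical sketch, assuming a result store that the serving layer reads from (a `ConcurrentHashMap` stands in here for a distributed cache); `MiniBatchScorer`, `rescore`, and `serve` are illustrative names, not any project's API. A background job periodically rescores active users and pushes their top-k lists, so the request path is a plain lookup rather than a model computation.

```scala
import java.util.concurrent.ConcurrentHashMap

// Sketch of async mini-batch near-real-time scoring: rescore users in
// batches offline, push results to a store, and serve via cheap lookups.
object MiniBatchScorer {
  // Stand-in for an external result store (e.g. a distributed cache).
  val resultStore = new ConcurrentHashMap[Int, Seq[Int]]()

  def dot(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (x, y) => x * y }.sum

  // Batch step: recompute top-k item ids for each (active) user
  // and push the results to the store for later retrieval.
  def rescore(users: Map[Int, Array[Double]],
              items: Map[Int, Array[Double]], k: Int): Unit =
    for ((uid, uvec) <- users) {
      val topK = items.toSeq
        .map { case (iid, ivec) => (iid, dot(uvec, ivec)) }
        .sortBy(-_._2).take(k).map(_._1)
      resultStore.put(uid, topK)
    }

  // Serving path: a constant-time lookup, no model math per request.
  def serve(uid: Int): Seq[Int] = resultStore.getOrDefault(uid, Seq.empty)
}
```

The trade-off versus synchronous scoring is staleness: results are only as fresh as the last mini-batch, which is exactly the tension Debasish raises about user feedback making the current top-k stale.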

Re: Velox Model Server

2015-06-22 Thread Debasish Das
Models that I am looking at are mostly factorization-based models (which include both recommendation and topic-modeling use cases). For recommendation models, I need a combination of Spark SQL and the ml model prediction API... I think spark job server is what I am looking for, and it has fast http

Re: Velox Model Server

2015-06-22 Thread Nick Pentreath
How large are your models? Spark job server does allow synchronous job execution, and with a warm, long-lived context it will be quite fast - but still on the order of a second or a few seconds usually (depending on model size - for very large models, possibly quite a lot more than that).

Re: Velox Model Server

2015-06-21 Thread Sean Owen
Out of curiosity, why netty? What model are you serving? Velox doesn't look like it is optimized for cases like ALS recs, if that's what you mean. I think scoring ALS at scale in real time takes a fairly different approach. The servlet engine probably doesn't matter at all in comparison.

Re: Velox Model Server

2015-06-21 Thread Stephen Boesch
Oryx 2 has a scala client https://github.com/OryxProject/oryx/blob/master/framework/oryx-api/src/main/scala/com/cloudera/oryx/api/

Re: Velox Model Server

2015-06-21 Thread Nick Pentreath
Is there a presentation up about this end-to-end example? I'm looking into velox now - our internal model pipeline just saves factors to S3 and model server loads them periodically from S3.

Re: Velox Model Server

2015-06-20 Thread Charles Earl
Is velox NOT open source?

Re: Velox Model Server

2015-06-20 Thread Sandy Ryza
Hi Debasish, The Oryx project (https://github.com/cloudera/oryx), which is Apache 2 licensed, contains a model server that can serve models built with MLlib. -Sandy

Velox Model Server

2015-06-20 Thread Debasish Das
Hi, The demo of the end-to-end ML pipeline including the model server component at Spark Summit was really cool. I was wondering whether the Model Server component is based upon Velox or uses a completely different architecture. https://github.com/amplab/velox-modelserver We are looking for an open

Re: Velox Model Server

2015-06-20 Thread Donald Szeto
Mind if I ask what 1.3/1.4 ML features you are looking for?

Re: Velox Model Server

2015-06-20 Thread Debasish Das
Integration of model server with ML pipeline API.

Re: Velox Model Server

2015-06-20 Thread Sandy Ryza
Oops, that link was for Oryx 1. Here's the repo for Oryx 2: https://github.com/OryxProject/oryx

Re: Velox Model Server

2015-06-20 Thread Debasish Das
After getting used to Scala, writing Java is too much work :-) I am looking for a Scala-based project that uses netty at its core (spray is one example). prediction.io is an option, but that also looks quite complicated and doesn't use all the ML features that got added in 1.3/1.4. Velox built on