Ok

My view is that with "only" 100k items, you are better off serving the
item vectors from memory: store all item vectors in memory and compute
the user * item scores on demand. In most applications only a small
proportion of users are active, so you don't really need all 10m user
vectors in memory. They could be looked up from a K-V store, with an
in-memory LRU cache for, say, 1m of them. Optionally, also update them as
feedback comes in.

As far as I can see, this is pretty much what Velox does, except that it
partitions all user vectors across nodes to scale.

Oryx does almost the same, but Oryx 1 kept all user and item vectors in
memory (though I am not sure whether Oryx 2 still stores all user and
item vectors in memory or partitions them in some way).

Deb, we are using a custom Akka-based model server (with a Scalatra
frontend). It is more focused on many small models in memory (the largest
of these is around 5m user vectors and 100k item vectors, with factor
size 20-50). We use Akka cluster sharding to allow scale-out across nodes
if required. We have a few hundred models comfortably powered by
m3.xlarge AWS instances. Using floats you could probably fit all of your
factors in memory on one 64GB machine (depending on how many models you
have).
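As a rough back-of-the-envelope (assuming rank 50 and 4-byte floats): 10m
user vectors x 50 x 4 bytes is about 2GB, and 100k item vectors x 50 x 4
bytes is about 20MB - so even several models of that size leave plenty of
headroom on a 64GB box, as long as the factors are stored as primitive
float arrays rather than boxed objects.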

Our solution is not that generic and a little hacked-together - but I'd
be happy to chat offline about sharing what we've done. I think it still
has a basic client to the Spark JobServer for triggering re-computation
jobs. We currently just run batch re-computation and reload the factors
from S3 periodically.
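For what it's worth, the periodic reload is nothing fancy - roughly the
idea below (the bucket path, loader and hourly schedule are illustrative,
not our actual setup):

import java.util.concurrent.atomic.AtomicReference
import scala.concurrent.duration._
import akka.actor.ActorSystem

object FactorReloader {
  type Vec = Array[Float]

  val system = ActorSystem("model-server")
  import system.dispatcher  // execution context for the scheduler

  // Readers always see a consistent snapshot of the item factors.
  val itemVectors = new AtomicReference[Map[String, Vec]](Map.empty)

  // Hypothetical loader for the factors written by the batch job.
  def loadFromS3(path: String): Map[String, Vec] = Map.empty

  // Swap in fresh factors every hour, matching the batch refresh cadence.
  system.scheduler.schedule(0.seconds, 1.hour) {
    itemVectors.set(loadFromS3("s3://some-bucket/factors/items/latest/"))
  }
}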

We then use Elasticsearch to post-filter results and blend in
content-based stuff - which I think might be more efficient than Spark
SQL for this particular purpose.

On Wed, Jun 24, 2015 at 8:59 AM, Debasish Das <debasish.da...@gmail.com>
wrote:

> Model sizes are in the 10m x rank, 100k x rank range.
>
> For recommendation/topic modeling I can run batch recommendAll and then
> keep serving the model using a distributed cache, but then I can't
> incorporate per-user model re-prediction if user feedback is making the
> current top-k stale. I have to wait for the next batch refresh, which
> might be 1 hr away.
>
> Spark JobServer + Spark SQL can get me fresh updates, but running a
> predict each time might be slow.
>
> I am guessing the better idea might be to start with batch recommendAll
> and then update the per-user model if it gets stale, but that needs
> access to the key-value store and the model over an API like Spark
> JobServer. I am running experiments with the job server. In general it
> would be nice if my key-value store and model were both managed by the
> same Akka-based API.
>
> Yes, Spark SQL is to filter/boost recommendation results using business
> logic like user demographics, for example.
> On Jun 23, 2015 2:07 AM, "Sean Owen" <so...@cloudera.com> wrote:
>
>> Yes, and typically needs are <100ms. Now imagine even 10 concurrent
>> requests. My experience has been that this approach won't nearly
>> scale. The best you could probably do is async mini-batch
>> near-real-time scoring, pushing results to some store for retrieval,
>> which could be entirely suitable for your use case.
>>
>> On Tue, Jun 23, 2015 at 8:52 AM, Nick Pentreath
>> <nick.pentre...@gmail.com> wrote:
>> > If your recommendation needs are real-time (<1s) I am not sure job
>> > server and computing the recs with Spark will do the trick (though
>> > those new BLAS-based methods may have given sufficient speed-up).
>>
>
