Re: ALS update without re-computing everything

2016-03-11 Thread Nick Pentreath
There is a general movement toward allowing initial models to be specified for Spark ML algorithms, so I'll add a JIRA to that task set. I should be able to work on this as well as other ALS improvements. Oh, another reason fold-in is typically not done in Spark is that for models of any reasonable si

Re: ALS update without re-computing everything

2016-03-11 Thread Sean Owen
On Fri, Mar 11, 2016 at 12:18 PM, Nick Pentreath wrote:
> In general, for serving situations MF models are stored in some other serving system, so that system may be better suited to do the actual fold-in. Sean's Oryx project does that, though I'm not sure offhand if that part is done in Spa

Re: ALS update without re-computing everything

2016-03-11 Thread Nick Pentreath
Currently this is not supported. If you want to do incremental fold-in of new data you would need to do it outside of Spark (e.g. see this discussion: https://mail-archives.apache.org/mod_mbox/spark-user/201603.mbox/browser, which also mentions a streaming on-line MF implementation with SGD). In g
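
For readers unfamiliar with the term, here is a rough sketch of what fold-in outside of Spark could look like for a single new or changed user in the implicit-feedback setting. It is not Spark's or Oryx's code: it assumes the item factors have already been exported from a trained model into a plain dict, and it applies the standard implicit-feedback least-squares update with confidence c = 1 + alpha * r while holding the item factors fixed. The names `fold_in_user`, `solve`, `reg` and `alpha` are made up for illustration, and no attempt is made to match how Spark scales regularization internally.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fold_in_user(item_factors, interactions, reg=0.1, alpha=40.0):
    """Estimate one user's factor vector with the item factors held fixed.

    item_factors: dict item_id -> list of floats (hypothetical export of a trained model)
    interactions: dict item_id -> raw implicit count r for this user
    Solves (Y^T Y + Y^T (C - I) Y + reg * I) x = Y^T C p, with p = 1 for observed items.
    """
    k = len(next(iter(item_factors.values())))
    # Start from Y^T Y + reg * I. In a serving system Y^T Y would be cached,
    # since it is the same for every user.
    a = [[0.0] * k for _ in range(k)]
    for y in item_factors.values():
        for i in range(k):
            for j in range(k):
                a[i][j] += y[i] * y[j]
    for i in range(k):
        a[i][i] += reg
    b = [0.0] * k
    # Only the items this user interacted with contribute the (C - I) and p terms.
    for item, r in interactions.items():
        if r <= 0:
            continue
        y = item_factors[item]
        c = 1.0 + alpha * r
        for i in range(k):
            b[i] += c * y[i]
            for j in range(k):
                a[i][j] += (c - 1.0) * y[i] * y[j]
    return solve(a, b)


# Toy usage: two items with rank-2 factors, one observed interaction.
items = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
user_vec = fold_in_user(items, {"a": 1.0})
scores = {i: sum(u * v for u, v in zip(user_vec, y)) for i, y in items.items()}
```

The point of the sketch is that updating one user only needs a k-by-k solve against the fixed item factors, which is why a serving system holding the factors in memory is a natural place to do it. The item factors themselves stay stale until the next full retrain.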

ALS update without re-computing everything

2016-03-11 Thread Roberto Pagliari
In the current implementation of ALS with implicit feedback, when new data comes in, it is not possible to update the user/product matrices without re-computing everything. Is this feature planned, or is there a known workaround? Thank you,