Re: Is it possible to do incremental training using ALSModel (MLlib)?

Wouter Samaey Wed, 07 Jan 2015 02:13:09 -0800

You’re right, Nick! This function does exactly that.

Sean has already helped me greatly.


Thanks for your reply.

--------
Wouter Samaey
Zaakvoerder Storefront BVBA

Tel: +32 472 72 83 07
Web: http://storefront.be

LinkedIn: http://www.linkedin.com/in/woutersamaey

> On 07 Jan 2015, at 11:08, Nick Pentreath <nick.pentre...@gmail.com> wrote:
> 
> As I recall, Oryx (the old version, and I assume the new one too) provides 
> something like this:
> http://cloudera.github.io/oryx/apidocs/com/cloudera/oryx/als/common/OryxRecommender.html#recommendToAnonymous-java.lang.String:A-float:A-int-
> 
> though Sean will be more on top of that than me :)
> 
> On Mon, Jan 5, 2015 at 2:17 PM, Wouter Samaey <wouter.sam...@storefront.be> wrote:
> One other idea was that I don’t need to re-train the model, but simply pass 
> all the current user’s recent ratings (including ones created after the 
> training) to the existing model…
> 
> Is this a valid option?
> 
> 
> --------
> Wouter Samaey
> Zaakvoerder Storefront BVBA
> 
> Tel: +32 472 72 83 07
> Web: http://storefront.be
> 
> LinkedIn: http://www.linkedin.com/in/woutersamaey
> 
> > On 05 Jan 2015, at 13:13, Sean Owen <so...@cloudera.com> wrote:
> >
> > In the first instance, I'm suggesting that ALS in Spark could perhaps
> > expose a run() method that accepts a previous
> > MatrixFactorizationModel, and uses the product factors from it as the
> > initial state instead. If anybody seconds that idea, I'll make a PR.
> >
> > The second idea is just fold-in:
> > http://www.slideshare.net/srowen/big-practical-recommendations-with-alternating-least-squares/14
> >
> > Whether you do this or something like SGD, inside or outside Spark,
> > depends on your requirements I think.
> >
> > On Sat, Jan 3, 2015 at 12:04 PM, Wouter Samaey
> > <wouter.sam...@storefront.be> wrote:
> >> Do you know a place where I could find a sample or tutorial for this?
> >>
> >> I'm still very new at this. And struggling a bit...
> >>
> >> Thanks in advance
> >>
> >> Wouter
> >>
> >> Sent from my iPhone.
> >>
> >> On 03 Jan 2015, at 10:36, Sean Owen <so...@cloudera.com> wrote:
> >>
> >> Yes, it is easy to simply start a new factorization from the current model
> >> solution. It works well. That's more like incremental *batch* rebuilding of
> >> the model. That is not in MLlib but fairly trivial to add.
> >>
> >> You can certainly 'fold in' new data to approximately update with one new
> >> datum too, which you can find online. This is not quite the same idea as
> >> streaming SGD. I'm not sure this fits the RDD model well since it entails
> >> updating one element at a time but mini batch could be reasonable.
> >>
> >> On Jan 3, 2015 5:29 AM, "Peng Cheng" <rhw...@gmail.com> wrote:
> >>>
> >>> I was under the impression that ALS wasn't designed for it :-< The famous
> >>> eBay online recommender uses SGD.
> >>> However, you can try using the previous model as a starting point, and
> >>> gradually reduce the number of iterations after the model stabilizes. I have
> >>> never verified this idea, so you need to at least cross-validate it before
> >>> putting it into production.
> >>>
> >>> On 2 January 2015 at 04:40, Wouter Samaey <wouter.sam...@storefront.be>
> >>> wrote:
> >>>>
> >>>> Hi all,
> >>>>
> >>>> I'm curious about MLlib and whether it is possible to do incremental
> >>>> training on the ALSModel.
> >>>>
> >>>> Usually training is run first, and then you can query. But in my case,
> >>>> data is collected in real time and I want the predictions of my ALSModel
> >>>> to consider the latest data without a complete re-training phase.
> >>>>
> >>>> I've checked out these resources, but could not find any info on how to
> >>>> solve this:
> >>>> https://spark.apache.org/docs/latest/mllib-collaborative-filtering.html
> >>>>
> >>>> http://ampcamp.berkeley.edu/big-data-mini-course/movie-recommendation-with-mllib.html
> >>>>
> >>>> My question fits in a larger picture where I'm using Prediction IO, and
> >>>> this in turn is based on Spark.
> >>>>
> >>>> Thanks in advance for any advice!
> >>>>
> >>>> Wouter
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> View this message in context:
> >>>> http://apache-spark-user-list.1001560.n3.nabble.com/Is-it-possible-to-do-incremental-training-using-ALSModel-MLlib-tp20942.html
> >>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
> >>>>
> >>>> ---------------------------------------------------------------------
> >>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> >>>> For additional commands, e-mail: user-h...@spark.apache.org
> >>>>
> >>>
> >>
> 
> 
> 
>
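For readers landing on this thread: the "fold-in" idea Sean mentions, and which
recommendToAnonymous appears to offer in Oryx, can be sketched roughly as
follows. This is only an illustration under stated assumptions, not MLlib or
Oryx code. It assumes explicit ratings, that the trained item factors have
been collected into a plain dict (in MLlib they would come from the model's
productFeatures), and plain ridge regularization, which may not match how the
model was trained. The names fold_in_user, predict and solve_linear_system are
hypothetical helpers made up for this sketch.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's matrix is left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fold_in_user(item_factors, ratings, reg=0.1):
    """Estimate one user's factor vector while keeping item factors fixed.

    item_factors: dict item_id -> list of k floats (from the trained model)
    ratings:      dict item_id -> rating, may include post-training ratings
    reg:          ridge regularization strength (an assumption of this sketch)
    """
    rated = [item for item in ratings if item in item_factors]
    if not rated:
        raise ValueError("none of the rated items is known to the model")
    k = len(item_factors[rated[0]])
    # Normal equations of the least-squares problem:
    #   (Y^T Y + reg * I) x = Y^T r
    # where Y holds the factors of the rated items and r their ratings.
    gram = [[reg if i == j else 0.0 for j in range(k)] for i in range(k)]
    rhs = [0.0] * k
    for item in rated:
        y = item_factors[item]
        r = ratings[item]
        for i in range(k):
            rhs[i] += y[i] * r
            for j in range(k):
                gram[i][j] += y[i] * y[j]
    return solve_linear_system(gram, rhs)


def predict(user_vector, item_vector):
    """Predicted rating is the dot product of the two factor vectors."""
    return sum(u * v for u, v in zip(user_vector, item_vector))
```

To use it, call fold_in_user with the user's latest ratings and score the
candidate items with predict. The model itself is never changed, so this only
approximates what a retrain would give, and ratings for items the model has
never seen are ignored.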
