You’re right, Nick! This function does exactly that. Sean has already helped me greatly.
Thanks for your reply.

--------
Wouter Samaey
Zaakvoerder Storefront BVBA

Tel: +32 472 72 83 07
Web: http://storefront.be

LinkedIn: http://www.linkedin.com/in/woutersamaey

> On 07 Jan 2015, at 11:08, Nick Pentreath <nick.pentre...@gmail.com> wrote:
>
> As I recall Oryx (the old version, and I assume the new one too) provides
> something like this:
> http://cloudera.github.io/oryx/apidocs/com/cloudera/oryx/als/common/OryxRecommender.html#recommendToAnonymous-java.lang.String:A-float:A-int-
>
> though Sean will be more on top of that than me :)
>
> On Mon, Jan 5, 2015 at 2:17 PM, Wouter Samaey <wouter.sam...@storefront.be> wrote:
> One other idea was that I don’t need to re-train the model, but simply pass
> all the current user’s recent ratings (including ones created after the
> training) to the existing model…
>
> Is this a valid option?
>
> > On 05 Jan 2015, at 13:13, Sean Owen <so...@cloudera.com> wrote:
> >
> > In the first instance, I'm suggesting that ALS in Spark could perhaps
> > expose a run() method that accepts a previous
> > MatrixFactorizationModel, and uses the product factors from it as the
> > initial state instead. If anybody seconds that idea, I'll make a PR.
> >
> > The second idea is just fold-in:
> > http://www.slideshare.net/srowen/big-practical-recommendations-with-alternating-least-squares/14
> >
> > Whether you do this or something like SGD, inside or outside Spark,
> > depends on your requirements I think.
> >
> > On Sat, Jan 3, 2015 at 12:04 PM, Wouter Samaey
> > <wouter.sam...@storefront.be> wrote:
> >> Do you know a place where I could find a sample or tutorial for this?
> >>
> >> I'm still very new at this. And struggling a bit...
> >>
> >> Thanks in advance
> >>
> >> Wouter
> >>
> >> Sent from my iPhone.
> >>
> >> On 03 Jan 2015, at 10:36, Sean Owen <so...@cloudera.com> wrote:
> >>
> >> Yes, it is easy to simply start a new factorization from the current model
> >> solution. It works well. That's more like incremental *batch* rebuilding of
> >> the model. That is not in MLlib but fairly trivial to add.
> >>
> >> You can certainly 'fold in' new data to approximately update with one new
> >> datum too, which you can find online. This is not quite the same idea as
> >> streaming SGD. I'm not sure this fits the RDD model well since it entails
> >> updating one element at a time but mini batch could be reasonable.
> >>
> >> On Jan 3, 2015 5:29 AM, "Peng Cheng" <rhw...@gmail.com> wrote:
> >>>
> >>> I was under the impression that ALS wasn't designed for it :-< The famous
> >>> eBay online recommender uses SGD.
> >>> However, you can try using the previous model as a starting point, and
> >>> gradually reduce the number of iterations after the model stabilizes.
> >>> I never verified this idea, so you need to at least cross-validate it
> >>> before putting it into production.
> >>>
> >>> On 2 January 2015 at 04:40, Wouter Samaey <wouter.sam...@storefront.be>
> >>> wrote:
> >>>>
> >>>> Hi all,
> >>>>
> >>>> I'm curious about MLlib and whether it is possible to do incremental
> >>>> training on the ALSModel.
> >>>>
> >>>> Usually training is run first, and then you can query. But in my case,
> >>>> data is collected in real time and I want the predictions of my ALSModel
> >>>> to consider the latest data without a complete re-training phase.
> >>>>
> >>>> I've checked out these resources, but could not find any info on how to
> >>>> solve this:
> >>>> https://spark.apache.org/docs/latest/mllib-collaborative-filtering.html
> >>>> http://ampcamp.berkeley.edu/big-data-mini-course/movie-recommendation-with-mllib.html
> >>>>
> >>>> My question fits in a larger picture where I'm using PredictionIO, which
> >>>> in turn is based on Spark.
> >>>>
> >>>> Thanks in advance for any advice!
> >>>>
> >>>> Wouter
> >>>>
> >>>> --
> >>>> View this message in context:
> >>>> http://apache-spark-user-list.1001560.n3.nabble.com/Is-it-possible-to-do-incremental-training-using-ALSModel-MLlib-tp20942.html
> >>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
> >>>>
> >>>> ---------------------------------------------------------------------
> >>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> >>>> For additional commands, e-mail: user-h...@spark.apache.org
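
For anyone finding this thread later: the "fold-in" idea discussed above (passing a user's recent ratings to the existing model without re-training) can be sketched as a single regularized least-squares solve against the already-trained item factors. The code below is only a minimal illustration in plain Python under stated assumptions. It is not the Oryx or MLlib implementation, it assumes explicit ratings, and the names fold_in_user, solve_linear, predict and the reg parameter are made up for this example.

```python
from typing import Dict, List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fold_in_user(item_factors: Dict[int, Sequence[float]],
                 ratings: Dict[int, float],
                 reg: float = 0.1) -> List[float]:
    """Estimate one user's factor vector from fixed item factors.

    Solves (Y^T Y + reg * I) x = Y^T r, where Y holds the factors of the
    items this user rated and r holds the ratings. This is one ALS
    half-step for a single user; the item factors are not updated.
    Assumes reg > 0 so the system is always solvable.
    """
    rank = len(next(iter(item_factors.values())))
    a = [[reg if i == j else 0.0 for j in range(rank)] for i in range(rank)]
    b = [0.0] * rank
    for item, rating in ratings.items():
        y = item_factors[item]
        for i in range(rank):
            b[i] += rating * y[i]
            for j in range(rank):
                a[i][j] += y[i] * y[j]
    return solve_linear(a, b)


def predict(user_vec: Sequence[float], item_vec: Sequence[float]) -> float:
    """Predicted rating is the dot product of user and item factors."""
    return sum(u * v for u, v in zip(user_vec, item_vec))
```

To use it, take the item (product) factors from the trained model, call fold_in_user with the user's latest ratings, and score unseen items with predict. Because the item factors stay fixed, this only approximates what a full re-training would give, so a periodic batch rebuild is still needed.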