This is being discussed in
https://issues.apache.org/jira/browse/SPARK-6407. Let's move the
discussion there. Thanks for providing references! -Xiangrui

On Sun, Apr 5, 2015 at 11:48 PM, Chunnan Yao <yaochun...@gmail.com> wrote:
> Online collaborative filtering (CF) has been widely used and studied.
> Re-training a CF model from scratch every time new data arrives is very
> inefficient
> (http://stackoverflow.com/questions/27734329/apache-spark-incremental-training-of-als-model).
> However, we see little discussion in the Spark community of collaborative
> filtering on streaming data. Given streaming k-means, streaming logistic
> regression, and the ongoing work on incremental model training for the Naive
> Bayes classifier (SPARK-4144), we think it is worthwhile to consider streaming
> collaborative filtering support in MLlib.
>
> I've created a JIRA issue (SPARK-6711) for discussion. We
> suggest referring to this paper
> (https://www.cs.utexas.edu/~cjohnson/ParallelCollabFilt.pdf). It is based on
> SGD rather than ALS, and SGD is easier to adapt to streaming data.
>
> Fortunately, the authors of this paper have implemented their algorithm as a
> GitHub project built on Storm:
> https://github.com/MrChrisJohnson/CollabStream
>
> Please don't hesitate to give your opinions on this issue and our planned
> approach. We'd like to work on this in the next few weeks.
>
>
>
> -----
> Feel the sparking Spark!
> --
> View this message in context: 
> http://apache-spark-developers-list.1001551.n3.nabble.com/Support-parallelized-online-matrix-factorization-for-Collaborative-Filtering-tp11413.html
> Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org
>
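
To illustrate the point in the quoted message that SGD adapts to streaming data more easily than ALS, here is a minimal sketch of an online matrix factorization update in plain Python. It is not the algorithm from the referenced paper and not an MLlib API; the names OnlineMF and sgd_update and the parameters rank, lr, and reg are hypothetical and chosen only for illustration. The idea it shows is that each incoming rating touches only one user vector and one item vector, so no full re-training pass is needed.

```python
import random


def sgd_update(user_vec, item_vec, rating, lr=0.05, reg=0.02):
    """Apply one in-place SGD step for a single (user, item, rating) event.

    Returns the prediction error measured before the update.
    """
    pred = sum(p * q for p, q in zip(user_vec, item_vec))
    err = rating - pred
    for k in range(len(user_vec)):
        p, q = user_vec[k], item_vec[k]
        # Gradient step on squared error with L2 regularization.
        user_vec[k] = p + lr * (err * q - reg * p)
        item_vec[k] = q + lr * (err * p - reg * q)
    return err


class OnlineMF:
    """Hypothetical online matrix factorization model (illustration only)."""

    def __init__(self, rank=8, lr=0.05, reg=0.02, seed=0):
        self.rank, self.lr, self.reg = rank, lr, reg
        self.rng = random.Random(seed)
        self.users = {}  # user id -> latent factor vector
        self.items = {}  # item id -> latent factor vector

    def _vector_for(self, table, key):
        # Unseen users and items get small random factors on first sight.
        if key not in table:
            table[key] = [self.rng.uniform(-0.1, 0.1) for _ in range(self.rank)]
        return table[key]

    def update(self, user, item, rating):
        """Consume one rating from the stream; only two vectors change."""
        u = self._vector_for(self.users, user)
        v = self._vector_for(self.items, item)
        return sgd_update(u, v, rating, self.lr, self.reg)

    def predict(self, user, item):
        u = self._vector_for(self.users, user)
        v = self._vector_for(self.items, item)
        return sum(p * q for p, q in zip(u, v))
```

Each call to update costs O(rank) work regardless of how much data has been seen so far, whereas ALS alternates least-squares solves over all ratings of every user and item. A distributed version would additionally have to decide how factor vectors are partitioned and synchronized across workers, which this sketch does not attempt.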
