Ted,

How about a *MatrixSuperView implements Matrix*? (A MatrixView-like implementation.)
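
To make the idea concrete, here is a rough, self-contained sketch of what I have in mind. It is only an illustration: ReadOnlyMatrix and DenseBlock are hypothetical stand-ins I made up so the example compiles on its own, they are not Mahout's Matrix API, and a real version would have to implement the full Matrix interface. The view adjoins the blocks column-wise and translates a global column index into (block, local column) on each lookup, so nothing is copied.

```java
// Hypothetical minimal read-only interface, used only for this sketch.
// It is NOT Mahout's Matrix interface, which has many more methods.
interface ReadOnlyMatrix {
    int numRows();
    int numCols();
    double get(int row, int col);
}

// Dense, array-backed block; exists only to make the sketch runnable.
final class DenseBlock implements ReadOnlyMatrix {
    private final double[][] values;

    DenseBlock(double[][] values) {
        this.values = values;
    }

    public int numRows() { return values.length; }
    public int numCols() { return values.length == 0 ? 0 : values[0].length; }
    public double get(int row, int col) { return values[row][col]; }
}

// Presents several blocks with the same row count as one wide matrix.
// The concatenation happens on the fly: no block data is copied.
final class MatrixSuperView implements ReadOnlyMatrix {
    private final ReadOnlyMatrix[] blocks;
    // colOffsets[i] is the first global column of block i;
    // the last entry is the total number of columns.
    private final int[] colOffsets;
    private final int rows;

    MatrixSuperView(ReadOnlyMatrix... blocks) {
        if (blocks.length == 0) {
            throw new IllegalArgumentException("need at least one block");
        }
        this.blocks = blocks.clone();
        this.rows = blocks[0].numRows();
        this.colOffsets = new int[blocks.length + 1];
        for (int i = 0; i < blocks.length; i++) {
            if (blocks[i].numRows() != rows) {
                throw new IllegalArgumentException("all blocks must have the same number of rows");
            }
            colOffsets[i + 1] = colOffsets[i] + blocks[i].numCols();
        }
    }

    public int numRows() { return rows; }
    public int numCols() { return colOffsets[blocks.length]; }

    public double get(int row, int col) {
        if (row < 0 || row >= rows || col < 0 || col >= numCols()) {
            throw new IndexOutOfBoundsException("(" + row + ", " + col + ")");
        }
        // Linear scan for the block that owns this column; a binary search
        // over colOffsets would be better with many blocks.
        int b = 0;
        while (col >= colOffsets[b + 1]) {
            b++;
        }
        return blocks[b].get(row, col - colOffsets[b]);
    }
}
```

A client would then wrap its feature groups (user indicators, item indicators, and so on) in one such view and hand that single matrix to the algorithm.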

On Fri, Apr 12, 2013 at 2:28 AM, Gokhan Capan <gkhn...@gmail.com> wrote:

> So if I understood correctly, the algorithm still runs on a matrix, and a
> client can still pass a group of matrices.
>
> Again it came to data preparation :)
>
> I will refactor the implementation to run on a single matrix, but provide
> tools for turning the obvious client data into actual input to the
> algorithm.
>
> Sent from my iPhone
>
> On Apr 12, 2013, at 1:13, Ted Dunning <ted.dunn...@gmail.com> wrote:
>
> One easy thing to do is to build an adjoined matrix type that does the
> concatenation on the fly.
>
> On Thu, Apr 11, 2013 at 1:43 PM, Gokhan Capan <gkhn...@gmail.com> wrote:
>
>> Yeah, it is simpler indeed.
>>
>> I am going to think about alternative ways to make concatenation easier
>> for clients.
>>
>> Thanks for your review
>>
>> On Thu, Apr 11, 2013 at 10:45 PM, Robin Anil <robin.a...@gmail.com> wrote:
>>
>>> I would have folded them all as different feature ids in a single
>>> vector; that makes things a lot simpler and faster.
>>>
>>> Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
>>>
>>> On Thu, Apr 11, 2013 at 11:19 AM, Gokhan Capan <gkhn...@gmail.com> wrote:
>>>
>>>> Hi Robin,
>>>>
>>>> If you are asking why they are arrays, it is to save clients
>>>> from concatenating multiple matrices to create the input.
>>>>
>>>> I am quoting from the libFM
>>>> paper <http://www.csie.ntu.edu.tw/~b97053/paper/Factorization%20Machines%20with%20libFM.pdf>:
>>>> "For easier interpretation,
>>>> the features are grouped into indicators for the active user (blue),
>>>> active item (red), other movies rated
>>>> by the same user (orange), the time in months (green), and the last
>>>> movie rated (brown)."
>>>>
>>>> I thought a client would create multiple groups of matrices, and could
>>>> just pass them all to the algorithm.
>>>>
>>>> Then the wModel is the w parameters, which is still an array of vectors
>>>> for me to keep the indexing consistent, and vModel is the V parameters.
>>>>
>>>> Was that what you were asking?
>>>>
>>>> On Thu, Apr 11, 2013 at 6:44 PM, Robin Anil <robin.a...@gmail.com> wrote:
>>>>
>>>>> Comments away. I was a bit confused by the use of Vector[] for w1 and
>>>>> Matrix[] for inputs.
>>>>>
>>>>> Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
>>>>>
>>>>> On Thu, Apr 11, 2013 at 10:00 AM, Gokhan Capan <gkhn...@gmail.com> wrote:
>>>>>
>>>>>> Ted,
>>>>>> Robin,
>>>>>>
>>>>>> Although I did not test on a dataset yet, recently I've been
>>>>>> implementing Factorization Machines with SGD optimization.
>>>>>>
>>>>>> The initial implementation is at
>>>>>> https://github.com/gcapan/mahout/tree/fm
>>>>>>
>>>>>> Would you guys consider taking a look so I can make it better and
>>>>>> get it running?
>>>>>>
>>>>>> On Mon, Apr 1, 2013 at 8:45 PM, Nkechi Nnadi
>>>>>> <nkechi.nn...@gmail.com> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> I'm a long-time lurker. I would be interested in implementing these.
>>>>>>> I thought I would get my feet wet by contributing tutorials to the
>>>>>>> wiki, since I have used Mahout for recommendation and clustering in
>>>>>>> my dissertation. I have never contributed code before and I would
>>>>>>> love to start now.
>>>>>>>
>>>>>>> -Nkechi
>>>>>>>
>>>>>>> On Sun, Mar 31, 2013 at 1:14 PM, Robin Anil <robin.a...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> > FMs work really well for a whole range of things. Having
>>>>>>> > implemented them myself, I can extend my services as a reviewer
>>>>>>> > if anyone is willing to start on it.
>>>>>>> >
>>>>>>> > Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
>>>>>>> >
>>>>>>> > On Sun, Mar 31, 2013 at 2:18 AM, Ted Dunning <ted.dunn...@gmail.com>
>>>>>>> > wrote:
>>>>>>> >
>>>>>>> > > Relative to Dan's recent mention of SOM as possible new project,
>>>>>>> > > here are slides from KDD Cup 2012 in which Steffen Rendle
>>>>>>> > > describes how he did using a very straightforward implementation
>>>>>>> > > of Factorization Machines [1,2].
>>>>>>> > >
>>>>>>> > > FMs are interesting in the context of Mahout because they can be
>>>>>>> > > used in a wide variety of settings including recommendation and
>>>>>>> > > targeting and because they have very good performance on a
>>>>>>> > > number of tasks.
>>>>>>> > >
>>>>>>> > > I should mention that Robin was the one who first mentioned FMs
>>>>>>> > > to me.
>>>>>>> > >
>>>>>>> > > The KDD 2012 competition [3] is of interest in any case because
>>>>>>> > > it provides a large amount of realistic data for commercially
>>>>>>> > > important problems.
>>>>>>> > >
>>>>>>> > > [1] https://kaggle2.blob.core.windows.net/competitions/kddcup2012/2748/media/RendleSlides.pdf
>>>>>>> > >
>>>>>>> > > [2] https://kaggle2.blob.core.windows.net/competitions/kddcup2012/2748/media/Rendle.pdf
>>>>>>> > >
>>>>>>> > > [3] http://www.kddcup2012.org/
>>>>>>
>>>>>> --
>>>>>> Gokhan
>>>>
>>>> --
>>>> Gokhan
>>
>> --
>> Gokhan

--
Gokhan