I haven't started coding the integration yet, but was planning to begin this week. If a GSoC student is interested in taking over, I'll be happy to help.
We already have NB, LDA and SVD, so instead of coming up with yet another probabilistic model, a good addition would be to take the existing fully distributed LDA and SVD implementations in Mahout and apply them to recommendations, IMHO. A solid, fully distributed implementation of Restricted Boltzmann Machines (RBM) would make for a superb GSoC project and would be quite challenging.

-...@nkur

3/19/10 5:50 PM, "Sean Owen" <sro...@gmail.com> wrote:

+mahout-user

From a recommender perspective I can think of three worthwhile projects:

1. Combine the two co-occurrence-based distributed recommenders in the code now. They take slightly different approaches. Ankur's working on this but might give it over to a GSoC student. This is probably 1/2 the size of a proper GSoC project.

2. Add a fully distributed slope-one recommender. Part of the computation is already distributed. Efficiently distributing the rest is interesting. Also not so hard: I'd judge this is 1/2 a GSoC project.

3. Implement a probabilistic model-based recommender of any kind, distributed or non-distributed. This is probably a whole GSoC project.

On Fri, Mar 19, 2010 at 11:45 AM, RSJ <i...@richardsimonjust.co.uk> wrote:
> Hey there,
>
> My name is Richard Just, I'm a final year BSc Applied Computer Science
> student at Reading University, UK, with a strong focus on programming.
> I'm just finishing up a term that included modules in Distributed
> Computing and Evolutionary Computation, which have been the greatest
> modules of my uni career by far. Between that, my love for open source
> and having read about the ASF, I'm really interested in taking part in
> GSoC with an ASF project, namely Mahout. I'm really taken by the ethos
> behind the ASF as a whole and I'm hoping that taking part in GSoC will
> be the start of my long term involvement with ASF projects.
>
> My main programming background is Java, and I did a 9 month placement
> programming in it for a non-profit organisation last year. From that
> placement I gained a love and appreciation for well commented, well
> documented code, while from my time at university I now have a passion
> for well designed code and the time it saves.
>
> With GSoC, I've read through the suggested Mahout projects so far, and I
> think implementing an algorithm is probably my best bet. I say that
> because I don't have much Mahout experience yet, but through multiple
> University modules I do have experience designing and implementing
> algorithms. With that in mind and given that there is already a
> Classifier proposal, I was thinking either a Cluster or Recommendation
> algorithm.
>
> I'd be very interested in hearing if there are any particular Clustering
> algorithms or particular elements of the top Netflix team solutions
> people would like to see implemented?
>
> Many thanks for reading this
> RSJ