Thanks Sean and Sebastian. Yes, it's still far away, just finished documentation stuff.
I will go through this material (thanks for the links, Sebastian) and try to get familiar with Mahout. After that I can go into your suggestions one by one.

On Thu, Jan 20, 2011 at 1:46 PM, Sebastian Schelter <s...@apache.org> wrote:

> I'd be very interested in benchmark data for and/or performance increases
> of RecommenderJob (as well as ItemSimilarityJob and RowSimilarityJob which
> are used internally), if you feel like working on that.
>
> A good starting point to get familiar with the functionality might be
> Sean's talk from Berlin Buzzwords (
> http://berlinbuzzwords.blip.tv/file/3811036/ ) and my slides from Berlin's
> last Hadoop Get Together ( http://www.slideshare.net/sscdotopen/mahoutcf )
>
> --sebastian
>
> On 20.01.2011 09:08, Sean Owen wrote:
>
>> I think it's far from complete or done.
>>
>> I think it would be interesting to take any of the MapReduce-based jobs,
>> set it up, run it, and benchmark/profile it to locate some bottlenecks,
>> then propose optimizations. It is a good way to get familiar with the
>> packages.
>>
>> You might also investigate suggested settings for Hadoop when running
>> these jobs.
>>
>> These are just one type of way you could contribute. Looking into open
>> issues in JIRA, or adding unit tests, would be fine too.
>>
>> On Thu, Jan 20, 2011 at 3:36 AM, Kasun Lakpriya
>> <kasun.lakpriy...@gmail.com> wrote:
>>
>>> Hi Sean,
>>> Thanks for the immediate reply and sorry for my late response.
>>>
>>> Our above mentioned project is in progress.
>>>
>>> BTW I realized that Mahout is quite interesting and very active project.
>>> I am just interested about contributing to Mahout. As understanding the
>>> complete code base is not an easy task I would like to start from some
>>> basic point. After getting familiar with the code base I can think of
>>> your suggestion about "improving its speed or reducing its memory/disk
>>> usage".
>>>
>>> So that what would be a good starting point?
>>>
>>> Thank you,
>>> Kasun
>>>
>>> On Thu, Dec 30, 2010 at 5:56 PM, Sean Owen <sro...@gmail.com> wrote:
>>>
>>>> Hi Kasun,
>>>>
>>>> If you want to get involved, you are free to discuss and propose your
>>>> own changes and algorithms. You can review the list of open issues here:
>>>> https://issues.apache.org/jira/browse/MAHOUT This contains some ideas
>>>> about work that needs to be done.
>>>>
>>>> One interesting project would be to benchmark the existing distributed
>>>> item-based recommender and find ways to improve its speed or reduce its
>>>> memory/disk usage. That's a fairly simple starter project and quite
>>>> useful.
>>>>
>>>> Sean
>>>>
>>>> On Wed, Dec 29, 2010 at 10:51 AM, Kasun Lakpriya
>>>> <kasun.lakpriy...@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>> I am Kasun Lakpriya from University of Moratuwa, Sri Lanka. I am
>>>>> following a BSc in Computer Science and Engineering degree and now I
>>>>> am in my final year.
>>>>>
>>>>> In our degree program in order to complete the degree we need to do
>>>>> some kind of a research project approved by the university. The
>>>>> project I am working on is about "Web Personalization". The task is
>>>>> to develop a personalization module which is pluggable to any
>>>>> (theoretically) web application. After some literature survey we
>>>>> found out that there are some existing open source tools we can use
>>>>> to implement this module (personalization module). Specially what we
>>>>> are focusing on is Collaborative Filtering. I have already checked
>>>>> out the mahout trunk and built successfully and tried this example I
>>>>> found on the web [1]. And I went through the wiki page related to
>>>>> Algorithms and found some nice presentation about "Distributed item
>>>>> based collaborative filtering" by Sebastian Schelter. And I went
>>>>> through some similarity measure implementations in Mahout.
>>>>>
>>>>> What I want from you all is some guidance and helping hand to start
>>>>> implementation on improving an algorithm already there in the Mahout
>>>>> or what are the other areas we can integrated to Mahout regarding to
>>>>> Collaborative Filtering. In the recent mail archives I couldn't find
>>>>> such a discussion regarding this thing. Any further reading or
>>>>> references would be really appreciated.
>>>>>
>>>>> Thanks and Regards,
>>>>> Kasun
>>>>>
>>>>> [1] -
>>>>> http://philippeadjiman.com/blog/2009/11/11/flexible-collaborative-filtering-in-java-with-mahout-taste/