I think it's far from complete or done.

I think it would be interesting to take any of the MapReduce-based jobs, set
it up, run it, and benchmark/profile it to locate some bottlenecks, then
propose optimizations. It is a good way to get familiar with the packages.

You might also investigate suggested settings for Hadoop when running these
jobs.

This is just one way you could contribute. Looking into open
issues in JIRA, or adding unit tests, would be fine too.

On Thu, Jan 20, 2011 at 3:36 AM, Kasun Lakpriya
<kasun.lakpriy...@gmail.com> wrote:

> Hi Sean,
> Thanks for the immediate reply and sorry for my late response.
>
> Our above mentioned project is in progress.
>
> BTW I realized that Mahout is a quite interesting and very active project.
> I am interested in contributing to Mahout. As understanding the complete
> code base is not an easy task, I would like to start from some basic
> point. After getting familiar with the code base I can think about your
> suggestion of "improving its speed or reducing its memory/disk usage".
>
> So what would be a good starting point?
>
> Thank you,
> Kasun
>
> On Thu, Dec 30, 2010 at 5:56 PM, Sean Owen <sro...@gmail.com> wrote:
>
> > Hi Kasun,
> >
> > If you want to get involved, you are free to discuss and propose your own
> > changes and algorithms. You can review the list of open issues here:
> > https://issues.apache.org/jira/browse/MAHOUT This contains some ideas
> > about work that needs to be done.
> >
> > One interesting project would be to benchmark the existing distributed
> > item-based recommender and find ways to improve its speed or reduce its
> > memory/disk usage. That's a fairly simple starter project and quite
> > useful.
> >
> > Sean
> >
> > On Wed, Dec 29, 2010 at 10:51 AM, Kasun Lakpriya <kasun.lakpriy...@gmail.com> wrote:
> >
> > > Hi all,
> > > I am Kasun Lakpriya from the University of Moratuwa, Sri Lanka. I am
> > > pursuing a BSc degree in Computer Science and Engineering, and I am now
> > > in my final year.
> > >
> > > In our degree program, in order to complete the degree we need to do
> > > some kind of research project approved by the university. The project I
> > > am working on is about "Web Personalization". The task is to develop a
> > > personalization module which is pluggable into any (theoretically) web
> > > application. After a literature survey we found that there are some
> > > existing open source tools we can use to implement this module. What we
> > > are focusing on in particular is Collaborative Filtering. I have already
> > > checked out the Mahout trunk, built it successfully, and tried an
> > > example I found on the web [1]. I also went through the wiki page on
> > > Algorithms and found a nice presentation about "Distributed item based
> > > collaborative filtering" by Sebastian Schelter, and I went through some
> > > of the similarity measure implementations in Mahout.
> > >
> > > What I want from you all is some guidance and a helping hand to start
> > > improving an algorithm already in Mahout, or to learn what other areas
> > > related to Collaborative Filtering we could integrate into Mahout. In
> > > the recent mail archives I couldn't find a discussion of this. Any
> > > further reading or references would be really appreciated.
> > >
> > >
> > > Thanks and Regards,
> > > Kasun
> > >
> > > [1] -
> > > http://philippeadjiman.com/blog/2009/11/11/flexible-collaborative-filtering-in-java-with-mahout-taste/
> > >
> >
>
