Re: Regarding Collaborative Filtering.

Kasun Lakpriya Thu, 20 Jan 2011 09:58:55 -0800

Thanks Sean and Sebastian.

Yes, it's still far away, just finished documentation stuff.


I will go through this material (thanks for the links, Sebastian) and try to get
familiar with Mahout. After that I can go into your suggestions one by
one.

On Thu, Jan 20, 2011 at 1:46 PM, Sebastian Schelter <s...@apache.org> wrote:

> I'd be very interested in benchmark data for and/or performance increases
> of RecommenderJob (as well as ItemSimilarityJob and RowSimilarityJob which
> are used internally), if you feel like working on that.
>
> A good starting point to get familiar with the functionality might be
> Sean's talk from Berlin Buzzwords (
> http://berlinbuzzwords.blip.tv/file/3811036/ ) and my slides from Berlin's
> last Hadoop Get Together ( http://www.slideshare.net/sscdotopen/mahoutcf )
>
> --sebastian
>
>
> On 20.01.2011 09:08, Sean Owen wrote:
>
>> I think it's far from complete or done.
>>
>> I think it would be interesting to take any of the MapReduce-based jobs,
>> set it up, run it, and benchmark/profile it to locate some bottlenecks,
>> then propose optimizations. It is a good way to get familiar with the
>> packages.
>>
>> You might also investigate suggested settings for Hadoop when running
>> these jobs.
>>
>> This is just one way you could contribute. Looking into open
>> issues in JIRA, or adding unit tests, would be fine too.
>>
>> On Thu, Jan 20, 2011 at 3:36 AM, Kasun Lakpriya
>> <kasun.lakpriy...@gmail.com> wrote:
>>
>>> Hi Sean,
>>> Thanks for the immediate reply and sorry for my late response.
>>>
>>> Our above-mentioned project is in progress.
>>>
>>> BTW, I realized that Mahout is a quite interesting and very active
>>> project, and I am interested in contributing to it. As understanding the
>>> complete code base is not an easy task, I would like to start from some
>>> basic point. After getting familiar with the code base, I can think about
>>> your suggestion of "improving its speed or reducing its memory/disk usage".
>>>
>>> So what would be a good starting point?
>>>
>>> Thank you,
>>> Kasun
>>>
>>> On Thu, Dec 30, 2010 at 5:56 PM, Sean Owen <sro...@gmail.com> wrote:
>>>
>>>> Hi Kasun,
>>>>
>>>> If you want to get involved, you are free to discuss and propose your
>>>> own changes and algorithms. You can review the list of open issues here:
>>>> https://issues.apache.org/jira/browse/MAHOUT
>>>> This contains some ideas about work that needs to be done.
>>>>
>>>> One interesting project would be to benchmark the existing distributed
>>>> item-based recommender and find ways to improve its speed or reduce its
>>>> memory/disk usage. That's a fairly simple starter project and quite
>>>> useful.
>>>>
>>>> Sean
>>>>
>>>> On Wed, Dec 29, 2010 at 10:51 AM, Kasun Lakpriya
>>>> <kasun.lakpriy...@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>> I am Kasun Lakpriya from the University of Moratuwa, Sri Lanka. I am
>>>>> pursuing a BSc in Computer Science and Engineering and am now in my
>>>>> final year.
>>>>>
>>>>> In our degree program, we need to complete a research project approved
>>>>> by the university in order to graduate. The project I am working on is
>>>>> about "Web Personalization". The task is to develop a personalization
>>>>> module that is (theoretically) pluggable into any web application. After
>>>>> a literature survey, we found some existing open source tools we can use
>>>>> to implement this module. Specifically, we are focusing on Collaborative
>>>>> Filtering. I have already checked out the Mahout trunk, built it
>>>>> successfully, and tried an example I found on the web [1]. I also went
>>>>> through the wiki page on Algorithms, found a nice presentation about
>>>>> "Distributed item based collaborative filtering" by Sebastian Schelter,
>>>>> and read some of the similarity measure implementations in Mahout.
>>>>>
>>>>> What I would like from you all is some guidance and a helping hand to
>>>>> start improving an algorithm already in Mahout, or pointers to other
>>>>> areas related to Collaborative Filtering that we could integrate into
>>>>> Mahout. I couldn't find such a discussion in the recent mail archives.
>>>>> Any further reading or references would be really appreciated.
>>>>>
>>>>>
>>>>> Thanks and Regards,
>>>>> Kasun
>>>>>
>>>>> [1] - http://philippeadjiman.com/blog/2009/11/11/flexible-collaborative-filtering-in-java-with-mahout-taste/
>>>>>
>>>>>
>
