Re: Mahout GSoC 2010: Association Mining

2010-04-09 Thread Ted Dunning
Neal, I think that this might well be a useful contribution to Mahout, but, if I am not mistaken, I think that the deadline for student proposals for GSoC has just passed. That likely means that making this contribution an official GSoC project is not possible. I am sure that the Mahout community

Mahout GSoC 2010: Association Mining

2010-04-09 Thread Neal Clark
Hello, I just wanted to introduce myself. I am an MSc Computer Science student at the University of Victoria. My research over the past year has been focused on developing and implementing an Apriori-based frequent item-set mining algorithm for mining large data sets at low support counts. https:

Re: Mahout GSoC 2010 proposal: Association Mining

2010-04-09 Thread Lukáš Vlček
Ted, do you think you can give some good links to papers or other resources about the mentioned approaches? I would like to look at them after the weekend. As far as I can see, association mining (and the GUHA method in its original form) is not meant to be a predictive method but rather data explora

[jira] Updated: (MAHOUT-371) [GSoC] Proposal to implement Distributed SVD++ Recommender using Hadoop

2010-04-09 Thread Richard Simon Just (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Richard Simon Just updated MAHOUT-371: -- Description: Proposal Title: [MAHOUT-371] Proposal to implement Distributed SVD++ Reco

Re: Mahout GSoC 2010 proposal: Association Mining

2010-04-09 Thread Ted Dunning
Lukas, The strongest alternative for this kind of application (and the normal choice for large scale applications) is on-line gradient descent learning with an L_1 or L_1 + L_2 regularization. The typical goal is to predict some outcome (click or purchase or signup) from a variety of large vocabu

[jira] Created: (MAHOUT-374) GSOC 2010 Proposal Implement Map/Reduce Enabled Neural Networks (mahout-342)

2010-04-09 Thread Yinghua Hu (JIRA)
GSOC 2010 Proposal Implement Map/Reduce Enabled Neural Networks (mahout-342) - Key: MAHOUT-374 URL: https://issues.apache.org/jira/browse/MAHOUT-374 Project: Mahout

GSOC Create Sql adapters proposal

2010-04-09 Thread Necati Batur
Hi, Create adapters for MySQL and NoSQL (HBase, Cassandra) to access data for all the algorithms to use; Necati Batur ; necatiba...@gmail.com Mahout / Mahout - 332 : Assigned Mentor is Robin Anil Proposal Abstract: It would be useful to use Thrift as the protocol with the NoSQL systems, as

[jira] Created: (MAHOUT-373) VectorDumper/VectorHelper doesn't dump values when dictionary is present

2010-04-09 Thread Drew Farris (JIRA)
VectorDumper/VectorHelper doesn't dump values when dictionary is present Key: MAHOUT-373 URL: https://issues.apache.org/jira/browse/MAHOUT-373 Project: Mahout Issue Typ

[jira] Commented: (MAHOUT-372) Partitioning Collaborative Filtering Job into Maps and Reduces

2010-04-09 Thread Kris Jack (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855381#action_12855381 ] Kris Jack commented on MAHOUT-372: -- Thanks for your reply. I'll run it using the command

[jira] Resolved: (MAHOUT-372) Partitioning Collaborative Filtering Job into Maps and Reduces

2010-04-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved MAHOUT-372. -- Resolution: Fixed Fix Version/s: 0.4 Assignee: Sean Owen Yes, sure there's no particula

Re: Mahout GSoC 2010 proposal: Association Mining

2010-04-09 Thread Robin Anil
Hi Lukáš, It would have been great if you could have participated in GSoC; there is time left. But you still have your proposal in the GSoC system. Take your time to decide, but if you choose not to participate, do remove the application from the SoC website. Wiki page for association m

Re: Mahout GSoC 2010 proposal: Association Mining

2010-04-09 Thread Lukáš Vlček
Robin, I think it does not make sense for me to catch up with the GSoC timeline now as I am quite busy with other stuff. However, I will develop the proposal for Association Mining (or GUHA if you like) and keep this discussion going. I am really interested in contributing some implementation to Mahou

[jira] Created: (MAHOUT-372) Partitioning Collaborative Filtering Job into Maps and Reduces

2010-04-09 Thread Kris Jack (JIRA)
Partitioning Collaborative Filtering Job into Maps and Reduces -- Key: MAHOUT-372 URL: https://issues.apache.org/jira/browse/MAHOUT-372 Project: Mahout Issue Type: Question

Re: [GSOC] 2010 Timelines

2010-04-09 Thread Isabel Drost
Timeline including Apache internal deadlines: http://cwiki.apache.org/confluence/display/COMDEVxSITE/GSoC Mentors, please also follow the link to the ranking explanation [1] for more information on how to rank student proposals. Isabel [1] http://cwiki.apache.org/confluence/display