Re: [GSOC] Congrats to all students

2010-04-27 Thread Zaid Md Abdul Wahab Sheikh
Thanks. It's great to finally have the chance to be a part of Apache Mahout. Congratulations to everyone who got selected! +1 for the shared blog idea! On Tue, Apr 27, 2010 at 12:52 PM, Robin Anil wrote: > Congrats everyone. And a special thanks to Benson for helping us get the > slots to 5 t

Re: [GSOC] Congrats to all students

2010-04-27 Thread Sisir Koppaka
+1 for shared blog!

Re: [GSOC] Congrats to all students

2010-04-27 Thread zhao zhendong
Thanks everyone! I am so excited to be accepted and I will do my best to finish my proposal in time. A shared blog sounds great to me. GSoC looks like training; we are supposed to share the experience with everyone who is interested in the Mahout project. Cheers, Zhendong On Tue, Apr 27, 2010 at 3:22 PM

Re: [GSOC] Congrats to all students

2010-04-27 Thread Richard Simon Just
Thanks guys! So happy to get it, and really excited that Mahout got 5 slots. @Robin: I'm totally up for a shared blog, was planning on blogging about it anyway. Robin Anil wrote: > Congrats everyone. And a special thanks to Benson for helping us get the > slots to 5 this year :) > > For students

[jira] Commented: (MAHOUT-371) [GSoC] Proposal to implement Distributed SVD++ Recommender using Hadoop

2010-04-26 Thread Richard Simon Just (JIRA)
oaded the latest MEAP version of MiA yet, so that would be great. Not sure if it has changed much but will re-read the version I have and start looking at a more detailed design, before consulting mahout-dev. > [GSoC] Proposal to implement Distributed SVD++ Recommender

[jira] Commented: (MAHOUT-371) [GSoC] Proposal to implement Distributed SVD++ Recommender using Hadoop

2010-04-26 Thread Sean Owen (JIRA)
tal model of how you'll set up the computation on Hadoop. This is the tricky part and worth talking about on mahout-dev. > [GSoC] Proposal to implement Distributed SVD++ Recommender using Hadoop > --- > >

[jira] Commented: (MAHOUT-371) [GSoC] Proposal to implement Distributed SVD++ Recommender using Hadoop

2010-04-26 Thread Richard Simon Just (JIRA)
I'm super excited! Thank you! Oh you're practically down the road. I'd love to meet up at some point after my exams. In the meantime, where do we go from here? Cheers RSJ > [GSoC] Proposal to implement Distributed SVD

[jira] Commented: (MAHOUT-371) [GSoC] Proposal to implement Distributed SVD++ Recommender using Hadoop

2010-04-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861087#action_12861087 ] Sean Owen commented on MAHOUT-371: -- Looks like this was accepted to GSoC, nice. Let

Re: [GSOC] Congrats to all students

2010-04-26 Thread Sisir Koppaka
2010 at 1:13 AM, Grant Ingersoll wrote: > Looks like student GSOC announcements are up ( > http://socghop.appspot.com/gsoc/program/list_projects/google/gsoc2010). > Mahout got quite a few projects (5) accepted this year, which is a true > credit to the ASF, Mahout, the mentors, and mo

[GSOC] Congrats to all students

2010-04-26 Thread Grant Ingersoll
Looks like student GSOC announcements are up (http://socghop.appspot.com/gsoc/program/list_projects/google/gsoc2010). Mahout got quite a few projects (5) accepted this year, which is a true credit to the ASF, Mahout, the mentors, and most of all the students! We had a good number of very

[jira] Commented: (MAHOUT-364) [GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop

2010-04-19 Thread Jake Mannix (JIRA)
, as this is a Google Summer of Code JIRA ticket. > [GSOC] Proposal to implement Neural Network with backpropagation learning on > Hadoop > --- > > Key: MAHOUT-364 >

[jira] Commented: (MAHOUT-364) [GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop

2010-04-13 Thread Jake Mannix (JIRA)
mething like the Apache License: http://www.apache.org/licenses/LICENSE-2.0 ? > [GSOC] Proposal to implement Neural Network with backpropagation learning on > Hadoop > --- > > Key:

[jira] Commented: (MAHOUT-364) [GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop

2010-04-13 Thread Benson Margulies (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856704#action_12856704 ] Benson Margulies commented on MAHOUT-364: - GPL3 is NOT ASL compatible. >

[jira] Commented: (MAHOUT-364) [GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop

2010-04-13 Thread Ted Dunning (JIRA)
See here: http://www.opensource.org/licenses/gpl-3.0.html and here: http://www.opensource.org/licenses/apache2.0.php > [GSOC] Proposal to implement Neural Network with backpropagation learning on >

[jira] Commented: (MAHOUT-364) [GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop

2010-04-13 Thread Zoran Sevarac (JIRA)
is proposal can be the first common project. By the way, is GPL3 compatible with Apache 2? > [GSOC] Proposal to implement Neural Network with backpropagation learning on > Hadoop > --- > > Key

Re: Mahout GSoC 2010: Association Mining

2010-04-13 Thread Neal Clark
nefit of Hadoop's location awareness. Thanks, Neal. On Sat, Apr 10, 2010 at 1:28 AM, Robin Anil wrote: > Like Ted said, it's a bit late for a GSOC proposal, but I am excited at the > possibility of improving the frequent pattern mining package. Check out the > current Parallel FPGr

[jira] Commented: (MAHOUT-363) Proposal for GSoC 2010 (EigenCuts clustering algorithm for Mahout)

2010-04-12 Thread Shannon Quinn (JIRA)
ope this helps clarify some points and strengthen the overall feasibility of the proposal. > Proposal for GSoC 2010 (EigenCuts clustering algorithm for Mahout) > -- > > Key: MAHOUT-363 >

[jira] Commented: (MAHOUT-364) [GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop

2010-04-11 Thread David Strupl (JIRA)
some experimenting with parallel implementation of backpropagation and other algorithms. Check for example http://portal.acm.org/author_page.cfm?id=81100013265&coll=GUIDE&dl=GUIDE&trk=0&CFID=85691215&CFTOKEN=64441042 Sounds really interesting - all the best, David Strupl > [

[jira] Commented: (MAHOUT-364) [GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop

2010-04-11 Thread Zoran Sevarac (JIRA)
sing options and we don't have to discuss it here. You can count on us to find a solution. > [GSOC] Proposal to implement Neural Network with backpropagation learning on > Hadoop > --- >

[jira] Commented: (MAHOUT-364) [GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop

2010-04-11 Thread Jake Mannix (JIRA)
everse is obviously fine), but I'd love to see a tighter interaction here, given how little ANN code we have (and how much we'd *like* to have). > [GSOC] Proposal to implement Neural Network with backp

[jira] Commented: (MAHOUT-364) [GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop

2010-04-11 Thread Zoran Sevarac (JIRA)
k related stuff. So I can say that Neuroph and I personally will support and help with the development of this project if it gets accepted. I already published a short article about this: http://netbeans.dzone.com/neuroph-hadoop-nb > [GSOC] Proposal to implement Neural Network with backp

[jira] Issue Comment Edited: (MAHOUT-375) [GSOC] Restricted Boltzmann Machines in Apache Mahout

2010-04-11 Thread Sisir Koppaka (JIRA)
y datasets that you could suggest to me for quicker testing of the RBM - at least for now? If the test dataset has some results on RBM that I can compare with, that'd really help me with the testing. > [GSOC] Restricted Boltzman

[jira] Commented: (MAHOUT-375) [GSOC] Restricted Boltzmann Machines in Apache Mahout

2010-04-11 Thread Sisir Koppaka (JIRA)
f the RBM - at least for now? If the test dataset has some results on RBM that I can compare with, that'd really help me with the testing. > [GSOC] Restricted Boltzmann Machines in Apache Mahout > - > >

[jira] Created: (MAHOUT-375) [GSOC] Restricted Boltzmann Machines in Apache Mahout

2010-04-11 Thread Sisir Koppaka (JIRA)
[GSOC] Restricted Boltzmann Machines in Apache Mahout - Key: MAHOUT-375 URL: https://issues.apache.org/jira/browse/MAHOUT-375 Project: Mahout Issue Type: New Feature Reporter

Re: Mahout GSoC 2010: Association Mining

2010-04-10 Thread Robin Anil
Like Ted said, it's a bit late for a GSOC proposal, but I am excited at the possibility of improving the frequent pattern mining package. Check out the current Parallel FPGrowth implementation in the code; you can find more explanation of its usage on the Mahout wiki. Apriori should be trivially

Re: Mahout GSoC 2010: Association Mining

2010-04-09 Thread Ted Dunning
Neal, I think that this might well be a useful contribution to Mahout, but, if I am not mistaken, I think that the deadline for student proposals for GSoC has just passed. That likely means that making this contribution an official GSoC project is not possible. I am sure that the Mahout

Mahout GSoC 2010: Association Mining

2010-04-09 Thread Neal Clark
such an implementation as a GSoC project? If so any comments/feedback would be very much appreciated. If you are interested I can create a proposal and submit it to your issue tracker when it comes back online. Thanks, Neal.

Re: Mahout GSoC 2010 proposal: Association Mining

2010-04-09 Thread Lukáš Vlček
Ted, do you think you can give some good links to papers or other resources about the mentioned approaches? I would like to look at them after the weekend. As far as I can see, association mining (and the GUHA method in its original form) is not meant to be a predictive method but rather data explora

[jira] Updated: (MAHOUT-371) [GSoC] Proposal to implement Distributed SVD++ Recommender using Hadoop

2010-04-09 Thread Richard Simon Just (JIRA)
enjoyed my Distributed Computing and Evolutionary Computation modules so much, and after reading all the introductory pages about the ASF I realised Mahout would be a great place to start. After graduation (and GSoC) I hope to continue contributing to Mahout while working in a related field. O

Re: Mahout GSoC 2010 proposal: Association Mining

2010-04-09 Thread Ted Dunning
Lukas, The strongest alternative for this kind of application (and the normal choice for large scale applications) is on-line gradient descent learning with an L_1 or L_1 + L_2 regularization. The typical goal is to predict some outcome (click or purchase or signup) from a variety of large vocabu
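As an illustration of the approach Ted names above, here is a minimal sketch of on-line (one example at a time) gradient descent for logistic regression with a combined L_1 + L_2 penalty. Everything in it is an assumption made for illustration: the function name `sgd_elastic_net`, the parameters `lr`, `l1`, and `l2`, the sparse `{index: value}` feature encoding, and the choice to regularize only the weights touched by the current example. It is not taken from Mahout or from the thread.

```python
import math
import random


def sgd_elastic_net(examples, n_features, epochs=20, lr=0.1, l1=0.0, l2=0.0, seed=0):
    """Train logistic regression one example at a time (hypothetical helper).

    examples: list of (features, label), where features is a sparse
    {feature_index: value} dict and label is 0 or 1.
    """
    rng = random.Random(seed)
    weights = [0.0] * n_features
    order = list(examples)
    for _ in range(epochs):
        rng.shuffle(order)
        for features, label in order:
            # Predicted probability of the positive outcome.
            z = sum(weights[i] * v for i, v in features.items())
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            p = 1.0 / (1.0 + math.exp(-z))
            for i, v in features.items():
                # Gradient step on the log loss plus the L_2 penalty.
                weights[i] -= lr * ((p - label) * v + l2 * weights[i])
                # L_1 penalty as a soft-threshold: shrink toward zero and
                # clip at zero, so small weights become exactly 0.
                shrink = lr * l1
                if weights[i] > shrink:
                    weights[i] -= shrink
                elif weights[i] < -shrink:
                    weights[i] += shrink
                else:
                    weights[i] = 0.0
    return weights


if __name__ == "__main__":
    # Feature 0 co-occurs with the outcome (e.g. a click), feature 1 does not.
    data = [({0: 1.0}, 1), ({1: 1.0}, 0)] * 10
    print(sgd_elastic_net(data, n_features=2, l1=0.001, l2=0.001))
```

The L_1 term is what drives unhelpful weights to exactly zero, which matters when the vocabulary is large; the L_2 term keeps the remaining weights from growing without bound.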

[jira] Created: (MAHOUT-374) GSOC 2010 Proposal Implement Map/Reduce Enabled Neural Networks (mahout-342)

2010-04-09 Thread Yinghua Hu (JIRA)
GSOC 2010 Proposal Implement Map/Reduce Enabled Neural Networks (mahout-342) - Key: MAHOUT-374 URL: https://issues.apache.org/jira/browse/MAHOUT-374 Project: Mahout

GSOC Create Sql adapters proposal

2010-04-09 Thread Necati Batur
very excited to join an organization like GSOC and most importantly work for a big open source project, Apache. I am looking for good collaboration and new challenges in software development. Information management issues especially sound great to me. I am confident to work with all new

Re: Mahout GSoC 2010 proposal: Association Mining

2010-04-09 Thread Robin Anil
Hi Lukáš, It would have been great if you could have participated in GSOC; there is time left. You still have your proposal in the GSOC system. Take your time to decide, but if you choose not to participate, do remove the application from the SoC website. Wiki page for association

Re: Mahout GSoC 2010 proposal: Association Mining

2010-04-09 Thread Lukáš Vlček
Robin, I think it does not make sense for me to catch up with the GSoC timeline now as I am quite busy with other stuff. However, I will develop the proposal for Association Mining (or GUHA if you like) and keep this discussion going. I am really interested in contributing some implementation to

Re: [GSOC] 2010 Timelines

2010-04-09 Thread Isabel Drost
Timeline including Apache internal deadlines: http://cwiki.apache.org/confluence/display/COMDEVxSITE/GSoC Mentors, please also click on the ranking link to the ranking explanation [1] for more information on how to rank student proposals. Isabel [1] http://cwiki.apache.org/confluence

[jira] Commented: (MAHOUT-363) Proposal for GSoC 2010 (EigenCuts clustering algorithm for Mahout)

2010-04-08 Thread Ted Dunning (JIRA)
are just how you get side-tracked after you start. :-) > Proposal for GSoC 2010 (EigenCuts clustering algorithm for Mahout) > -- > > Key: MAHOUT-363 > URL: https://issues.apache.or

[jira] Commented: (MAHOUT-363) Proposal for GSoC 2010 (EigenCuts clustering algorithm for Mahout)

2010-04-08 Thread Shannon Quinn (JIRA)
nged some of the wording; the overall proposal structure wasn't changed. But I will certainly refrain from editing the ticket itself. Are there any other suggestions for making the proposal more viable? > Proposal for GSoC 2010 (EigenCuts clustering algorith

[jira] Commented: (MAHOUT-363) Proposal for GSoC 2010 (EigenCuts clustering algorithm for Mahout)

2010-04-08 Thread Jake Mannix (JIRA)
add comments to this JIRA ticket, instead of editing the original ticket itself, we'll be able to more easily follow your thinking. Otherwise, we can't really see what has changed. > Proposal for GSoC 2010 (EigenCuts clustering alg

[jira] Updated: (MAHOUT-363) Proposal for GSoC 2010 (EigenCuts clustering algorithm for Mahout)

2010-04-08 Thread Shannon Quinn (JIRA)
pson. Half-Lives of EigenFlows for Spectral Clustering. NIPS 2002. > Proposal for GSoC 2010 (EigenCuts clustering algorithm for Mahout) > -- > > Key: MAHOUT-363 > URL: https://iss

Re: A request for prospective GSOC students

2010-04-08 Thread Richard Simon Just
Apr 3, 2010 at 9:07 PM, Robin Anil wrote: > I am having a tough time separating Mahout proposals from rest of Apache on > gsoc website. So I would request you all to reply to this thread when you > have submitted a proposal so that we don't miss out on reading your hard > worked pro

[jira] Created: (MAHOUT-371) [GSoC] Proposal to implement Distributed SVD++ Recommender using Hadoop

2010-04-08 Thread Richard Simon Just (JIRA)
[GSoC] Proposal to implement Distributed SVD++ Recommender using Hadoop --- Key: MAHOUT-371 URL: https://issues.apache.org/jira/browse/MAHOUT-371 Project: Mahout Issue Type

[GSoC 2010] Proposal to implement SimHash clustering for Mahout

2010-04-07 Thread Cristian Prodan
Hello, I'm posting a draft for my proposal for this year's GSoC. I kindly ask for your feedback on it. I have also posted a JIRA ticket with it: https://issues.apache.org/jira/browse/MAHOUT-365 . Thank you in advan

Re: A request for prospective GSOC students

2010-04-07 Thread Cristian Prodan
nt Neural Network with > backpropagation learning > Jira issue: http://issues.apache.org/jira/browse/MAHOUT-364 > > On Sat, Apr 3, 2010 at 9:07 PM, Robin Anil wrote: > > I am having a tough time separating Mahout proposals from rest of Apache > on > > gsoc website. So I wo

[jira] Created: (MAHOUT-365) [GSoC] Proposal to implement SimHash clustering on MapReduce

2010-04-07 Thread Cristi Prodan (JIRA)
[GSoC] Proposal to implement SimHash clustering on MapReduce Key: MAHOUT-365 URL: https://issues.apache.org/jira/browse/MAHOUT-365 Project: Mahout Issue Type: New Feature

[jira] Commented: (MAHOUT-364) [GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop

2010-04-06 Thread Ted Dunning (JIRA)
.net/nipsworkshops09_langford_pol/ He makes some inflammatory comments right off the bat that you might need to address. All that said, having a good implementation of an ANN learner is a good thing. > [GSOC] Proposal to implement Neural Network with backpropagation learning on >

[jira] Commented: (MAHOUT-364) [GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop

2010-04-06 Thread Jake Mannix (JIRA)
ally well written proposal, with perfect breadth of scope as well. Do we have someone who can shepherd this? > [GSOC] Proposal to implement Neural Network with backpropagation learning on > Hadoop > --- >

[GSoC 2010] Requesting feedback on my proposal for implementing Neural Network with backpropagation learning

2010-04-06 Thread Zaid Md Abdul Wahab Sheikh
Hi all, I just submitted a GSoC proposal for implementing Neural Network with backpropagation on Hadoop. Jira issue: http://issues.apache.org/jira/browse/MAHOUT-364 I would appreciate your feedback and comments on the proposal and on the working or implementation plan

Re: [GSOC] 2010 Timelines

2010-04-06 Thread Robin Anil
2 days to go till the close of student submissions. A request to mentors to provide feedback to all the queries on the list so that students can go and work on tuning their proposal Robin On Sat, Apr 3, 2010 at 10:50 PM, Grant Ingersoll wrote: > > http://socghop.appspot.com/document/show/gsoc_pr

Re: A request for prospective GSOC students

2010-04-06 Thread Zaid Md Abdul Wahab Sheikh
I just submitted a proposal to implement Neural Network with backpropagation learning Jira issue: http://issues.apache.org/jira/browse/MAHOUT-364 On Sat, Apr 3, 2010 at 9:07 PM, Robin Anil wrote: > I am having a tough time separating Mahout proposals from rest of Apache on > gsoc website

[jira] Updated: (MAHOUT-364) [GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop

2010-04-06 Thread Zaid Md. Abdul Wahab Sheikh (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zaid Md. Abdul Wahab Sheikh updated MAHOUT-364: --- Comment: was deleted (was: formatting :() > [GSOC] Proposal

[jira] Updated: (MAHOUT-364) [GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop

2010-04-06 Thread Zaid Md. Abdul Wahab Sheikh (JIRA)
et the overall batch gradient. - The final error gradient vector is written back to the FileSystem h3. I propose to complete all of the following sub-tasks during GSoC 2010: Implementation of the Backpropagation algorithm: - Initialization of weights: using the Nguyen-Widrow algorithm to select t

GSOC [mentor idea]: Clustering visualization with GraphViz

2010-04-06 Thread Robin Anil
Here is a good project for the wish list. If anyone wishes to take it forward, I would be willing to help mentor. http://www.graphviz.org/ Check out one of the graphs, which I believe is a good way to represent clusters. Creating this graph is as easy as writing cluster output to the graphviz format http:/

[jira] Created: (MAHOUT-364) [GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop

2010-04-06 Thread Zaid Md. Abdul Wahab Sheikh (JIRA)
[GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop --- Key: MAHOUT-364 URL: https://issues.apache.org/jira/browse/MAHOUT-364 Project

[jira] Commented: (MAHOUT-345) [GSOC] integrate Mahout with Drupal/PHP

2010-04-06 Thread Y.W.D.D.Dissanayake (JIRA)
tudent. I would like to join your project. Please give more details about the project and how to start following it. > [GSOC] integrate Mahout with Drupal/PHP > --- > > Key: MAHOUT-345 > URL: https://issues.apache.or

Re: Mahout GSoC 2010 proposal: Association Mining

2010-04-05 Thread Robin Anil
> this method. Maybe starting from a transaction of shopping cart item ? A > great demo is big plus for a GSOC project. > > Robin > > > On Mon, Mar 29, 2010 at 1:46 AM, Lukáš Vlček wrote: > >> Hello, >> >> I would like to apply for Mahout GSoC 2010. My prop

Re: Mahout GSoC 2010 proposal: Association Mining

2010-04-05 Thread Robin Anil
lp us understand this method. Maybe starting from a transaction of shopping cart item ? A great demo is big plus for a GSOC project. Robin On Mon, Mar 29, 2010 at 1:46 AM, Lukáš Vlček wrote: > Hello, > > I would like to apply for Mahout GSoC 2010. My proposal is to implement > Association

[GSOC] Create adapters for MYSQL and NOSQL(hbase, cassandra) to access data for all the algorithms to use *

2010-04-05 Thread Robin Anil
wrote: > *IDEA: Create adapters for MYSQL and NOSQL(hbase, cassandra) to access data > for all the algorithms to use * > > *Summary* > > **First of all, I am very excited to join an organization like > GSOC and most importantly work for a big open source Project apach

Re: Reg. Netflix Prize Apache Mahout GSoC Application (SVD option)

2010-04-05 Thread Richard Simon Just
Awesome guys, Thanks for the quick responses! The details and clarifications are both helpful and incredibly reassuring. I've never done a proposal before, but no matter what happens I'm really looking forward to the end of my exams so I can gear into Mahout properly. Many thanks Richard Sean Ow

Re: Reg. Netflix Prize Apache Mahout GSoC Application (SVD option)

2010-04-05 Thread Necati Batur
*IDEA: Create adapters for MYSQL and NOSQL(hbase, cassandra) to access data for all the algorithms to use * *Summary* **First of all, I am very excited to join an organization like GSOC and most importantly work for a big open source project, Apache. I am looking for a good collaboration

Re: Reg. Netflix Prize Apache Mahout GSoC Application (SVD option)

2010-04-05 Thread Sean Owen
Your audience is the project committers. I wouldn't spend much time rehashing the SVD theory. You should name your approach and I suppose write enough to make it clear you understand the algorithm enough to implement it. In this case you can assume we all understand the SVD well enough already. I

Re: Reg. Netflix Prize Apache Mahout GSoC Application (SVD option)

2010-04-05 Thread Jake Mannix
wrote: > > It'd be a matter of making a brand-new distributed recommender. It > > need not have anything to do with SVDRecommender, which is a fine but > > separate non-parallel implementation. > > > > Tacking on distributed slope-one is fairly easy, I think. Both >

Re: Reg. Netflix Prize Apache Mahout GSoC Application (SVD option)

2010-04-05 Thread Richard Simon Just
r of making a brand-new distributed recommender. It > need not have anything to do with SVDRecommender, which is a fine but > separate non-parallel implementation. > > Tacking on distributed slope-one is fairly easy, I think. Both > together, with testing, documentation, etc. are cer

[jira] Commented: (MAHOUT-363) Proposal for GSoC 2010 (EigenCuts clustering algorithm for Mahout)

2010-04-05 Thread Shannon Quinn (JIRA)
ures, given its ease of implementation. That's just my explanation; if you feel otherwise I'm happy to adjust my proposal :) > Proposal for GSoC 2010 (EigenCuts clustering algorithm for Mahout) > -- > >

[jira] Commented: (MAHOUT-363) Proposal for GSoC 2010 (EigenCuts clustering algorithm for Mahout)

2010-04-05 Thread Robin Anil (JIRA)
hout code. I believe the k-means you are looking to implement is already there; it will shave 2 weeks off your GSOC :). Reading the code/wiki is a great exercise for you to be more realistic in your proposal. > Proposal for GSoC 2010 (EigenCuts clustering algorithm for

[jira] Updated: (MAHOUT-363) Proposal for GSoC 2010 (EigenCuts clustering algorithm for Mahout)

2010-04-05 Thread Shannon Quinn (JIRA)
ave limited experience with Apache Mahout and Hadoop, but with an undergraduate computer science degree from Georgia Tech, and after an internship with IBM ExtremeBlue, I feel I am extremely adept at picking up new frameworks quickly. References [1] Chakra Chennubhotla and Allan D. Jepson. Half-Liv

[jira] Commented: (MAHOUT-363) Proposal for GSoC 2010 (EigenCuts clustering algorithm for Mahout)

2010-04-04 Thread Jake Mannix (JIRA)
for a GSoC project. I wish I had the time to help with mentoring this project, in fact. > Proposal for GSoC 2010 (EigenCuts clustering algorithm for Mahout) > -- > > Key: MAHOUT-363 >

[jira] Commented: (MAHOUT-363) Proposal for GSoC 2010 (EigenCuts clustering algorithm for Mahout)

2010-04-04 Thread Shannon Quinn (JIRA)
Hama would certainly improve the feasibility of the project timeline and allow me to further refine the overall algorithm. I will absolutely adhere to your advice; I'll edit this ticket and my GSoC application. Thank you again! > Proposal for GSoC 2010 (EigenCuts clustering algorith

[jira] Commented: (MAHOUT-363) Proposal for GSoC 2010 (EigenCuts clustering algorithm for Mahout)

2010-04-04 Thread Ted Dunning (JIRA)
ived. One tiny suggestion is to omit Hama from your plan as it would just be a distraction for you. The Hama project is pretty much independent of Mahout and there hasn't been any contribution in the H->M direction. > Proposal for GSoC 2010 (EigenCuts clustering algor

[jira] Created: (MAHOUT-363) Proposal for GSoC 2010 (EigenCuts clustering algorithm for Mahout)

2010-04-04 Thread Shannon Quinn (JIRA)
Proposal for GSoC 2010 (EigenCuts clustering algorithm for Mahout) -- Key: MAHOUT-363 URL: https://issues.apache.org/jira/browse/MAHOUT-363 Project: Mahout Issue Type: Task

Re: A request for prospective GSOC students

2010-04-04 Thread Shannon Quinn
wrote: I am having a tough time separating Mahout proposals from rest of Apache on gsoc website. So I would request you all to reply to this thread when you have submitted a proposal so that we don't miss out on reading your hard worked proposal. For now I could only find Zhao Zhendong's

Re: A request for prospective GSOC students

2010-04-04 Thread Ted Dunning
Mellon and am finishing up my thesis, hence everything > is piecewise these days. :) > > I do welcome any feedback along the road leading up to April 9. Thank you > very much! > > Regards, > Shannon > > On 4/3/2010 11:37 AM, Robin Anil wrote: > >> I am having a to

Re: Reg. Netflix Prize Apache Mahout GSoC Application

2010-04-04 Thread Sisir Koppaka
I have put up the processed Netflix dataset here. This file does not contain dates, and is 1.5GB in size.

Re: A request for prospective GSOC students

2010-04-04 Thread Shannon Quinn
very much! Regards, Shannon On 4/3/2010 11:37 AM, Robin Anil wrote: I am having a tough time separating Mahout proposals from rest of Apache on gsoc website. So I would request you all to reply to this thread when you have submitted a proposal so that we don't miss out on reading your ha

Re: Reg. Netflix Prize Apache Mahout GSoC Application

2010-04-04 Thread Sisir Koppaka
On Sun, Apr 4, 2010 at 4:10 PM, Sean Owen wrote: > I think you want to write this to accept "generic" data, and not > necessarily assume the Netflix input format. I suggest you accept CSV > data, in the form "userID,itemID,value", since that is what all the > recommenders do. > > Sure, I'll write

Re: Reg. Netflix Prize Apache Mahout GSoC Application

2010-04-04 Thread Sean Owen
I think you want to write this to accept "generic" data, and not necessarily assume the Netflix input format. I suggest you accept CSV data, in the form "userID,itemID,value", since that is what all the recommenders do. You may need a quick utility program to convert Netflix data format to this. t
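A rough sketch of the "quick utility program" mentioned above, converting Netflix-style rating files into `userID,itemID,value` lines. It assumes the Netflix Prize training layout in which each per-movie file begins with a header line such as `1:` (the movie ID followed by a colon) and continues with `customerID,rating,date` lines; that layout, the function name `netflix_to_csv`, and the decision to drop the date are assumptions for illustration, not something stated in the thread.

```python
def netflix_to_csv(lines):
    """Yield "userID,itemID,value" strings from Netflix-style rating lines.

    Hypothetical helper: expects "<movieID>:" header lines, each followed by
    "<customerID>,<rating>,<date>" lines belonging to that movie.
    """
    movie_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):
            # A header line starts the ratings for a new movie (the item).
            movie_id = line[:-1]
            continue
        if movie_id is None:
            raise ValueError("rating line found before any movie header")
        # Keep user and rating; the date is not part of the generic format.
        user_id, rating = line.split(",")[:2]
        yield "%s,%s,%s" % (user_id, movie_id, rating)


if __name__ == "__main__":
    sample = ["1:", "1488844,3,2005-09-06", "822109,5,2005-05-13"]
    for row in netflix_to_csv(sample):
        print(row)
```

Because the function takes any iterable of lines and yields results lazily, it can be pointed at an open file object without loading the whole dataset into memory.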

Re: Reg. Netflix Prize Apache Mahout GSoC Application

2010-04-04 Thread Sisir Koppaka
that would *be* utilize - sorry! I'll start off by implementing the distributed Netflix read-in, if that's OK by you.

Re: Reg. Netflix Prize Apache Mahout GSoC Application

2010-04-04 Thread Sisir Koppaka
Thanks, this is what I wanted to know. So, now, there would be a separate example that reads-in the Netflix dataset in a distributed way, that would be utilize the RBM implementation. Would that be right? The datastore I was referring to in the proposal was based on mahout.classifier.bayes.datasto

Re: Reg. Netflix Prize Apache Mahout GSoC Application

2010-04-04 Thread Sean Owen
Reusing code is fine, in principle. The code you mention, however, will not help you much. It is non-distributed and has nothing to do with Hadoop. You might reuse a bit of code to parse the input files, that's about it. Which data store are you referring to... if I understand right, you are imple

Re: Reg. Netflix Prize Apache Mahout GSoC Application

2010-04-04 Thread Sisir Koppaka
Thanks Robin, Ted, Jake and Sean for your feedback. I've refined my proposal, added in a milestone timeline, with design details, and have submitted it at the GSoC site. The title of the proposal is *Restricted Boltzmann Machines on the Netflix Dataset*. Please do give me your feedback o

Re: A request for prospective GSOC students

2010-04-03 Thread Lukáš Vlček
Hi, My proposal had the following subject: Mahout GSoC 2010 proposal: Association Mining It was missing a time schedule and further implementation details. I can work on those missing parts but I was rather expecting some general discussion about this topic first before I invest time in time

Re: GSoC - Implementing SOM

2010-04-03 Thread Ted Dunning
SOM would be a great addition. You need to start with a good proposal that describes what you would like to do, how you will know it works and when you think you can do it. There are several examples available from previous years. On Sat, Apr 3, 2010 at 6:04 AM, hifsa kazmi wrote: > Dear Mahou

Re: A request for prospective GSOC students

2010-04-03 Thread yinghua hu
uggestions and advice are very welcome. I am still allowed to make >> corrections to it before April 9th. >> >> Thank you! >> >> -- >> Regards, >> >> Yinghua >> >> >> On Sat, Apr 3, 2010 at 11:37 AM, Robin Anil wrote: >> > I

[GSOC] 2010 Timelines

2010-04-03 Thread Grant Ingersoll
http://socghop.appspot.com/document/show/gsoc_program/google/gsoc2010/faqs#timeline

Re: A request for prospective GSOC students

2010-04-03 Thread Robin Anil
ons and advice are very welcome. I am still allowed to make > corrections to it before April 9th. > > Thank you! > > -- > Regards, > > Yinghua > > > On Sat, Apr 3, 2010 at 11:37 AM, Robin Anil wrote: > > I am having a tough time separating Mahout proposals from rest

Re: A request for prospective GSOC students

2010-04-03 Thread yinghua hu
ll allowed to make corrections to it before April 9th. Thank you! -- Regards, Yinghua On Sat, Apr 3, 2010 at 11:37 AM, Robin Anil wrote: > I am having a tough time separating Mahout proposals from rest of Apache on > gsoc website. So I would request you all to reply to this thread when you

A request for prospective GSOC students

2010-04-03 Thread Robin Anil
I am having a tough time separating Mahout proposals from rest of Apache on gsoc website. So I would request you all to reply to this thread when you have submitted a proposal so that we don't miss out on reading your hard worked proposal. For now I could only find Zhao Zhendong's

GSoC - Implementing SOM

2010-04-03 Thread hifsa kazmi
Dear Mahout Developers, I am an undergraduate student, finishing my final year. For my final year project, I got to work on Hadoop MapReduce and HDFS; furthermore I also had to use clustering algorithms in Mahout on some of the datasets. One of my project mentors proposed to implement Self Organiz

Re: My ideas for GSoC 2010

2010-04-02 Thread Tanya Gupta
Hi There are a couple of questions I would like to ask. 1. What type of clustering would you like me to use? Is K-Means good enough? 2. Can you tell me more about the map reduce code that you would have written? Or do I first need to implement that as well using Hadoop? Thanking You Tanya On Th

Re: Reg. Netflix Prize Apache Mahout GSoC Application (SVD option)

2010-04-01 Thread Sean Owen
ainly big enough for a GSoC project, probably a bit too large. I'd be pleased to see someone do a quite thorough job with an SVD-based recommender, and perhaps along the way analyzing and optimizing the SVD impl itself, and documenting and testing well and so on. That's a nice project IMHO.

Re: Reg. Netflix Prize Apache Mahout GSoC Application (SVD option)

2010-04-01 Thread Richard Simon Just
Just looking for some clarification. As a GSoC project, would the SVD option mentioned below be a case of integrating the distributed SVD of MAHOUT-180 with the existing SVDRecommender? If so, is there still a full GSoC project there? Or would I need to combine it with say making the slope-one

Re: My ideas for GSoC 2010

2010-04-01 Thread Cristian Prodan
Thanks Robin, I will try to have a look at that. Cristi. On Thu, Apr 1, 2010 at 9:36 AM, Robin Anil wrote: > Why don't you try it on 20 newsgroups. There are about 17-18 unique topics > and a couple of overlapping ones. You can easily find issues with the > clustering code with that dataset. Once its

Re: My ideas for GSoC 2010

2010-03-31 Thread Robin Anil
Why don't you try it on 20 newsgroups? There are about 17-18 unique topics and a couple of overlapping ones. You can easily find issues with the clustering code with that dataset. Once it's done you can try bigger datasets like Wikipedia. Robin On Thu, Apr 1, 2010 at 12:02 PM, Cristian Prodan wrote:

Re: My ideas for GSoC 2010

2010-03-31 Thread Cristian Prodan
Hi, Can anyone please point me to a good data set on which I might try SimHash clustering? Thank you, Cristi On Tue, Mar 23, 2010 at 10:35 AM, cristi prodan wrote: > Hello again, > > First of all, thank you all for taking time to answer my ideas. Based on > your thoughts, I have been digging arou

Re: Application for GSOC 2010

2010-03-31 Thread Grant Ingersoll
On Mar 31, 2010, at 1:52 PM, Ted Dunning wrote: > File a JIRA issue with a detailed proposal of your project. The community > will help work out details for your proposal and it will eventually be rated > and possibly selected. Note, you also need to put your issue into the GSOC ap

Re: Application for GSOC 2010

2010-03-31 Thread Ted Dunning
, be reasonable about your schedule and the scope of your project, make sure that there are good ways to evaluate whether you have succeeded and put in good milestones. Keep in mind also that many good GSOC projects don't get selected for support. Mahout has had a strong policy in the past of he

Re: GSOC 2010

2010-03-31 Thread Robin Anil
Hi Tanya, MAHOUT-328 is just a general stub. There is no detailed project description other than what is given there. The idea is we let you propose to implement a clustering algorithm in Mahout. Start here http://cwiki.apache.org/MAHOUT/gsoc.html. Browse through the Wiki. Look at what

GSOC 2010

2010-03-31 Thread Tanya Gupta
Hi I would like a detailed project description for MAHOUT-328. Thanking You Tanya Gupta

Re: [GSOC] Wiki Page Added

2010-03-31 Thread zhao zhendong
Grant, > > > > Could you please give us the link of this page? > > > > Cheers, > > Zhendong > > > > On Wed, Mar 31, 2010 at 8:53 PM, Grant Ingersoll >wrote: > > > >> I created a Wiki page on GSOC. I hope everyone considering GSOC read

Re: [GSOC] Wiki Page Added

2010-03-31 Thread Grant Ingersoll
> On Wed, Mar 31, 2010 at 8:53 PM, Grant Ingersoll wrote: > >> I created a Wiki page on GSOC. I hope everyone considering GSOC reads it. >> Mentors, please add as you see fit. Would be good to get a Mahout FAQ >> going too. Perhaps, Robin, Deneche and David would consider a

Re: [GSOC] Wiki Page Added

2010-03-31 Thread zhao zhendong
Hi Grant, Could you please give us the link of this page? Cheers, Zhendong On Wed, Mar 31, 2010 at 8:53 PM, Grant Ingersoll wrote: > I created a Wiki page on GSOC. I hope everyone considering GSOC reads it. > Mentors, please add as you see fit. Would be good to get a Mahout FAQ >
