[jira] Commented: (MAHOUT-328) Implement a cool clustering algorithm on map/reduce

2010-03-31 Thread Robin Anil (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851742#action_12851742 ] Robin Anil commented on MAHOUT-328: --- Subscribe to the mahout dev mailing list

[jira] Commented: (MAHOUT-344) Minhash based clustering

2010-03-31 Thread Ankur (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851756#action_12851756 ] Ankur commented on MAHOUT-344: -- Drew, thanks for pitching in as I've been running super busy

[jira] Commented: (MAHOUT-353) java.lang.NullPointerException in RecommenderMapper

2010-03-31 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851765#action_12851765 ] Hui Wen Han commented on MAHOUT-353: I ran it again and got the following error:

[jira] Commented: (MAHOUT-353) java.lang.NullPointerException in RecommenderMapper

2010-03-31 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851781#action_12851781 ] Hui Wen Han commented on MAHOUT-353: I ran it a few times and got the following error:

[jira] Commented: (MAHOUT-353) java.lang.NullPointerException in RecommenderMapper

2010-03-31 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851789#action_12851789 ] Hui Wen Han commented on MAHOUT-353: org.apache.mahout.cf.taste.impl.common.Cache

[jira] Issue Comment Edited: (MAHOUT-353) java.lang.NullPointerException in RecommenderMapper

2010-03-31 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851789#action_12851789 ] Hui Wen Han edited comment on MAHOUT-353 at 3/31/10 9:45 AM: -

[jira] Issue Comment Edited: (MAHOUT-353) java.lang.NullPointerException in RecommenderMapper

2010-03-31 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851789#action_12851789 ] Hui Wen Han edited comment on MAHOUT-353 at 3/31/10 9:46 AM: -

[jira] Created: (MAHOUT-354) make the output of RecommenderJob more readable

2010-03-31 Thread Hui Wen Han (JIRA)
make the output of RecommenderJob more readable --- Key: MAHOUT-354 URL: https://issues.apache.org/jira/browse/MAHOUT-354 Project: Mahout Issue Type: Improvement Components:

[jira] Commented: (MAHOUT-354) make the output of RecommenderJob more readable

2010-03-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851813#action_12851813 ] Sean Owen commented on MAHOUT-354: -- I understand your point, though I'm reluctant to

[jira] Commented: (MAHOUT-353) java.lang.NullPointerException in RecommenderMapper

2010-03-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851815#action_12851815 ] Sean Owen commented on MAHOUT-353: -- I can change Cache to handle null values, but I think

[jira] Updated: (MAHOUT-354) make the output of RecommenderJob more readable

2010-03-31 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Wen Han updated MAHOUT-354: --- Attachment: screenshot-1.jpg the output like this make the output of RecommenderJob more readable

[jira] Commented: (MAHOUT-354) make the output of RecommenderJob more readable

2010-03-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851824#action_12851824 ] Sean Owen commented on MAHOUT-354: -- You're dumping the compressed file as raw bytes --

[jira] Commented: (MAHOUT-350) add one JobName and reduceNumber parameter to org.apache.mahout.cf.taste.hadoop.item.RecommenderJob

2010-03-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851827#action_12851827 ] Sean Owen commented on MAHOUT-350: -- If you don't mind, try this again? I changed

[jira] Commented: (MAHOUT-353) java.lang.NullPointerException in RecommenderMapper

2010-03-31 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851828#action_12851828 ] Hui Wen Han commented on MAHOUT-353: There is one situation: some users only have one

[jira] Issue Comment Edited: (MAHOUT-353) java.lang.NullPointerException in RecommenderMapper

2010-03-31 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851828#action_12851828 ] Hui Wen Han edited comment on MAHOUT-353 at 3/31/10 12:08 PM: --

[jira] Commented: (MAHOUT-354) make the output of RecommenderJob more readable

2010-03-31 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851833#action_12851833 ] Hui Wen Han commented on MAHOUT-354: I already commented out a line in the following file for testing.

[jira] Commented: (MAHOUT-350) add one JobName and reduceNumber parameter to org.apache.mahout.cf.taste.hadoop.item.RecommenderJob

2010-03-31 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851835#action_12851835 ] Hui Wen Han commented on MAHOUT-350: Many thanks :), I'll try it later. add one

[jira] Created: (MAHOUT-355) Misleading JavaDoc comment in FPGrowth

2010-03-31 Thread Sebastian Schelter (JIRA)
Misleading JavaDoc comment in FPGrowth -- Key: MAHOUT-355 URL: https://issues.apache.org/jira/browse/MAHOUT-355 Project: Mahout Issue Type: Bug Components: Frequent Itemset/Association Rule

Application for GSOC 2010

2010-03-31 Thread Tanya Gupta
Hi, I want to work on MAHOUT-328 for my GSOC 2010 project. How do I apply? Thanking you, Tanya

[jira] Issue Comment Edited: (MAHOUT-308) Improve Lanczos to handle extremely large feature sets (without hashing)

2010-03-31 Thread Danny Leshem (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851843#action_12851843 ] Danny Leshem edited comment on MAHOUT-308 at 3/31/10 12:39 PM:

[jira] Commented: (MAHOUT-308) Improve Lanczos to handle extremely large feature sets (without hashing)

2010-03-31 Thread Danny Leshem (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851843#action_12851843 ] Danny Leshem commented on MAHOUT-308: - Currently a blocker for me, with

Re: [GSOC] Wiki Page Added

2010-03-31 Thread zhao zhendong
Hi Grant, Could you please give us the link of this page? Cheers, Zhendong On Wed, Mar 31, 2010 at 8:53 PM, Grant Ingersoll gsing...@apache.org wrote: I created a Wiki page on GSOC. I hope everyone considering GSOC reads it. Mentors, please add as you see fit. Would be good to get a Mahout

Re: [GSOC] Wiki Page Added

2010-03-31 Thread Grant Ingersoll
D'oh! My bad: http://cwiki.apache.org/MAHOUT/gsoc.html. It's linked from the front wiki page under community. -Grant On Mar 31, 2010, at 9:11 AM, zhao zhendong wrote: Hi Grant, Could you please give us the link of this page? Cheers, Zhendong On Wed, Mar 31, 2010 at 8:53 PM, Grant

[jira] Commented: (MAHOUT-354) make the output of RecommenderJob more readable

2010-03-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851866#action_12851866 ] Sean Owen commented on MAHOUT-354: -- Hm, there's no reason it should not work with a text

Re: [GSOC] Wiki Page Added

2010-03-31 Thread zhao zhendong
Ha, thanks. On Wed, Mar 31, 2010 at 9:29 PM, Grant Ingersoll gsing...@apache.org wrote: D'oh! My bad: http://cwiki.apache.org/MAHOUT/gsoc.html. It's linked from the front wiki page under community. -Grant On Mar 31, 2010, at 9:11 AM, zhao zhendong wrote: Hi Grant, Could you please

[jira] Commented: (MAHOUT-354) make the output of RecommenderJob more readable

2010-03-31 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851900#action_12851900 ] Hui Wen Han commented on MAHOUT-354: I tried it again make the output of RecommenderJob

GSOC 2010

2010-03-31 Thread Tanya Gupta
Hi, I would like a detailed project description for MAHOUT-328. Thanking you, Tanya Gupta

Re: GSOC 2010

2010-03-31 Thread Robin Anil
Hi Tanya, MAHOUT-328 is just a general stub. There is no detailed project description other than what is given there. The idea is we let you propose to implement a clustering algorithm in Mahout. Start here http://cwiki.apache.org/MAHOUT/gsoc.html. Browse through the Wiki. Look at

[jira] Issue Comment Edited: (MAHOUT-354) make the output of RecommenderJob more readable

2010-03-31 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851900#action_12851900 ] Hui Wen Han edited comment on MAHOUT-354 at 3/31/10 3:38 PM: - I 

[jira] Commented: (MAHOUT-350) add one JobName and reduceNumber parameter to org.apache.mahout.cf.taste.hadoop.item.RecommenderJob

2010-03-31 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851912#action_12851912 ] Hui Wen Han commented on MAHOUT-350: I get the following error: 10/03/31 10:34:12 WARN

question about implementing the k-median clustering algorithm based on Map Reduce

2010-03-31 Thread Guohua Hao
Hello All, Sorry for the cross posting of this question. I would like to implement the k-median clustering algorithm using map reduce. The k-median clustering algorithm mentioned here is very close to the k-means, except that the centroid of a cluster in the k-median algorithm is the median of
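The preview above is cut off before it says what the median is taken over. As a rough illustration only, the sketch below assumes the common coordinate-wise reading of k-median and compares it with the k-means centroid for a single cluster. The function name `centroid` and the sample points are hypothetical and are not taken from the thread or from Mahout.

```python
from statistics import mean, median


def centroid(points, reducer):
    """Apply `reducer` to each coordinate of the points in one cluster.

    Passing `mean` gives the k-means centroid; passing `median` gives a
    coordinate-wise median, which is an assumed reading of k-median.
    """
    return [reducer(column) for column in zip(*points)]


# Hypothetical cluster with one outlying point.
cluster = [[1.0, 10.0], [2.0, 20.0], [9.0, 60.0]]

mean_center = centroid(cluster, mean)      # pulled towards the outlier
median_center = centroid(cluster, median)  # stays at the middle point
```

With these sample points the mean moves towards the outlier while the median does not, which is the usual motivation for preferring a median-based centroid.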

[jira] Issue Comment Edited: (MAHOUT-350) add one JobName and reduceNumber parameter to org.apache.mahout.cf.taste.hadoop.item.RecommenderJob

2010-03-31 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851912#action_12851912 ] Hui Wen Han edited comment on MAHOUT-350 at 3/31/10 3:43 PM: -

[jira] Issue Comment Edited: (MAHOUT-350) add one JobName and reduceNumber parameter to org.apache.mahout.cf.taste.hadoop.item.RecommenderJob

2010-03-31 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851912#action_12851912 ] Hui Wen Han edited comment on MAHOUT-350 at 3/31/10 3:57 PM: -

[jira] Commented: (MAHOUT-350) add one JobName and reduceNumber parameter to org.apache.mahout.cf.taste.hadoop.item.RecommenderJob

2010-03-31 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851920#action_12851920 ] Hui Wen Han commented on MAHOUT-350: It works now. Need

[jira] Commented: (MAHOUT-350) add one JobName and reduceNumber parameter to org.apache.mahout.cf.taste.hadoop.item.RecommenderJob

2010-03-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851927#action_12851927 ] Sean Owen commented on MAHOUT-350: -- I'll have to look at the Hadoop source code. I thought

[jira] Commented: (MAHOUT-350) add one JobName and reduceNumber parameter to org.apache.mahout.cf.taste.hadoop.item.RecommenderJob

2010-03-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851931#action_12851931 ] Sean Owen commented on MAHOUT-350: -- Hmm, I don't understand that. AbstractJob.class and

[jira] Commented: (MAHOUT-354) make the output of RecommenderJob more readable

2010-03-31 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851959#action_12851959 ] Hui Wen Han commented on MAHOUT-354: I ran the test several times. It also cannot

Re: [jira] Created: (MAHOUT-355) Misleading JavaDoc comment in FPGrowth

2010-03-31 Thread Ted Dunning
Sebastian, can you post a patch with suggested language? On Wed, Mar 31, 2010 at 5:18 AM, Sebastian Schelter (JIRA) j...@apache.org wrote: Misleading JavaDoc comment in FPGrowth -- Key: MAHOUT-355 URL:

Re: Application for GSOC 2010

2010-03-31 Thread Ted Dunning
File a JIRA issue with a detailed proposal of your project. The community will help work out details for your proposal and it will eventually be rated and possibly selected. Make sure you follow the guidelines for a proposal. In particular, describe the problem and your proposed work clearly,

Re: question about implementing the k-median clustering algorithm based on Map Reduce

2010-03-31 Thread Ted Dunning
The problem is that k-median (and more generally, k-medoid as well) lacks what are called sufficient statistics. Informally, sufficient statistics are a summary of the data that you have seen so far that allows you to process new data and compute the value you like. For the mean and variance,
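The preview above is truncated, so what follows is not a reconstruction of the rest of the message. It is a minimal sketch of the general idea, assuming the usual choice of sufficient statistics for the mean and variance: the count, the sum, and the sum of squares. The class `Moments` and the helper `summarize` are hypothetical names, not Mahout or Hadoop API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Moments:
    """Fixed-size summary of a stream: count, sum, and sum of squares."""
    n: int = 0
    s1: float = 0.0
    s2: float = 0.0

    def add(self, x):
        # Fold one new value into the summary without keeping the value.
        return Moments(self.n + 1, self.s1 + x, self.s2 + x * x)

    def merge(self, other):
        # Summaries from different partitions combine by plain addition,
        # which is what a combiner or reducer would need.
        return Moments(self.n + other.n, self.s1 + other.s1, self.s2 + other.s2)

    def mean(self):
        return self.s1 / self.n

    def variance(self):
        # Population variance, E[x^2] - (E[x])^2.
        m = self.mean()
        return self.s2 / self.n - m * m


def summarize(values):
    """Build a summary from an iterable of numbers."""
    acc = Moments()
    for x in values:
        acc = acc.add(x)
    return acc


# Two hypothetical partitions summarized independently, then merged.
part_a = summarize([1.0, 2.0, 3.0])
part_b = summarize([4.0, 5.0])
combined = part_a.merge(part_b)
```

The merged summary gives the same mean and variance as a single pass over all five values. The median has no comparable fixed-size summary: the medians of the two partitions here are 2.0 and 4.5, and those two numbers alone do not determine the median of the combined data.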

[jira] Commented: (MAHOUT-350) add one JobName and reduceNumber parameter to org.apache.mahout.cf.taste.hadoop.item.RecommenderJob

2010-03-31 Thread Drew Farris (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12852012#action_12852012 ] Drew Farris commented on MAHOUT-350: Not sure if this is helpful Sean,

Re: Application for GSOC 2010

2010-03-31 Thread Grant Ingersoll
On Mar 31, 2010, at 1:52 PM, Ted Dunning wrote: File a JIRA issue with a detailed proposal of your project. The community will help work out details for your proposal and it will eventually be rated and possibly selected. Note, you also need to put your issue into the GSOC application. I

[jira] Commented: (MAHOUT-350) add one JobName and reduceNumber parameter to org.apache.mahout.cf.taste.hadoop.item.RecommenderJob

2010-03-31 Thread Drew Farris (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12852224#action_12852224 ] Drew Farris commented on MAHOUT-350: {quote} Hmm, I don't understand that.