date:20111109

Re: Collaborative filtering help needed

2011-11-09 Thread Akshay Jain

@Sean, I am just testing with a small dataset. I have some large datasets which I am planning to use on Hadoop. Thanks. Akshay On Wed, Nov 9, 2011 at 12:49 PM, Sean Owen sro...@gmail.com wrote: @Steven this is in the distributed part. There is no such method. But Akshay if your data is not

NewsKMeansClustering - the result most people want seems to be missing

2011-11-09 Thread Rob Podolski

Hi Managed to get the Manning Chap 09 example NewsKMeansClustering working with my own documents. However, I thought the main point of this was to cluster the news articles together to get groups of similar content. The example allows you to get the cluster membership in terms of

meanshift clustering

2011-11-09 Thread gaurav redkar

Hi, I am unable to identify where the clusterPoints() function in the MeanShiftCanopyClusterer.java file is called during execution of the mean shift job. What I need to know is where the files in the clusteredPoints and clusters-* directories are written when we run the job on Hadoop.

Re: Comparing results of Mahout SVD and Scilab

2011-11-09 Thread Alfredo Motta

Thank you for your clarifications, now it is clear 2011/11/8 Jake Mannix jake.man...@gmail.com The output from the LanczosSolver is not the final set of results. The fact that you passed --cleansvd true to the system means that you want it to do some cleanup and remove any spurious singular

new posting about (machine learning) mapreduce algorithms

2011-11-09 Thread Amund Tveit

Perhaps of interest: http://atbrox.com/2011/11/09/mapreduce-hadoop-algorithms-in-academic-papers-5th-update-%E2%80%93-nov-2011/ Best regards, Amund

AdaptiveLogisticRegression

2011-11-09 Thread Koert Kuipers

To train the AdaptiveLogisticRegression, do I need to feed in the training data only once? Or is iterating over the training data helpful here as well? Thanks! Koert

Issues with running Mahout LDA over the Reuters data set (Mahout in Action)

2011-11-09 Thread Varnit Khanna

Hi, I am trying to run Mahout LDA over the Reuters data set as described in Mahout in Action; however, I always get only 1 topic returned. I am running Mahout 0.5 and here are my steps: $ mvn -e -q exec:java -Dexec.mainClass=org.apache.lucene.benchmark.utils.ExtractReuters

Re: SGD TrainNewsGroups interim output

2011-11-09 Thread Grant Ingersoll

Cool, how about adding it to the Wiki? On Nov 9, 2011, at 8:15 AM, Suneel Marthi wrote: I can put together a doc if we don't already have one, know the SGD code pretty well. Regards, Suneel From: Grant Ingersoll grant.ingers...@gmail.com To:

Re: SGD TrainNewsGroups interim output

2011-11-09 Thread Suneel Marthi

Will do. From: Grant Ingersoll gsing...@apache.org To: user@mahout.apache.org; Suneel Marthi suneel_mar...@yahoo.com Sent: Wednesday, November 9, 2011 10:02 AM Subject: Re: SGD TrainNewsGroups interim output Cool, how about adding it to the Wiki? On Nov 9,

Re: Running Mahout SVD on Amazon Elastic Map Reduce

2011-11-09 Thread Ted Dunning

This looks like a hard-coded hdfs prefix in a path name construction somewhere. On Wed, Nov 9, 2011 at 8:27 AM, motta motta@gmail.com wrote: Hi everybody, I have tried to run my first Mahout SVD Job (DistributedLanczosSolver) in Elastic Map Reduce. Before going to Amazon I've tried to

Re: NewsKMeansClustering - the result most people want seems to be missing

2011-11-09 Thread Grant Ingersoll

On Nov 9, 2011, at 3:17 AM, Rob Podolski wrote: Hi Managed to get the Manning Chap 09 example NewsKMeansClustering working with my own documents. However, I thought the main point of this was to cluster the news articles together to get groups of similar content. The example

User based CF

2011-11-09 Thread WangRamon

Hi all, does Mahout provide a user-based CF implementation on Hadoop? Currently I only see an item-based Hadoop implementation. Thanks. Cheers, Ramon

Re: User based CF

2011-11-09 Thread Sebastian Schelter

There is no such implementation. Literature suggests that an item-based approach is usually both faster and more accurate. --sebastian On 10.11.2011 08:34, WangRamon wrote: Hi all, does Mahout provide a user-based CF implementation on Hadoop? Currently I only see an item based hadoop
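For readers new to the distinction discussed in this thread, here is a minimal sketch of the item-based idea: score unseen items by how often they co-occur with items the user already has. This is an illustration only and not Mahout's Hadoop implementation; the class name, method names, and toy data are all hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration class; not part of Mahout.
final class ItemCooccurrenceSketch {

    // For each ordered pair of distinct items, count how many users have both.
    public static Map<String, Map<String, Integer>> cooccurrence(Map<String, Set<String>> itemsByUser) {
        Map<String, Map<String, Integer>> counts = new TreeMap<>();
        for (Set<String> items : itemsByUser.values()) {
            for (String a : items) {
                for (String b : items) {
                    if (!a.equals(b)) {
                        counts.computeIfAbsent(a, k -> new TreeMap<>()).merge(b, 1, Integer::sum);
                    }
                }
            }
        }
        return counts;
    }

    // Score each item the user has not seen by summing its co-occurrence
    // counts with the items the user already has.
    public static Map<String, Integer> recommend(String user, Map<String, Set<String>> itemsByUser) {
        Map<String, Map<String, Integer>> counts = cooccurrence(itemsByUser);
        Set<String> seen = itemsByUser.getOrDefault(user, Collections.emptySet());
        Map<String, Integer> scores = new TreeMap<>();
        for (String item : seen) {
            Map<String, Integer> row = counts.getOrDefault(item, Collections.emptyMap());
            for (Map.Entry<String, Integer> e : row.entrySet()) {
                if (!seen.contains(e.getKey())) {
                    scores.merge(e.getKey(), e.getValue(), Integer::sum);
                }
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        // Toy preference data (made up for this sketch).
        Map<String, Set<String>> data = new HashMap<>();
        data.put("u1", new HashSet<>(Arrays.asList("A", "B")));
        data.put("u2", new HashSet<>(Arrays.asList("A", "B", "C")));
        data.put("u3", new HashSet<>(Arrays.asList("B", "C")));
        System.out.println(recommend("u1", data));
    }
}
```

The item-to-item counts depend only on which items appear together, so they can be computed once and reused for every user, which is one common argument for the item-based approach.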

Re: NewsKMeansClustering - the result most people want seems to be missing

2011-11-09 Thread Rob Podolski

Many thanks. Actually I delved into the source code and found out that if you set the (undocumented) namedVector boolean to true in... DictionaryVectorizer.createTermFrequencyVectors( tokenizedPath, new Path(OUTPUT_HFS_FOLDER), conf,

RE: User based CF

2011-11-09 Thread WangRamon

Thanks Sebastian. Can I assume that if there are more items than users, item-based CF will be slow? Date: Thu, 10 Nov 2011 08:43:53 +0100 From: s...@apache.org To: user@mahout.apache.org Subject: Re: User based CF There is no such implementation. Literature suggests that an item-based

Re: Running Mahout SVD on Amazon Elastic Map Reduce

2011-11-09 Thread Alfredo Motta

I didn't hard-code any hdfs prefix; I just used mahout-examples-0.5-job.jar (downloaded from the Mahout website) to run DistributedLanczosSolver. The output suggests that the jar invoked FileSystem.get(conf) instead of FileSystem.get(uri, conf) to get my input matrix. Is that possible? 2011/11/10
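The difference Alfredo suspects can be modeled without Hadoop on the classpath. The sketch below is a simplified stand-in, not Hadoop code: it assumes that FileSystem.get(conf) resolves to the configured default filesystem regardless of the path, while FileSystem.get(uri, conf) honors the scheme in the URI and falls back to the default only when the URI has none. The class name, method names, and example URIs are hypothetical.

```java
import java.net.URI;

// Hypothetical model of filesystem resolution; it does not call Hadoop.
final class FileSystemResolutionSketch {

    // Models FileSystem.get(conf): the configured default wins,
    // whatever scheme the input path carries.
    public static String schemeFromDefault(URI defaultFs, URI path) {
        return defaultFs.getScheme();
    }

    // Models FileSystem.get(uri, conf): the path's own scheme wins;
    // a path without a scheme falls back to the default.
    public static String schemeFromPath(URI defaultFs, URI path) {
        return path.getScheme() != null ? path.getScheme() : defaultFs.getScheme();
    }

    public static void main(String[] args) {
        // Made-up URIs for illustration only.
        URI defaultFs = URI.create("hdfs://namenode:9000");
        URI input = URI.create("s3n://my-bucket/input/matrix");

        System.out.println(schemeFromDefault(defaultFs, input));
        System.out.println(schemeFromPath(defaultFs, input));
    }
}
```

Under this model, an input path on a non-default filesystem is looked up on the default one when the first form is used, which would be consistent with the kind of path error discussed in this thread.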


16 matches
