Pseudo-Inverse map reduce implementation

2012-10-17 Thread Ranjith Uthaman
Hi, Does a MapReduce implementation of the pseudo-inverse of a matrix exist in the current Mahout framework? What are the various ways to achieve it? Thanks & Regards, RANJITH P UTHAMAN

mahout 0.5 to 0.7 commandline parameter of lda

2012-10-17 Thread vineeth
Hello, I am following this website, http://theglassicon.com/computing/machine-learning/running-lda-algorithm-mahout (Mahout 0.5). It gives the complete procedure for getting the probabilities of words and topics using LDA. However, these steps do not work on Mahout 0.7. Can someone give an up

Documentation for ParallelALSFactorizationJob

2012-10-17 Thread Kris Jack
Hi all, I'm giving one of the distributed matrix factorisation implementations (code org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob) a try and have a few basic questions. I can't find much documentation about how to run it so can someone please point me in the right direction?

Re: Using model of mahout 0.7

2012-10-17 Thread paritosh ranjan
Your first confusion matrix looks too good to be true, which suggests there may be a target leak or some other problem in the model. I wanted to suggest a ModelDissector that you could use to analyze the NaiveBayes model; however, I just learned that the ModelDissector in Mahout does no

Re: SGD: Logistic regression package in Mahout

2012-10-17 Thread Rajesh Nikam
Hello Ted, Thanks for investigating this. I look forward to further analysis and a fix in SGD. I appreciate your efforts in looking into it. Thanks, Rajesh On Tue, Oct 16, 2012 at 10:23 PM, Ted Dunning wrote: > Rajesh, > > In the testing that I did, I ran 100, 1000 and 10,000 passes

Re: Using model of mahout 0.7

2012-10-17 Thread Priyadarshan Raj
Hi paritosh, As you suggested, I ran seq2sparse with these arguments: bin/mahout seq2sparse -i ${user-dir}/fact-seq -o ${user-dir}/fact-vectors -lnorm -nv -wt tfidf --maxDFSigma 3.0 --maxDFPercent 100 --minSupport 5 but I am still getting the same result. As suggested by you to use -analyzerNam

Re: Mahout 0.7 API Naive Bayes

2012-10-17 Thread Thomas Quenolle
Hi, I also struggled with this. Here is some code from my classifier; I hope you find it helpful. The key point is to use the dictionary you created in your process to train your model, which I modified to get the total number of docs. private void loadTermDictionary(InputStream is) throws IOEx