RE: K-means: No input clusters found

2013-12-24 Thread sstilak
Hi Suneel, Thanks for your help. Original message From: Suneel Marthi Date: 12/24/2013 12:34 PM (GMT-08:00) To: user@mahout.apache.org Subject: Re: K-means: No input clusters found kmeans-init-clusters should be in

Re: SVM in Mahout

2013-12-24 Thread Ted Dunning
Logistic regression with L1 regularization is generally at least as good as SVM. The problem with SVM is that it uses radially symmetric regularization which doesn't learn sparse solutions very well. L1 regularization is much better for that. On Tue, Dec 24, 2013 at 10:06 AM, Steven Bourke wro
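
To illustrate the point about L1 regularization and sparse solutions, here is a minimal sketch in plain Python. It is not Mahout code: the function names, the toy data, and the choice of proximal gradient descent (soft thresholding) are all assumptions made for illustration. The second feature carries no information about the label, and the L1 penalty keeps its weight at exactly zero.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    # Proximal operator of the L1 penalty: shrink toward zero, clip at zero.
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(rows, labels, l1_penalty=0.1, step=0.5, iterations=500):
    # Hypothetical helper: batch proximal gradient descent on the mean
    # logistic loss plus l1_penalty * sum(|w|). The bias is not penalized.
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(iterations):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            grad_b += err
            for j, v in enumerate(x):
                grad_w[j] += err * v
        bias -= step * grad_b / n
        weights = [
            soft_threshold(w - step * g / n, step * l1_penalty)
            for w, g in zip(weights, grad_w)
        ]
    return weights, bias


# Toy data (made up): feature 0 decides the label, feature 1 is noise.
rows = [[1.0, 1.0], [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
labels = [1, 1, 0, 0]
weights, bias = fit_l1_logistic(rows, labels)
print(weights, bias)
```

With a radially symmetric (L2) penalty the uninformative weight would typically be shrunk but not forced to exactly zero; the soft-threshold step is what produces exact zeros here.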

Re: K-means: No input clusters found

2013-12-24 Thread Suneel Marthi
kmeans-init-clusters should be in a file with a name like 'part-' and not the way you have it (kmeans-init-clusters). On Tuesday, December 24, 2013 2:15 PM, Sameer Tilak wrote: Hi all, I get the following problem when I run k-means clustering on my real data. Any help with this would
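
A quick way to check the naming advice above before launching a job is to list the files in the clusters directory whose names start with 'part-'. The sketch below is a hypothetical helper (`find_part_files` is not a Mahout function) and assumes the directory is on the local filesystem; for data on HDFS the same check would have to go through the Hadoop tooling instead.

```python
from pathlib import Path


def find_part_files(clusters_dir):
    """Return the files directly under clusters_dir whose names start with 'part-'.

    Hypothetical helper for illustration only. An empty result suggests the
    initial clusters are not stored under the expected file name.
    """
    directory = Path(clusters_dir)
    if not directory.is_dir():
        return []
    return sorted(
        path
        for path in directory.iterdir()
        if path.is_file() and path.name.startswith("part-")
    )
```

If the call returns an empty list for the directory passed as the initial clusters input, that matches the "No input clusters found" symptom described in this thread.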

K-means: No input clusters found

2013-12-24 Thread Sameer Tilak
Hi all, I get the following problem when I run k-means clustering on my real data. Any help with this would be great! Here is data that I read out of the SequenceFile: 022960 value: 022960:{269830:1.0,2042:1.0,145659:1.0,143547:1.0,219265:1.0,321251:1.0,202350:1.0,258610:1.0,239068:1.0,2591

Re: SVM in Mahout

2013-12-24 Thread Steven Bourke
Just test out libsvm against logistic regression on a sample of your data to get an understanding of the upside and downside for your particular problem. > On 24 Dec 2013, at 15:55, Tharindu Rusira wrote: > > Thanks all for the words of wisdom :) , > > @Ted, I'm coming from a text mini

Re: SVM in Mahout

2013-12-24 Thread Tharindu Rusira
Thanks all for the words of wisdom :) @Ted, I'm coming from a text mining background. Many textbooks recommend SVM because of its impressive performance with vectors having a larger cardinality, which is the usual case when dealing with text documents. Do you think logistic regression would perf

Re: SVM in Mahout

2013-12-24 Thread unmesha sreeveni
You can parallelize SVM using the same equations (with slight differences) explained in http://books.google.co.in/books/about/DATA_MINING.html?id=IYc2muhCbmEC&redir_esc=y but I can't guarantee the performance. For about 100 MB of data, it takes 10 minutes to train. On Tue, Dec 24, 2013 at 3:30

Re: SVM in Mahout

2013-12-24 Thread tuku
Someone tried to implement SVM in a Google Summer of Code project, but it turned out that a MapReduce version of SVM is too difficult to implement, and they dropped the project. I bet you can train via libsvm and use just the classification part with MapReduce, but if I have a choice I prefer logistic regression too ~--
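
The "train offline, classify in MapReduce" idea works because scoring each record is independent of every other record, so it fits a map-only job. A rough sketch in plain Python, under loud assumptions: the model is linear (a weight per feature index plus a bias, as a linear-kernel SVM or logistic regression would give), and `decision_value` and `map_classify` are hypothetical names, not libsvm or Mahout APIs.

```python
def decision_value(weights, bias, features):
    # Sparse dot product: features maps feature index -> value,
    # weights maps feature index -> learned weight (missing means 0).
    return bias + sum(weights.get(index, 0.0) * value
                      for index, value in features.items())


def map_classify(records, weights, bias):
    # Stand-in for the map phase: each (key, features) record is scored
    # on its own, so the records can be split across any number of mappers.
    for key, features in records:
        label = 1 if decision_value(weights, bias, features) >= 0.0 else -1
        yield key, label
```

A kernel SVM would need the support vectors shipped to every mapper instead of a single weight vector, which is one reason a linear model is the simpler fit for this pattern.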

Re: SVM in Mahout

2013-12-24 Thread Ted Dunning
You might try logistic regression with regularization for a very similar result. On Mon, Dec 23, 2013 at 11:57 PM, Sebastian Schelter < ssc.o...@googlemail.com> wrote: > Hi Tharindu, > > There is no SVM implementation in an official release. > > --sebastian > > On 24.12.2013 08:02, Tharindu Rusi