Re: Mahout 0.9 Release

2014-01-31 Thread Tharindu Rusira
On Wed, Jan 29, 2014 at 6:15 AM, Suneel Marthi wrote: > Fixed the issues that were reported with Clustering code this past week, > upgraded codebase to Lucene 4.6.1 that was released today. > > Here's the URL for the 0.9 release in staging:- > > https://repository.apache.org/content/repositories/o

Re: Mahout 0.9 Release

2014-01-31 Thread Suneel Marthi
Thanks Ted and Dmitriy. The release can officially pass now. I need help with pushing the artifacts to the mirrors; I'm not sure how to go about it. The ASF documentation at https://www.apache.org/dev/release-publishing.html#signed says to use svnpubsub, but it seems like Mahout's not set up for that a

Re: Mahout 0.9 Release

2014-01-31 Thread Ted Dunning
+1 I checked sigs, compiled the code, ran tests. Platform is MacBook, OSX 10.7.5 JDK 1.7.0_11-b21 On Fri, Jan 31, 2014 at 5:32 PM, Suneel Marthi wrote: > Thanks Dmitriy. That makes it +2 > > Sent from my iPhone > > > On Jan 31, 2014, at 8:13 PM, Dmitriy Lyubimov wrote: > > > > +1. > > >

Re: Mahout 0.9 Release

2014-01-31 Thread Suneel Marthi
Thanks Dmitriy. That makes it +2 Sent from my iPhone > On Jan 31, 2014, at 8:13 PM, Dmitriy Lyubimov wrote: > > +1. > > Some specific parts I am concerned about look good. > > -d > > >> On Tue, Jan 28, 2014 at 4:45 PM, Suneel Marthi >> wrote: >> Fixed the issues that were reported with

Re: Mahout 0.9 Release

2014-01-31 Thread Dmitriy Lyubimov
+1. Some specific parts I am concerned about look good. -d On Tue, Jan 28, 2014 at 4:45 PM, Suneel Marthi wrote: > Fixed the issues that were reported with Clustering code this past week, > upgraded codebase to Lucene 4.6.1 that was released today. > > Here's the URL for the 0.9 release in sta

RE: Using Mahout to cluster a large CSV file

2014-01-31 Thread Allen, Ronald L.
Thank you for the response! I will try this out and let you know how it goes! From: Suneel Marthi [suneel_mar...@yahoo.com] Sent: Friday, January 31, 2014 8:17 AM To: user@mahout.apache.org Subject: Re: Using Mahout to cluster a large CSV file Use Mahout's

Re: Using Mahout to cluster a large CSV file

2014-01-31 Thread Bertrand Dechoux
I guess the big (no pun intended) question is what is your definition of a large CSV. Bertrand On Fri, Jan 31, 2014 at 2:17 PM, Suneel Marthi wrote: > Use Mahout's CSVVectorIterator.java to read your input CSV file and generate > vectors. > > You pass in a java.io.Reader to your CSV file and it g

Re: Using Mahout to cluster a large CSV file

2014-01-31 Thread Suneel Marthi
Use Mahout's CSVVectorIterator.java to read your input CSV file and generate vectors. You pass in a java.io.Reader to your CSV file and it generates Dense Vectors (from CSV). You could then feed the generated vectors into KMeans clustering. On Friday, January 31, 2014 7:55 AM, "Allen, Ronald L.
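
The flow described in this reply (read a CSV through a java.io.Reader, turn each row into a dense vector, then cluster the vectors with k-means) can be sketched in plain Java. This is only an illustration under stated assumptions: it does not use Mahout's CSVVectorIterator or its KMeans driver, the class and method names (CsvKMeansSketch, readVectors, kMeans) are hypothetical, the CSV is assumed to be purely numeric with no header row, and the first k rows are used as initial centroids for simplicity.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the "CSV -> dense vectors -> k-means" flow.
// It does NOT call any Mahout API.
final class CsvKMeansSketch {

    // Parse every non-empty CSV line into a dense vector of doubles.
    static List<double[]> readVectors(Reader reader) {
        List<double[]> vectors = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(reader)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue;
                }
                String[] fields = line.split(",");
                double[] v = new double[fields.length];
                for (int i = 0; i < fields.length; i++) {
                    v[i] = Double.parseDouble(fields[i].trim());
                }
                vectors.add(v);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return vectors;
    }

    // Squared Euclidean distance between two vectors of equal length.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Lloyd's algorithm with the first k points as initial centroids.
    // Returns the cluster index assigned to each point.
    static int[] kMeans(List<double[]> points, int k, int iterations) {
        int dim = points.get(0).length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points.get(c).clone();
        }
        int[] assignment = new int[points.size()];
        for (int it = 0; it < iterations; it++) {
            // Assignment step: pick the nearest centroid for each point.
            for (int p = 0; p < points.size(); p++) {
                int best = 0;
                for (int c = 1; c < k; c++) {
                    if (squaredDistance(points.get(p), centroids[c])
                            < squaredDistance(points.get(p), centroids[best])) {
                        best = c;
                    }
                }
                assignment[p] = best;
            }
            // Update step: move each centroid to the mean of its points.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int p = 0; p < points.size(); p++) {
                counts[assignment[p]]++;
                for (int d = 0; d < dim; d++) {
                    sums[assignment[p]][d] += points.get(p)[d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] > 0) {
                    for (int d = 0; d < dim; d++) {
                        centroids[c][d] = sums[c][d] / counts[c];
                    }
                }
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        String csv = "1.0,1.0\n9.0,9.0\n1.2,0.8\n8.8,9.1\n";
        List<double[]> points = readVectors(new StringReader(csv));
        System.out.println(Arrays.toString(kMeans(points, 2, 10)));
        // prints [0, 1, 0, 1]
    }
}
```

Note that this sketch holds every row in memory, so it only suits files that fit on one machine; whether that holds depends on what "large" means here, which is the question raised elsewhere in this thread.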

Using Mahout to cluster a large CSV file

2014-01-31 Thread Allen, Ronald L.
Hi all, Has anyone had any success using Mahout kmeans to cluster data in a single large CSV file? If so, how did you do it? Thanks, Ronnie

How to implement "distance(double centroidLengthSquared, Vector centroid, Vector v)"

2014-01-31 Thread Rob Podolski
Hi I'm wondering what the exact contract of this method is.  (Saw http://comments.gmane.org/gmane.comp.apache.mahout.user/10116) Is it supposed to calculate the distance between a cluster centroid and vector v?  In which case, for non-sparse vectors can it be implemented by calling the other d
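
One plausible reading of the three-argument signature, assumed here and not confirmed anywhere in this thread, is that centroidLengthSquared is the precomputed squared norm of the centroid, so that a squared-Euclidean measure can use the identity |c - v|^2 = |c|^2 - 2(c . v) + |v|^2 instead of recomputing |c|^2 for every vector. The sketch below uses plain double arrays rather than Mahout's Vector type, and the class and method names are hypothetical.

```java
// Hypothetical illustration of a "precomputed centroid norm" distance.
// Plain arrays stand in for Mahout's Vector; no Mahout API is used.
final class CentroidDistanceSketch {

    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Direct squared Euclidean distance, the two-argument form.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Three-argument form: reuses the caller-supplied |centroid|^2.
    // Assumes centroidLengthSquared == dot(centroid, centroid).
    static double squaredDistance(double centroidLengthSquared,
                                  double[] centroid, double[] v) {
        return centroidLengthSquared - 2.0 * dot(centroid, v) + dot(v, v);
    }

    public static void main(String[] args) {
        double[] centroid = {1.0, 2.0};
        double[] v = {4.0, 6.0};
        double centroidLengthSquared = dot(centroid, centroid); // 5.0

        System.out.println(squaredDistance(centroid, v));                        // prints 25.0
        System.out.println(squaredDistance(centroidLengthSquared, centroid, v)); // prints 25.0
    }
}
```

Under that reading, simply delegating to the two-argument distance and ignoring the first parameter would give the same result for dense vectors; the extra parameter would only save the repeated norm computation. For a measure that is not squared Euclidean, the identity above does not apply directly.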