Re: Ask

Paritosh Ranjan Mon, 16 Apr 2012 08:29:05 -0700

The simplest way would be:
1) Create a Maven project.
2) Add mahout-core as a dependency.
3) Use KMeansDriver's run method.

If you set the parameter runSequential=true, you don't even need a Hadoop cluster, but you will not be able to cluster really large datasets.

So try it out with a smaller number of records (vectors) first, then move to a Hadoop cluster (by setting runSequential=false and setting up HADOOP_HOME).
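
To get a feel for what a sequential, in-memory k-means pass does on a handful of vectors before wiring in Mahout, here is a minimal plain-Java sketch. It is not Mahout code and uses no Mahout API: the class TinyKMeans and its methods are hypothetical names made up for illustration, it seeds the centroids with the first k points (a simplification), and it assumes k is no larger than the number of points.

```java
import java.util.Arrays;

/** Plain-Java k-means sketch. Hypothetical helper, not part of Mahout. */
final class TinyKMeans {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Index of the centroid closest to the given point. */
    static int nearest(double[] point, double[][] centroids) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (squaredDistance(point, centroids[c]) < squaredDistance(point, centroids[best])) {
                best = c;
            }
        }
        return best;
    }

    /**
     * Runs at most maxIterations assign/update rounds and returns the
     * cluster index of each point. Assumes k <= points.length.
     */
    static int[] cluster(double[][] points, int k, int maxIterations) {
        int dim = points[0].length;
        // Seed the centroids with the first k points (an assumption of this sketch).
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[c].clone();
        }
        int[] assignment = new int[points.length];
        Arrays.fill(assignment, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: attach every point to its nearest centroid.
            boolean changed = false;
            for (int p = 0; p < points.length; p++) {
                int best = nearest(points[p], centroids);
                if (best != assignment[p]) {
                    assignment[p] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged: no point switched cluster
            }
            // Update step: move each centroid to the mean of its points.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int p = 0; p < points.length; p++) {
                counts[assignment[p]]++;
                for (int d = 0; d < dim; d++) {
                    sums[assignment[p]][d] += points[p][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // leave an empty cluster's centroid where it is
                }
                for (int d = 0; d < dim; d++) {
                    centroids[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        double[][] points = { {1.0, 1.0}, {1.2, 0.8}, {9.0, 9.0}, {8.8, 9.2} };
        System.out.println(Arrays.toString(cluster(points, 2, 10)));
    }
}
```

With these four sample points, the two near (1, 1) end up in one cluster and the two near (9, 9) in the other. Everything here stays in memory on one machine, which is also why this style of run stops being practical for really large datasets.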


You can find the documentation for KMeans here:
https://cwiki.apache.org/confluence/display/MAHOUT/K-Means+Clustering

You can also start with Canopy Clustering and then move on to K-Means, since it is simpler and fast.

https://cwiki.apache.org/confluence/display/MAHOUT/Canopy+Clustering

I recommend Mahout 0.7-SNAPSHOT (i.e. the current code in trunk).

On 16-04-2012 10:57, OSCAR wrote:

Hello

My name is Oscar González. I'm studying Systems Engineering at the Universidad El 
Bosque in Colombia, and I have the following question:

I have a web application built with Hibernate, and I need to use clustering, 
specifically the k-means algorithm. I want to use Mahout, but I don't know how 
to apply it in my project. I'm using NetBeans. Please answer me.

Thanks

Oscar Miguel Gonzalez

Sent from my iPad
