Hi all, I need some guides that explain how to use Mahout with the k-means algorithm. First of all, what type of dataset does Mahout use? I'm doing my thesis and I must run a k-means clustering in Weka, but Weka must call Hadoop in the background to parallelize the job. I discovered that Mahout runs k-means on Hadoop, so I will call it from Weka, but I don't understand what type of files Mahout's k-means reads as input or how it works.
Can someone help me? Thanks all, Valerio Ceraudo
