Re: Outlier detection/Pruning

2013-12-05 Thread Ted Dunning
You should move to Mahout 0.8 and explore ball k-means. On Tue, Dec 3, 2013 at 8:44 PM, Prabhakar Srinivasan prabhakar.sriniva...@gmail.com wrote: Hello, I am currently using Mahout 0.7, and this question pertains to that version. I am using Canopy clustering (CanopyDriver class) first to

Outlier detection/Pruning

2013-12-03 Thread Prabhakar Srinivasan
Hello! Can someone point me to some explanatory documentation for outlier detection/removal in Mahout clustering? I am unable to understand the internal mechanism of outlier detection just by reading the Javadoc: clusterClassificationThreshold Is a clustering strictness / outlier removal

Re: Outlier detection/Pruning

2013-12-03 Thread Ted Dunning
Can you be more specific about which code you are asking about? The ball k-means implementation provides a capability somewhat like this, but perhaps in a more clearly defined way. On Tue, Dec 3, 2013 at 9:34 AM, Prabhakar Srinivasan prabhakar.sriniva...@gmail.com wrote: Hello! Can someone

Re: Outlier detection/Pruning

2013-12-03 Thread Dmitriy Lyubimov
On Tue, Dec 3, 2013 at 9:34 AM, Prabhakar Srinivasan prabhakar.sriniva...@gmail.com wrote: Hello! Can someone point me to some explanatory documentation for outlier detection/removal in Mahout clustering? I am unable to understand the internal mechanism of outlier detection just by

Re: Outlier detection/Pruning

2013-12-03 Thread Prabhakar Srinivasan
Hello, I am currently using Mahout 0.7, and this question pertains to that version. I first use Canopy clustering (the CanopyDriver class) to determine the number of clusters that best fits the dataset, and then pass that number as a parameter to k-means clustering (KMeansDriver
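
For readers following the thread, here is a minimal sketch of what a threshold such as clusterClassificationThreshold could mean in practice. It is not Mahout's code and does not use any Mahout classes. It assumes a hypothetical membership score (inverse distance, normalized across clusters so the scores sum to 1) and leaves a point unclustered when its best score falls below the threshold. The class name, the scoring function, and the sample data are all made up for illustration.

```java
import java.util.Arrays;

public class ThresholdPruningSketch {

    // Euclidean distance between two points of equal dimension.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Hypothetical membership scores: 1 / (1 + distance) per centroid,
    // normalized so that the scores for one point sum to 1.
    static double[] memberships(double[] point, double[][] centroids) {
        double[] scores = new double[centroids.length];
        double total = 0.0;
        for (int k = 0; k < centroids.length; k++) {
            scores[k] = 1.0 / (1.0 + distance(point, centroids[k]));
            total += scores[k];
        }
        for (int k = 0; k < scores.length; k++) {
            scores[k] /= total;
        }
        return scores;
    }

    // Returns the index of the best-scoring cluster, or -1 when the best
    // score is below the threshold (the point is treated as an outlier).
    static int classify(double[] point, double[][] centroids, double threshold) {
        double[] m = memberships(point, centroids);
        int best = 0;
        for (int k = 1; k < m.length; k++) {
            if (m[k] > m[best]) {
                best = k;
            }
        }
        return m[best] >= threshold ? best : -1;
    }

    public static void main(String[] args) {
        double[][] centroids = {{0.0, 0.0}, {10.0, 10.0}};
        double[][] points = {{0.5, 0.5}, {9.0, 10.0}, {5.0, 5.0}};
        double threshold = 0.6; // assumed value in (0, 1)

        for (double[] p : points) {
            int c = classify(p, centroids, threshold);
            System.out.println(Arrays.toString(p) + " -> "
                    + (c < 0 ? "pruned" : "cluster " + c));
        }
    }
}
```

With a threshold of 0 every point is assigned to its best cluster. Raising the threshold toward 1 prunes more points, starting with those that sit ambiguously between centroids, such as the midpoint (5, 5) in the sample data.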