Hello,

I'm running some clustering with Mean Shift, and in my final canopy I 
get the same vector five times.

In the original input list I only had it once, and I'm wondering why 
duplicates are allowed within the same canopy.

Attached is a file with the method I'm using to run mean shift, as well 
as the output (I'm iterating over the getBoundPoints() list of the 
canopy).

I'd be happy if someone could explain this.

Regards,
Christoph Hermann

-- 
Christoph Hermann
Institut für Informatik
Tel: +49 761-203-8171 Fax: +49 761-203-8162
e-mail: [email protected]
public static List<MeanShiftCanopy> runMeanShift(
		List<MeanShiftCanopy> canopies, Map<Long, Vector> vectors,
		DistanceMeasure aMeasure, double aT1, double aT2, double aDelta) {
	List<MeanShiftCanopy> canopiesResult = canopies;
	MeanShiftCanopy.config(aMeasure, aT1, aT2, aDelta);
	// add all points to the canopies
	for (Vector aRaw : vectors.values()) {
		MeanShiftCanopy.mergeCanopy(new MeanShiftCanopy(aRaw), canopiesResult);
	}
	boolean done = false;
	while (!done) { // shift canopies to their centroids
		done = true;
		List<MeanShiftCanopy> migratedCanopies = new ArrayList<MeanShiftCanopy>();
		for (MeanShiftCanopy canopy : canopiesResult) {
			done = canopy.shiftToMean() && done;
			MeanShiftCanopy.mergeCanopy(canopy, migratedCanopies);
		}
		canopiesResult = migratedCanopies;
	}
	return canopiesResult;
}


Vectors v: {"class":"org.apache.mahout.matrix.DenseVector","vector":"{\"values\":[5.0,10.0,2.0,4.0,2.0,5.0,7.0],\"lengthSquared\":-1.0,\"name\":\"6407\"}"}
List of other Vectors in same Canopy as Vector v: {"class":"org.apache.mahout.matrix.DenseVector","vector":"{\"values\":[5.0,10.0,2.0,4.0,2.0,5.0,7.0],\"lengthSquared\":-1.0,\"name\":\"6407\"}"}
Vector: {"class":"org.apache.mahout.matrix.DenseVector","vector":"{\"values\":[5.0,10.0,2.0,4.0,2.0,5.0,7.0],\"lengthSquared\":-1.0,\"name\":\"6407\"}"}
Vector: {"class":"org.apache.mahout.matrix.DenseVector","vector":"{\"values\":[5.0,10.0,2.0,4.0,2.0,5.0,7.0],\"lengthSquared\":-1.0,\"name\":\"6407\"}"}
Vector: {"class":"org.apache.mahout.matrix.DenseVector","vector":"{\"values\":[5.0,10.0,2.0,4.0,2.0,5.0,7.0],\"lengthSquared\":-1.0,\"name\":\"6407\"}"}
Vector: {"class":"org.apache.mahout.matrix.DenseVector","vector":"{\"values\":[5.0,10.0,2.0,4.0,2.0,5.0,7.0],\"lengthSquared\":-1.0,\"name\":\"6407\"}"}
Vector: {"class":"org.apache.mahout.matrix.DenseVector","vector":"{\"values\":[5.0,10.0,2.0,4.0,2.0,5.0,7.0],\"lengthSquared\":-1.0,\"name\":\"6407\"}"}
We have 5 no of points in the same canopy: 6407, 6407, 6407, 6407, 6407
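
A workaround sketch, not an explanation of the behaviour: assuming each bound point can be identified by its name (as the output above suggests with the name 6407), the duplicates could be filtered out while iterating. NamedPoint and distinctByName are hypothetical stand-ins used for illustration and are not part of Mahout; the real list returned by getBoundPoints() would need to be adapted accordingly.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BoundPointDedup {

    /** Hypothetical stand-in for a named vector; not a Mahout class. */
    static final class NamedPoint {
        final String name;
        final double[] values;

        NamedPoint(String name, double... values) {
            this.name = name;
            this.values = values;
        }
    }

    /** Keeps the first point seen for each name, preserving encounter order. */
    static List<NamedPoint> distinctByName(List<NamedPoint> points) {
        Map<String, NamedPoint> seen = new LinkedHashMap<String, NamedPoint>();
        for (NamedPoint p : points) {
            if (!seen.containsKey(p.name)) {
                seen.put(p.name, p);
            }
        }
        return new ArrayList<NamedPoint>(seen.values());
    }

    public static void main(String[] args) {
        // Mimic the situation from the output: the same named vector five times.
        double[] v = {5.0, 10.0, 2.0, 4.0, 2.0, 5.0, 7.0};
        List<NamedPoint> bound = new ArrayList<NamedPoint>();
        for (int i = 0; i < 5; i++) {
            bound.add(new NamedPoint("6407", v));
        }
        List<NamedPoint> unique = distinctByName(bound);
        System.out.println(bound.size() + " bound points, " + unique.size() + " distinct");
    }
}
```

This only hides the symptom on the reading side; it does not answer why the same vector ends up bound to the canopy more than once.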
