Hi all,
I am currently experimenting with silhouette() from the cluster package, but I am getting some errors. Since the silhouette width can be seen as a quality measure for a clustering, what I want to do is run a series of different clusterings and keep the one with the highest average silhouette width. That way I hope to get "the best" clustering possible for my dataset.
Here is the problem:
When I run the examples that come with silhouette(), everything works fine and the silhouette values are calculated without problems. When I run silhouette() on my own dataset, I get errors at unpredictable times: sometimes it runs successfully, and at other times it gives me the following error:
> test <- silhouette(cutree(agn, k=5), daisy(bestSom$codes))
Error in apply(dmatrix[!iC, iC], 2, function(r) tapply(r, x[!iC], mean)) :
dim(X) must have a positive length
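
For what it is worth, the message itself comes from apply(), which refuses any argument without a dim attribute. Judging only from the call shown in the error, one plausible trigger (an assumption, not something confirmed against the package source) is a cluster with a single member: selecting one column from a matrix makes R drop the result to a plain vector. A minimal base-R sketch with made-up data and hypothetical labels:

```r
# Toy 4x4 dissimilarity matrix; the values are made up for illustration.
dmatrix <- as.matrix(dist(c(1, 2, 3, 10)))
x  <- c(1L, 1L, 1L, 2L)     # hypothetical cluster labels; cluster 2 is a singleton
iC <- x == 2L               # membership flag for the singleton cluster

sub <- dmatrix[!iC, iC]     # a single column is selected, so the result is a vector
is.null(dim(sub))           # the dim attribute is gone

# apply() needs a dim attribute, so this call stops with the same message
res <- try(apply(sub, 2, mean), silent = TRUE)
inherits(res, "try-error")

# Keeping the matrix shape with drop = FALSE avoids the failure
dim(dmatrix[!iC, iC, drop = FALSE])
```

If this is indeed the cause, the error would appear only for those clusterings in which cutree() happens to produce a one-member cluster, which would match the "unpredictable" behaviour.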


I run my experiments in batch mode (a loop of experiments in a source file that I then load with source()), so whenever this error occurs the entire run is aborted. A full run takes a long time (approx. 12 hours), and I would not want to start it at night only to find in the morning that it stopped early. Is there a way to
a) prevent the error from happening, or
b) detect beforehand that the error will happen and thus not do the silhouette calculation for that particular clustering
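
A rough sketch of how both options might look inside the batch loop. The helper name safe_sil_width and the toy data are hypothetical, and the singleton pre-check rests on the assumption that one-member clusters are what triggers the error; tryCatch() then catches anything the pre-check misses, so the loop carries on with NA instead of stopping.

```r
library(cluster)

# Hypothetical helper: returns the average silhouette width, or NA if the
# clustering looks degenerate or silhouette() fails for any reason.
safe_sil_width <- function(cl, d) {
  # (b) detect beforehand: fewer than two clusters, or a one-member cluster
  if (length(unique(cl)) < 2 || any(table(cl) < 2)) return(NA_real_)
  # (a) keep an error from aborting the whole run
  tryCatch({
    sil <- silhouette(cl, d)
    mean(sil[, "sil_width"])
  }, error = function(e) {
    message("silhouette() failed: ", conditionMessage(e))
    NA_real_
  })
}

# Usage on made-up data standing in for bestSom$codes
set.seed(1)
dat <- matrix(rnorm(60), ncol = 2)
d   <- daisy(dat)
agn <- agnes(d)

ks     <- 2:6
widths <- sapply(ks, function(k) safe_sil_width(cutree(as.hclust(agn), k = k), d))
best_k <- ks[which.max(widths)]   # which.max() skips NA entries
```

The pre-check can be dropped if the installed version of cluster turns out to cope with one-member clusters; the tryCatch() wrapper alone is enough to keep a long run alive.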


Any help with this is much appreciated.
Thanks, Jonck

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help