On 03/26/2013 01:38 PM, Gilles wrote: >>> [...] >>>> >>>>> There is at least one contribution requiring some feedback (MATH-917). >>>> >>>> I have added a first patch and would love to get some feedback. >>> >>> It seems we are really close to releasing now. I guess MATH-817 will be >>> resolved soon. What seems to be achievable in a short time frame >>> would be: >>> >>> - wrapping up MATH-917 as Thomas proposed, >>> - finishing MATH-437 (Kolmogorov-Smirnov), and perhaps also >>> MATH-228 at the same time >>> - removing cobertura from parent >>> >>> I would propose to postpone MATH-873, MATH-923 and MATH-874 to 4.0. >>> >>> What do you think? >> >> Looks good. I will finish my work on MATH-917 till Wednesday the latest. > > I wonder whether "Clusterable" should be (the generic type seems > superfluous):
you are right, I did remove the generic type already in my local environment. > public interface Clusterable { > double[] getClusteringData(); > } > > And the "distance" interface simplified to: > > public interface Distance { > double compute(double[] a, double[] b); > } > > The concept being more general than just for clustering, the interface and > its implementations probably belong elsewhere, e.g. in "o.a.c.m.util" (or > perhaps even better as static inner classes of "MathArrays"). I thought about this too, but the downside would be that any Clusterable must have a double[] internally. Having dimension() and component(int) methods is more flexible imho, but both approaches have their benefits. > "AbstractClusterer" should be immutable (i.e. no distance setter). > > If using interfaces, the "getDistanceMeasure" method should probably appear > in one of them. Maybe: > > public interface Clusterer<T extends Clusterable> { > List<? extends Cluster<T>> cluster(Collection<T> points); > Distance getDistance(); > } > > However, in line with what we've been doing the last 3 or 4 years, > "Clusterer" could just be an abstract class. I was musing about making the Clusterer implementations immutable, or to create setters for their necessary parameters, e.g. the k parameter for k-means. Otherwise you have to create a new object each time you want to change something, would this be acceptable? Another thing that I changed locally: the KMeans clusterer currently supports multi-evaluation with a separate method. This could be moved to a separate class MultiKMeansPlusPlusClusterer, which allows to use it with the same Clusterer interface. Thomas --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org