For me there is no problem in including kmeans, but I should warn you that this implementation is far from optimal, for various reasons.
First, I have reused the class Point, while it would make more sense to do something that works in any dimension. Second, these examples are (intentionally) written in a particularly naive way. The main reason why is that I wanted to benchmark the speed of Pharo for interactive work. I assumed that during an interactive session, one would not bother to write an optimized implementation of an algorithm, but just use the collection's `groupBy:` method; also I did not want to make anything specific to kmeans, because I wanted the comparison to be meaningful for other algorithms where kmeans-specific optmizations would not apply. In the specific case of kmeans, there are a few easy optimizations to do: - first, since sqrt is increasing, all comparisons can be made with the squared distance, saving the sqrt operation - moreover, the main step of the algorithm requires to group points by their nearest centroid. In general a group-by operation requires a hashmap, but in this case there are a finite, known number of choices for the nearest centroid. It is more performant to just keep the index of the nearest centroid for each point, thereby avoiding to build a hashmap. In fact, it is more performant to accumulate the sum and the count of the points near the centroid while traversing, and then just perform a ratio at the end of the step. 2015-02-18 20:09 GMT+01:00 stepharo <steph...@free.fr>: > I thought that didier provided k-means clustering already. > > Le 18/2/15 11:24, SergeStinckwich a écrit : > >> Dear Andrea, >> thank you for your message. This is something really interesting for the >> sci-smalltalk project: >> https://github.com/SergeStinckwich/SciSmalltalk >> >> Do you mind to add your k-means code in Sci-Smalltalk distribution? >> >> Please feel free to join the sci-smalltalk for future discussions : >> https://groups.google.com/forum/#!forum/scismalltalk >> >> Regards, >> >> >> >> >> -- >> View this message in context: >> http://forum.world.st/Small-benchmark-tp4806092p4806309.html >> Sent from the Pharo Smalltalk Users mailing list archive at Nabble.com. >> >> >> > >