Did you check the website at
https://mahout.apache.org/users/clustering/k-means-clustering.html ?
On 04/13/2014 02:53 PM, Maciej Mazur wrote:
Recently I've been looking into the K-means implementation.
I want to understand how it works, and why it was designed this way.
Could you give me some overview?
I see that during setup the clusters are read from a file. Is it a
distributed cache? What's the maximal size of this file, and what's the
maximum value of k?
Nothing is output during the call to the map function; everything is
saved at cleanup. Why?
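For readers following the thread, here is a minimal sketch of the pattern described above: the clusters are loaded once, map() only accumulates, and the output is written in cleanup(). It is plain Python rather than Hadoop code. The class name, method names, and record layout are hypothetical and are not taken from Mahout. In MapReduce generally, this style (often called in-mapper combining) is used to emit one partial sum per cluster instead of one record per input point. Whether that is the reason for Mahout's design is an assumption here.

```python
from collections import defaultdict


class KMeansMapperSketch:
    """Hypothetical stand-in for a k-means mapper; not Mahout's actual class."""

    def __init__(self, centroids):
        # In a real job the centroids would be loaded from a file during setup.
        self.centroids = centroids
        self.sums = {}                  # cluster id -> per-dimension running sum
        self.counts = defaultdict(int)  # cluster id -> number of points assigned

    def map(self, point):
        # Pick the nearest centroid by squared Euclidean distance.
        nearest = min(
            range(len(self.centroids)),
            key=lambda i: sum(
                (p - c) ** 2 for p, c in zip(point, self.centroids[i])
            ),
        )
        # Accumulate locally; nothing is emitted from map() itself.
        if nearest not in self.sums:
            self.sums[nearest] = [0.0] * len(point)
        for d, value in enumerate(point):
            self.sums[nearest][d] += value
        self.counts[nearest] += 1

    def cleanup(self):
        # Emit one (cluster id, partial sum, count) record per non-empty cluster.
        return [
            (cid, self.sums[cid], self.counts[cid]) for cid in sorted(self.sums)
        ]


mapper = KMeansMapperSketch([[0.0, 0.0], [10.0, 10.0]])
for p in [[1.0, 1.0], [2.0, 0.0], [9.0, 11.0]]:
    mapper.map(p)
print(mapper.cleanup())
```

With this layout the number of records leaving a mapper is bounded by k rather than by the number of input points, and a reducer can divide each summed vector by its count to get the new centroid.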
Are there any docs concerning the implementation?
Thanks,
Maciej
On Wed, Apr 9, 2014 at 7:23 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:
Well, you could view this as a performance bug in the implementation of
the linear algebra.
It certainly is, however, an odd interpretation of transpose. I have used
a similar trick in R to use sparse matrices as a counter but it always
worried me a bit.
On Apr 8, 2014, at 17:49, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
Problem is, I want to use linear algebra to handle that, not
combine().