Use of streaming K Means

Marko Dinić Thu, 02 Oct 2014 02:56:33 -0700

Hello again,

I have another question about Streaming K Means, specifically about its use. From the papers that were referenced, I understood that one of its advantages is that it clusters in a single pass, which reduces the number of iterations and lets it work with large datasets.

What I'm interested in is: can I use it in an online fashion? If I have data streaming in from some source, can I use it to cluster the incoming data in some way?

I understand that there is a streaming step, which looks more or less appropriate for my incoming data, but then there is a Ball K Means step that is performed after the streaming step. The question that arises is: when should the Ball K Means step be run, given that the data arrives all the time?
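
To make the question concrete, here is a rough sketch in plain Python of how I picture the two phases. This is not Mahout's API: StreamingSketch, cutoff and final_clusters are made-up names, the merge rule is a simplification, and the second phase is an ordinary weighted Lloyd iteration standing in for Ball K Means.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two points given as sequences of floats."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class StreamingSketch:
    """Toy stand-in for the streaming step: a small set of weighted centroids."""

    def __init__(self, cutoff, seed=0):
        self.cutoff = cutoff        # hypothetical distance cutoff
        self.centroids = []         # each entry is [point, weight]
        self.rng = random.Random(seed)

    def add(self, point):
        """Consume one incoming point; can be called forever as data arrives."""
        if not self.centroids:
            self.centroids.append([list(point), 1.0])
            return
        nearest = min(self.centroids, key=lambda c: dist(c[0], point))
        d = dist(nearest[0], point)
        # The farther the point is from its nearest centroid, the more
        # likely it starts a new centroid instead of being merged.
        if self.rng.random() < d / self.cutoff:
            self.centroids.append([list(point), 1.0])
        else:
            c, w = nearest
            nearest[0] = [(ci * w + pi) / (w + 1) for ci, pi in zip(c, point)]
            nearest[1] = w + 1


def final_clusters(sketch, k, iterations=10):
    """Weighted Lloyd iterations over the sketch (stand-in for Ball K Means)."""
    pts = sketch.centroids
    centers = [list(p) for p, _ in pts[:k]]
    for _ in range(iterations):
        sums = [[0.0] * len(c) for c in centers]
        weights = [0.0] * len(centers)
        for p, w in pts:
            j = min(range(len(centers)), key=lambda i: dist(centers[i], p))
            weights[j] += w
            for axis, x in enumerate(p):
                sums[j][axis] += w * x
        # Keep the old center if nothing was assigned to it.
        centers = [
            [s / weights[j] for s in sums[j]] if weights[j] > 0 else centers[j]
            for j in range(len(centers))
        ]
    return centers
```

In this picture, add() runs on every incoming point, while final_clusters() could be called at any moment on the current sketch (every N points, on a timer, or on request). Whether that is a sensible way to schedule the Ball K Means step is exactly what I am asking.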


Should I even consider this, or should I go for a lambda architecture instead?

Any help would be great.

Thanks,
Marko
