Re: Why training data in Kmeans Spark streaming clustering

2016-08-11 Thread Bryan Cutler
The algorithm update is just broken into 2 steps: trainOn, which learns/updates the cluster centers, and predictOn, which predicts cluster assignments on data. The StreamingKMeansExample you reference breaks up data into training and test because you might want to score the predictions. If you don't care
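
To make the two-step split concrete, here is a minimal pure-Python sketch of the pattern. It is not Spark's implementation: the class StreamingKMeansSketch and its methods train_on and predict_on are hypothetical names chosen to mirror trainOn and predictOn, and the decay-weighted center update is an assumption based on the rule described in the Spark MLlib guide.

```python
# Sketch only: hypothetical class, not Spark's StreamingKMeans API.
# Assumes the decay-weighted update from the MLlib guide:
#   c_new = (c * n * a + x * m) / (n * a + m),  n_new = n * a + m
# where c is the old center, n its weight, a the decay factor,
# x the mean of the batch points assigned to it, and m their count.

class StreamingKMeansSketch:
    def __init__(self, centers, decay=1.0):
        self.centers = [list(c) for c in centers]
        self.weights = [0.0] * len(centers)
        self.decay = decay

    def _nearest(self, point):
        # Index of the closest center by squared Euclidean distance.
        def sq_dist(i):
            return sum((a - b) ** 2 for a, b in zip(self.centers[i], point))
        return min(range(len(self.centers)), key=sq_dist)

    def train_on(self, batch):
        # Step 1: learn/update the cluster centers from one batch.
        sums, counts = {}, {}
        for p in batch:
            i = self._nearest(p)
            counts[i] = counts.get(i, 0) + 1
            prev = sums.get(i, [0.0] * len(p))
            sums[i] = [s + v for s, v in zip(prev, p)]
        for i, m in counts.items():
            n = self.weights[i] * self.decay
            mean = [s / m for s in sums[i]]
            self.centers[i] = [
                (c * n + x * m) / (n + m)
                for c, x in zip(self.centers[i], mean)
            ]
            self.weights[i] = n + m

    def predict_on(self, batch):
        # Step 2: assign each point to its nearest current center.
        # This never changes the model.
        return [self._nearest(p) for p in batch]


model = StreamingKMeansSketch([[0.0, 0.0], [10.0, 10.0]])
model.train_on([[1.0, 1.0], [9.0, 9.0], [1.0, 3.0]])
labels = model.predict_on([[0.0, 1.0], [8.0, 8.0]])
```

Both steps are unsupervised: train_on never sees labels, and the same batch could be passed to both methods. A separate test set only matters when the predictions are going to be scored.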

Why training data in Kmeans Spark streaming clustering

2016-08-11 Thread Ahmed Sadek
Dear All, I was wondering why there is training data and testing data in k-means? Shouldn't it be unsupervised learning with just access to stream data? I found a similar question but couldn't understand the answer.