Hi Josh,

I too am working on clustering time-series data, and am basically trying to come up with a sequence clustering model. I would like to know how you intend to use K-means to achieve that. Are you treating each sequence as a point? If so, what would your vector representation of a sequence be and, more importantly, which metric (distance computation logic) will you be using?
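To make the question concrete, here is a minimal sketch of one possible reading of "each sequence as a point". It rests on assumptions that are not from Josh's mail: every series has the same length, the raw samples themselves are the vector components, the metric is Euclidean distance, and the first k series seed the centroids. Plain double[] arrays stand in for Mahout's DenseVector, and the loop is a bare Lloyd-style k-means, not Mahout's KMeansDriver; the class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical illustration only: plain arrays stand in for Mahout's DenseVector,
// and this loop is NOT Mahout's KMeansDriver.
class SeriesKMeansSketch {

    // Euclidean distance between two equal-length series (an assumed metric).
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Index of the centroid closest to the given series (ties go to the lower index).
    static int nearest(double[] series, double[][] centroids) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (distance(series, centroids[c]) < distance(series, centroids[best])) {
                best = c;
            }
        }
        return best;
    }

    // Lloyd-style k-means. Assumed for determinism: the first k series seed the centroids.
    static int[] cluster(double[][] data, int k, int maxIterations) {
        int dim = data[0].length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = data[c].clone();
        }
        int[] labels = new int[data.length];
        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: attach each series to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < data.length; i++) {
                int label = nearest(data[i], centroids);
                if (label != labels[i]) {
                    labels[i] = label;
                    changed = true;
                }
            }
            if (!changed && iter > 0) {
                break; // assignments are stable, so the centroids are too
            }
            // Update step: move each centroid to the mean of its members.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int i = 0; i < data.length; i++) {
                counts[labels[i]]++;
                for (int j = 0; j < dim; j++) {
                    sums[labels[i]][j] += data[i][j];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // leave the centroid of an empty cluster where it is
                }
                for (int j = 0; j < dim; j++) {
                    centroids[c][j] = sums[c][j] / counts[c];
                }
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        // Made-up toy data: two rising series and two falling series.
        double[][] series = {
            {1.0, 2.0, 3.0, 4.0},
            {10.0, 9.0, 8.0, 7.0},
            {1.1, 2.1, 2.9, 4.2},
            {9.5, 9.0, 8.2, 7.1}
        };
        System.out.println(Arrays.toString(cluster(series, 2, 10))); // prints [0, 1, 0, 1]
    }
}
```

With this representation a series and a time-shifted copy of it can end up far apart, which is why the choice of representation and metric matters more than the clustering loop itself.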
BTW, I am thinking along the lines of STC (suffix-tree based clustering).

-Prasen

On Sat, Nov 21, 2009 at 1:26 AM, Patterson, Josh <[email protected]> wrote:
> I think in terms of clustering time series data, the first step looks to
> be vectorizing the input cases with possibly the DenseVector class and
> feeding that to a basic KMeans implementation like KMeansDriver.java.
> Once we can get the basic kmeans rolling with some known dataset we'll
> be able to iterate on that and move towards using more complex
> techniques and other grid timeseries data. Any suggestions or discussion
> is greatly appreciated,
>
> Josh Patterson
> TVA
