Shannon, also, the existing Spectral KMeans still refers to the deprecated DistributedLanczosSolver and EigenVerificationJob. It would be nice to fix that for the 0.9 release.
On Thursday, November 21, 2013 4:03 PM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:

On #2, it would be good if you could add Spectral KMeans to examples/bin/cluster-reuters.sh to process the Reuters dataset.

On Thursday, November 21, 2013 3:50 PM, Shannon Quinn <squ...@gatech.edu> wrote:

Excellent. My todo list, then:

1: post docs for the algorithm on the Apache CMS
2: create an example to demonstrate how to use it
3: code a job to process raw input into a similarity matrix (will create a JIRA for it)

I have a question for #3 that can be a separate thread; mainly, what are the primary input formats I should be concerned with processing?

On 11/21/13, 1:09 PM, Isabel Drost-Fromm wrote:
> On Thu, 21 Nov 2013 09:42:28 -0800 (PST)
> Suneel Marthi <suneel_mar...@yahoo.com> wrote:
>
>> We are missing wiki docs for both Streaming kmeans and Spectral clustering.
>>
>> I can pull something together for streaming kmeans.
>>
>> Speaking of which we need to add a wiki page for Ted's t-digest once we figure out how it plays into Mahout (maybe as a measure of Streaming kmeans clustering, Ted??).
> Given that we are in the process of migrating substantial parts of our wiki
> to the main website soon to be hosted in Apache CMS it would be great if you
> could add your content there. See also MAHOUT-1245 and
> http://markmail.org/thread/5ixlclhlh3acgcoq for some details.
>
> Isabel