Excellent. My todo list, then:
1: post docs for the algorithm on the Apache CMS
2: create an example to demonstrate how to use it
3: code a job to process raw input into a similarity matrix (will create
a JIRA for it)
I have a question for #3 that can be a separate thread; mainly, what are
the primary input formats I should be concerned with processing?
On 11/21/13, 1:09 PM, Isabel Drost-Fromm wrote:
On Thu, 21 Nov 2013 09:42:28 -0800 (PST)
Suneel Marthi <suneel_mar...@yahoo.com> wrote:
We are missing wiki docs for both Streaming kmeans and Spectral clustering.
I can pull something together for Streaming kmeans.
Speaking of which, we need to add a wiki page for Ted's t-digest once we figure
out how it plays into Mahout (maybe as a measure of Streaming kmeans
clustering, Ted??).
Given that we are in the process of migrating substantial parts of our wiki to
the main website, soon to be hosted in Apache CMS, it would be great if you could
add your content there. See also MAHOUT-1245 and
http://markmail.org/thread/5ixlclhlh3acgcoq for some details.
Isabel