RE: Demos/Tutorials

Jeff Eastman Tue, 18 Mar 2008 11:55:55 -0700

I've been using canopy clustering to cluster Apache log time slices by
URL frequency. Typical results show several big clusters containing the
"business as usual" access patterns, plus several small clusters with the
unusual patterns. The results are a little difficult to interpret beyond
that, but still intriguing. Since everybody has such logs, it might make a
useful demo application that people could run over their own data.
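
To make the idea concrete, here is a rough sketch of the approach in Python.
It is not the project's clustering code: the function names, the use of
Euclidean distance, the relative-frequency weighting, and the thresholds t1
and t2 are all assumptions chosen for illustration. Each time slice is
assumed to be a list of requested URLs that has already been parsed out of
the log.

```python
from collections import Counter
from math import sqrt


def url_frequency_vector(urls):
    """Map the URLs requested in one time slice to relative frequencies."""
    counts = Counter(urls)
    total = sum(counts.values())
    return {url: n / total for url, n in counts.items()}


def distance(a, b):
    """Euclidean distance between two sparse vectors stored as dicts."""
    keys = set(a) | set(b)
    return sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))


def canopy_cluster(points, t1, t2):
    """Group points into canopies using a loose (t1) and tight (t2) threshold.

    Returns a list of (center_index, member_indices) pairs.
    Assumes t1 > t2; both values are hypothetical tuning parameters.
    """
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")
    remaining = list(range(len(points)))
    canopies = []
    while remaining:
        # Take the first remaining point as the next canopy center.
        center = remaining[0]
        dists = {i: distance(points[center], points[i]) for i in remaining}
        # Every remaining point within t1 of the center joins this canopy.
        members = [i for i in remaining if dists[i] < t1]
        canopies.append((center, members))
        # Points within t2 (including the center itself) are dropped from
        # further consideration; the rest may join later canopies as well.
        remaining = [i for i in remaining if dists[i] >= t2]
    return canopies
```

Calling canopy_cluster on the vectors built from each time slice gives the
grouping described above: slices with similar URL mixes share a canopy, and
an odd slice ends up in a small canopy of its own. How well this works on
real logs depends heavily on the thresholds and on the slice length.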


Jeff

> -----Original Message-----
> From: Grant Ingersoll [mailto:[EMAIL PROTECTED]
> Sent: Monday, March 17, 2008 8:41 AM
> To: [email protected]
> Subject: Demos/Tutorials
> 
> Now that we have some code in place for clustering, I think it would
> be cool to put together some examples/demos of real world problems.
> Things like clustering text (perhaps we can use the wikipedia download
> or the reuters download that Lucene contrib/benchmark uses) or
> clustering other pieces of data.
> 
> We could set up a demo area of code and use Lucene's analysis code to
> create document vectors.
> 
> Ideas and/or thoughts or volunteers?
> 
> Cheers,
> Grant
