I have verified the results only with the "laugh test". Many of the clusters were excellent. There were some false positives, though, which were farther from the centroid. That might be because I used only 4 iterations; a higher number of iterations will probably give better results.
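A rough way to make the "farther from the centroid" observation less subjective could look like the sketch below. It is only an illustration under stated assumptions: it uses Euclidean distance (the distance measure used in the actual run is not stated here), the helper name `flag_distant_members` and the `factor` threshold are hypothetical, and the sample points are made up, not taken from the 50k document vectors.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    dim = len(points[0])
    return [sum(p[i] for p in points) / n for i in range(dim)]


def flag_distant_members(points, factor=2.0):
    """Return indices of points whose distance to the cluster centroid
    exceeds factor * (mean distance to the centroid).

    Hypothetical helper: the threshold rule is an assumption, not
    something taken from the clustering library.
    """
    c = centroid(points)
    dists = [euclidean(p, c) for p in points]
    mean_dist = sum(dists) / len(dists)
    return [i for i, d in enumerate(dists) if d > factor * mean_dist]


# Made-up toy cluster: four tight points and one far-away member.
cluster = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0]]
print(flag_distant_members(cluster))
```

Members flagged this way are candidates for the false positives mentioned above; they would still need a manual look before drawing any conclusion about cluster quality.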
Right now, I don't have any visualization tools to make a confident statement about the quality of the clusters. I will report back when I have something concrete.

--shashi

On Thu, Jun 18, 2009 at 12:16 AM, Ted Dunning <[email protected]> wrote:
> Shashi,
>
> What were the results for k-means?
>
> (I have zero experience with canopy, but have generally had mildly useful
> results using k-means clustering.)
>
> On Wed, Jun 17, 2009 at 7:34 AM, Shashikant Kore <[email protected]> wrote:
>
>> I ran Canopy and then K-Means on 50k doc vectors
>
> --
> Ted Dunning, CTO
> DeepDyve
