Yes. It may help with variable scale. The classic technique for dealing with that is to cluster with a small number of clusters at a gross level and then separately cluster the documents that belong to each of those large clusters. Because each second pass only sees the documents of one large cluster, this automatically adapts to different scales.
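To make the two-level idea concrete, here is a rough sketch in plain Python. It is not Mahout code and does not use the new clustering classes. The plain Lloyd's k-means, the function names (kmeans, two_level) and the parameters (coarse_k, fine_k, min_size) are all my own assumptions for illustration, and documents are assumed to be already turned into dense numeric tuples.

```python
import random

def sq_dist(a, b):
    # Squared Euclidean distance between two equal-length tuples.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    # Component-wise mean of a non-empty list of tuples.
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (illustrative only).

    Returns the non-empty clusters as lists of points.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # requires k <= len(points)
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        centers = [mean(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return [c for c in clusters if c]

def two_level(points, coarse_k, fine_k, min_size=4):
    """Cluster coarsely, then re-cluster each coarse cluster on its own.

    min_size is a hypothetical cutoff: coarse clusters smaller than
    max(min_size, fine_k) are kept as they are instead of being split.
    """
    result = []
    for group in kmeans(points, coarse_k):
        if len(group) >= max(min_size, fine_k):
            # The second pass only sees this group's points, so the
            # distances it works with reflect the local density.
            result.extend(kmeans(group, fine_k))
        else:
            result.append(group)
    return result
```

Calling two_level(vectors, 2, 3) on a list of tuples returns at most six clusters. The point of the design is that a tight region and a spread-out region each get split relative to their own spread, rather than by one global distance scale.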
The new stuff would greatly facilitate your experimentation.

On Sat, May 12, 2012 at 11:19 AM, Pat Ferrel <p...@occamsmachete.com> wrote:

> If you are asking about using your post 0.7 clustering, no I haven't yet.
> Will it help with varying scale? I assume by scale you mean the density of
> docs in certain areas of the vector space? One thing I am trying now is
> limiting the subject matter crawled and getting a much larger sample, which
> should get me a denser distribution.