On Aug 12, 2013, at 5:12 PM, William Moran <echofo...@gmail.com> wrote:
> Hi,
>
> What exactly are the numbers next to these terms? (this is an example
> clusterdump from the Mahout in Action book, but my clusterdumps look
> similar).

They are the weights assigned to each of the terms. They are likely the
TF/IDF values, but I believe they may be other things depending on how
your dictionary/vectors were created.

>
> Top Terms:
>
> Shania Twain => 1.126984126984127
> Garth Brooks => 0.746031746031746
> Sara Evans => 0.6031746031746031
> Lonestar => 0.5238095238095238
>
> Sorry if this is an obvious question but I find it hard to find details on
> these specifics.
>
> Many thanks,
>
> Will

--------------------------------------------
Grant Ingersoll | @gsingers
http://www.lucidworks.com
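
To illustrate how such per-term weights can arise, here is a minimal Python sketch. It assumes the "Top Terms" are the largest entries of a cluster's center vector, and that the center is the plain average of the member vectors. The member vectors, their raw-count weights, and the helper names `centroid` and `top_terms` are hypothetical and are not taken from Mahout or the book; with TF/IDF-weighted input vectors the same averaging would yield averaged TF/IDF values instead.

```python
from collections import defaultdict

def centroid(vectors):
    """Average a list of sparse term->weight dicts into one center vector."""
    totals = defaultdict(float)
    for vec in vectors:
        for term, weight in vec.items():
            totals[term] += weight
    n = len(vectors)
    return {term: total / n for term, total in totals.items()}

def top_terms(center, k=4):
    """Return the k (term, weight) pairs with the largest weights."""
    return sorted(center.items(), key=lambda kv: kv[1], reverse=True)[:k]

# Hypothetical member vectors (raw counts here; could equally be TF/IDF).
members = [
    {"Shania Twain": 2.0, "Garth Brooks": 1.0},
    {"Shania Twain": 1.0, "Lonestar": 1.0},
    {"Garth Brooks": 1.0, "Sara Evans": 1.0},
    {"Shania Twain": 1.0},
]

# Print in the same "term => weight" shape as the clusterdump excerpt.
for term, weight in top_terms(centroid(members)):
    print(f"{term} => {weight}")
```

Under these assumptions, a weight above 1 simply means the term's average weight across the cluster's members exceeds 1, which depends entirely on how the input vectors were weighted.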