On Jan 10, 2010, at 11:18 AM, Grant Ingersoll wrote:

> Continuing my sweep through Mahout's clustering capabilities...
> 
> In LDA, one of the input parameters is --numWords.  I think this is supposed 
> to be the total number of words seen in the collection, right?  Thus, if I 
> dumped Vectors from Lucene, for instance, the --numWords value should be the 
> count of the number of values in the dictionary, right?  

Answering my own question: Yes, --numWords should be at least the number of 
words in the dictionary.
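
For anyone wanting to derive that lower bound from a dictionary file, here is a minimal sketch. It assumes one term per non-empty line, which may not match the layout of the dictionary your Lucene vector dump actually writes (header lines or extra columns would need handling). count_dictionary_terms is a hypothetical helper, not part of Mahout.

```python
import io


def count_dictionary_terms(lines):
    """Count non-empty lines, assuming one term per line (an assumed format)."""
    return sum(1 for line in lines if line.strip())


# Hypothetical dictionary contents standing in for a real file;
# with a real file, pass an open file handle instead.
sample_dictionary = io.StringIO("apache\nlucene\nmahout\n\n")

num_words = count_dictionary_terms(sample_dictionary)

# Any value greater than or equal to this count should satisfy the requirement.
print(f"--numWords {num_words}")
```

The count is only a lower bound, so rounding it up is safe under the same assumption.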