I have been testing on the 20 Newsgroups dataset - which the Spark docs
themselves reference. I can confirm that perplexity increases and
likelihood decreases as the number of topics increases - and am similarly
confused by these results.
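
One thing that may explain why the two numbers always move together: as far as I can tell from the MLlib source, logPerplexity() is just the negated logLikelihood() bound divided by the number of tokens in the corpus, so a falling likelihood and a rising perplexity are the same signal rather than two independent ones. A minimal sketch of that relationship, assuming this reading of the source is right (the helper name `log_perplexity` and the numbers below are hypothetical, for illustration only, not measured results):

```python
def log_perplexity(log_likelihood: float, token_count: float) -> float:
    """Per-token perplexity bound: the negated log-likelihood bound
    divided by the total number of tokens in the corpus.

    Assumption: this mirrors how Spark derives logPerplexity() from
    logLikelihood(); it is not a call into Spark itself.
    """
    if token_count <= 0:
        raise ValueError("token_count must be positive")
    return -log_likelihood / token_count


# Hypothetical values for illustration only (not real results).
token_count = 1000.0
for k, ll in [(10, -7000.0), (20, -7500.0)]:
    # A lower (more negative) likelihood bound gives a higher perplexity bound.
    print(k, ll, log_perplexity(ll, token_count))
```

With a fitted pyspark.ml LDAModel, the inputs would come from model.logLikelihood(dataset), and model.logPerplexity(dataset) should then agree with the helper up to floating-point error. This only shows that the two metrics are redundant; it does not explain why the bound gets worse as the topic count grows.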
2017-09-28 10:50 GMT-07:00 Cody Buntain:
Hi, all!
Is there an example somewhere on using LDA’s
logPerplexity()/logLikelihood() functions to evaluate topic counts? The
existing MLlib LDA examples show calling them, but I can’t find any
documentation about how to interpret the outputs. Graphing the outputs for logs
of