Hi,

I joined this discussion a bit late, but what about the perplexity measure reported in Section 7.1 of Blei's LDA paper? It seems to be the metric commonly used to choose the best value of "k" (the number of topics) when training an LDA model.
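For concreteness, here is a rough sketch of how that measure could be computed, assuming the usual definition: the perplexity of a held-out set is exp(-sum_d log p(w_d) / sum_d N_d), where N_d is the number of words in document d, and lower is better. The helper names `perplexity` and `pick_k` are hypothetical (they are not Mahout APIs), and how log p(w_d) is estimated from a trained model is left out entirely.

```python
import math


def perplexity(doc_log_likelihoods, doc_lengths):
    """Held-out perplexity: exp(-sum_d log p(w_d) / sum_d N_d).

    doc_log_likelihoods: natural-log likelihood of each held-out document
                         under the trained model (assumed to be given).
    doc_lengths:         number of word tokens in each of those documents.
    """
    if len(doc_log_likelihoods) != len(doc_lengths):
        raise ValueError("need one log-likelihood per document length")
    total_words = sum(doc_lengths)
    if total_words <= 0:
        raise ValueError("held-out set must contain at least one word")
    return math.exp(-sum(doc_log_likelihoods) / total_words)


def pick_k(heldout_by_k):
    """Return the k whose model has the lowest held-out perplexity.

    heldout_by_k: dict mapping k -> (doc_log_likelihoods, doc_lengths),
                  one entry per candidate number of topics (hypothetical
                  input format chosen for this sketch).
    """
    return min(heldout_by_k, key=lambda k: perplexity(*heldout_by_k[k]))
```

The same number could in principle be computed for a model trained on hashed vectors, as long as the held-out documents are hashed the same way, so it might serve as the quality criterion Jake is asking about.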
bests,
Federico

2011/1/4 Jake Mannix <[email protected]>

> Saying we have hashing is different than saying we know what will happen to
> an algorithm once its running over hashed features (as the continuing work
> on our Stochastic SVD demonstrates).
>
> I can certainly try to run LDA over a hashed vector set, but I'm not sure
> what criteria for correctness / quality of the topic model I should use if
> I do.
>
> -jake
>
> On Jan 4, 2011 7:21 AM, "Robin Anil" <[email protected]> wrote:
>
> We already have the second part - the hashing trick. Thanks to Ted, and he
> has a mechanism to partially reverse engineer the feature as well. You
> might be able to drop it directly in the job itself or even vectorize and
> then run LDA.
>
> Robin
>
> On Tue, Jan 4, 2011 at 8:44 PM, Jake Mannix <[email protected]> wrote:
>
> > Hey Robin,
> > Vowp...
