I have seen an implementation of L-LDA in Java, the Stanford Topic Modeling Toolbox <http://nlp.stanford.edu/software/tmt/>. Does anyone know whether they provide the source code?
Thanks,
Maxim

On Fri, Oct 16, 2009 at 12:39 PM, David Hall <d...@cs.berkeley.edu> wrote:
> Sorry, this slipped out of my inbox and I just found it!
>
> On Thu, Oct 8, 2009 at 12:05 PM, Robin Anil <robin.a...@gmail.com> wrote:
> > Posting to the dev list.
> > Great paper, thanks! Looks like L-LDA could be used to create some
> > interesting examples.
>
> Thanks!
>
> > The paper shows L-LDA could be used to create a word-tag model for
> > accurate tag prediction given a document of words. I will finish reading
> > it and report how much work is needed to build L-LDA on top of the
> > current LDA implementation. Any thoughts?
>
> Umm, cool! In the paper we used Gibbs sampling to do the inference,
> and the implementation in Mahout uses variational inference (because
> it distributes better). I don't see any obvious problems in terms of
> the math, so the rest is just fitting it into the system.
>
> I think a small amount of refactoring would be in order to make things
> more generic, and then it shouldn't be too hard to plug in. I'll add
> it to my list, but I'm swamped for quite some time.
>
> -- David
>
> > Robin
> >
> > On Thu, Oct 8, 2009 at 11:50 PM, David Hall <d...@cs.berkeley.edu> wrote:
> >>
> >> The short answer is that it probably won't help all that much. Naive
> >> Bayes is unreasonably good when you have enough data.
> >>
> >> The long answer is, I have a paper with Dan Ramage and Ramesh
> >> Nallapati that talks about how to do it:
> >>
> >> www.aclweb.org/anthology-new/D/D09/D09-1026.pdf
> >>
> >> In some sense, "Labeled-LDA" is a kind of Naive Bayes where you can
> >> have more than one class per document. If you have exactly one class
> >> per document, then LDA reduces to Naive Bayes (or to the unsupervised
> >> variant of Naive Bayes, which is basically k-means in multinomial
> >> space). If instead you wanted to project W words onto K topics, with
> >> K < numWords, then there is something to do...
> >>
> >> That something is something like:
> >>
> >> 1) Get p(topic|word,document) for each word in each document (which is
> >> output by LDAInference). Those are your expected counts for each
> >> topic.
> >>
> >> 2) For each class, do something like:
> >>
> >>   p(topic|class) \propto \sum_{document with that class, word}
> >>       p(topic|word,document)
> >>
> >> Then just apply Bayes' rule to do classification:
> >>
> >>   p(class|topics,document) \propto p(class) \prod p(topic|class,document)
> >>
> >> -- David
> >>
> >> On Thu, Oct 8, 2009 at 11:07 AM, Robin Anil <robin.a...@gmail.com> wrote:
> >> > Thanks. Didn't see that; fixed it!
> >> >
> >> > I have a query: how is the LDA topic model used to improve a
> >> > classifier, say Naive Bayes? If it's possible, then I would like to
> >> > integrate it into Mahout.
> >> >
> >> > Given m classes and their associated documents, one can build m topic
> >> > models, right? (A set of topics (words) under each label, and the
> >> > associated probability distribution of words.)
> >> >
> >> > How can I use that info to weight the most relevant topic of a class?
> >> >
> >> >> LDA has two meanings: linear discriminant analysis and latent
> >> >> Dirichlet allocation. My code is the latter. The former is a kind of
> >> >> classification. You say linear discriminant analysis in the outline.
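
P.S. For anyone who wants to experiment with the recipe David sketches above, here is a rough, untested Java illustration. All of the names are made up (this is not Mahout's API); it assumes you already have the p(topic|word,document) vectors from LDA inference for each word of each document, and it reads the product in the final Bayes-rule formula as an expected log-likelihood under each word's topic posterior, which is my own guess at the details.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only -- these class/method names are invented,
// not part of Mahout. It assumes LDA inference has already produced
// p(topic|word,document) for every word of every document.
public class TopicNaiveBayesSketch {

  private final int numTopics;
  // p(topic|class), one vector per class label
  private final Map<String, double[]> topicGivenClass = new HashMap<>();
  // p(class), estimated from label frequencies
  private final Map<String, Double> classPrior = new HashMap<>();

  public TopicNaiveBayesSketch(int numTopics) {
    this.numTopics = numTopics;
  }

  // Steps 1 and 2: accumulate each word's p(topic|word,document) as
  // expected topic counts for the document's class, then normalize.
  public void train(List<LabeledDoc> docs) {
    Map<String, Integer> classCounts = new HashMap<>();
    for (LabeledDoc doc : docs) {
      classCounts.merge(doc.label, 1, Integer::sum);
      double[] counts =
          topicGivenClass.computeIfAbsent(doc.label, k -> new double[numTopics]);
      for (double[] wordPosterior : doc.wordTopicPosteriors) {
        for (int t = 0; t < numTopics; t++) {
          counts[t] += wordPosterior[t];  // expected count of topic t
        }
      }
    }
    for (Map.Entry<String, double[]> e : topicGivenClass.entrySet()) {
      double[] counts = e.getValue();
      double total = 0.0;
      for (double c : counts) {
        total += c;
      }
      for (int t = 0; t < numTopics; t++) {
        // small additive smoothing so no topic gets probability zero
        counts[t] = (counts[t] + 1e-9) / (total + numTopics * 1e-9);
      }
      classPrior.put(e.getKey(), (double) classCounts.get(e.getKey()) / docs.size());
    }
  }

  // Bayes' rule in log space: score(class) = log p(class) plus, for each
  // word, the expectation of log p(topic|class) under p(topic|word,document).
  public String classify(double[][] wordTopicPosteriors) {
    String best = null;
    double bestScore = Double.NEGATIVE_INFINITY;
    for (Map.Entry<String, double[]> e : topicGivenClass.entrySet()) {
      double score = Math.log(classPrior.get(e.getKey()));
      for (double[] wordPosterior : wordTopicPosteriors) {
        for (int t = 0; t < numTopics; t++) {
          score += wordPosterior[t] * Math.log(e.getValue()[t]);
        }
      }
      if (score > bestScore) {
        bestScore = score;
        best = e.getKey();
      }
    }
    return best;
  }

  // A training document: its label plus p(topic|word,document) per word.
  public static class LabeledDoc {
    final String label;
    final double[][] wordTopicPosteriors;  // [word][topic]

    public LabeledDoc(String label, double[][] wordTopicPosteriors) {
      this.label = label;
      this.wordTopicPosteriors = wordTopicPosteriors;
    }
  }
}

The smoothing constant and the log-space scoring are just the usual tricks to avoid zero probabilities and underflow; plug in whatever LDAInference actually outputs for the per-word posteriors.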
--
Zhen-Dong Zhao (Maxim)
Department of Computer Science, School of Computing
National University of Singapore
Homepage: http://zhaozhendong.googlepages.com
Mail: zhaozhend...@gmail.com