Re: extract p(doc|topic) from LDA

Avishay Livne1 Mon, 07 Jun 2010 05:08:16 -0700

I modified
$MAHOUT_HOME/utils/src/main/java/org/apache/mahout/clustering/lda/LDAPrintTopics.java
so that the score is printed alongside each word, but the interpretation of the
scores is somewhat obscure.
I see values in the range of -8 to +6. I assumed the values would
represent P(word | topic) or log(P(word | topic)), but neither fits this
range: a probability must lie in [0, 1], and a log-probability cannot be
positive.
How should I interpret these values? Is there a simple way to retrieve
P(word | topic)?
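
To show what I have in mind: if the printed scores turn out to be unnormalized log weights (an assumption on my part, not something I have confirmed in the Mahout source), then a log-sum-exp normalization over all words of one topic would recover P(word | topic). A minimal sketch follows; the class name LogScoreNormalizer and the sample scores are made up for illustration and are not part of Mahout.

```java
// Hypothetical helper, not part of Mahout.
// Assumes the scores are unnormalized log weights for a single topic.
final class LogScoreNormalizer {
    private LogScoreNormalizer() {}

    /** Converts log weights into probabilities that sum to 1. */
    static double[] normalize(double[] logScores) {
        // Subtract the maximum before exponentiating to avoid overflow.
        double max = Double.NEGATIVE_INFINITY;
        for (double s : logScores) {
            max = Math.max(max, s);
        }
        double sum = 0.0;
        for (double s : logScores) {
            sum += Math.exp(s - max);
        }
        double logNormalizer = max + Math.log(sum);

        double[] probs = new double[logScores.length];
        for (int i = 0; i < logScores.length; i++) {
            probs[i] = Math.exp(logScores[i] - logNormalizer);
        }
        return probs;
    }

    public static void main(String[] args) {
        double[] scores = {6.0, 2.5, -8.0}; // made-up values in the range I observed
        for (double p : normalize(scores)) {
            System.out.println(p);
        }
    }
}
```

The normalization has to run over every word in the topic. Applied only to the top words that LDAPrintTopics prints, it would give a distribution over just those words.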


Thanks,
Avishay.


                                                                                
                                                    
  From:       Avishay Livne1/Haifa/i...@ibmil
  To:         [email protected]
  Date:       06/06/2010 03:16 PM
  Subject:    extract p(doc|topic) from LDA

Hi,

I'm trying to use LDA for a collaborative filtering task, where I need to
predict the rating a user (document) will give to a movie (word).
I ran LDA and constructed T topics, but I can only print the most frequent
words (movies) per topic.
Is it possible to extract p(document|topic) or p(word|topic) from LDA's
output? (document = new user, word = movie).
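
To make the question concrete, this is how the quantities relate under the standard LDA model. The symbols \theta, \phi and the document prior p(d) are my own notation, introduced here for illustration; they are not names from Mahout's output.

```latex
% \theta_{d,t} = p(t \mid d): topic mixture of document (user) d
% \phi_{t,w}   = p(w \mid t): word (movie) distribution of topic t
\begin{align*}
  p(w \mid d) &= \sum_{t=1}^{T} \phi_{t,w}\,\theta_{d,t} \\
  p(d \mid t) &= \frac{\theta_{d,t}\,p(d)}{\sum_{d'} \theta_{d',t}\,p(d')}
\end{align*}
```

The first line is the quantity I would use to score a movie for a user. The second is p(document|topic), obtained from the per-document topic mixtures by Bayes' rule.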

Best regards,
Avishay
