Re: LDA and utils.vectors.TermEntry

Grant Ingersoll Wed, 23 Sep 2009 14:33:25 -0700

The term entries are used to map the text to a position in the Vector. So, the readDictionary is just loading up that mapping such that when it examines the vector it can print out that term 14534 is really "foobar", or whatever.

There may be an abstraction to be made here, but I'd have to dig a little deeper into the code to say for sure.
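
The mapping described above can be sketched roughly as follows. This is only an illustration, not Mahout's code: the class TermDictionarySketch, its nested Entry type, and the methods indexToTerm and describe are hypothetical names made up for this example, and the vector is assumed to be a plain double[] rather than a Mahout Vector. The Entry fields simply mirror the (term, termIdx, docFreq) triple discussed in this thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; this is not Mahout's TermEntry or TermInfo.
class TermDictionarySketch {

    /** Mirrors the (term, termIdx, docFreq) triple from the thread. */
    static final class Entry {
        final String term;
        final int termIdx;  // position of this term in every document vector
        final int docFreq;  // number of documents containing the term

        Entry(String term, int termIdx, int docFreq) {
            this.term = term;
            this.termIdx = termIdx;
            this.docFreq = docFreq;
        }
    }

    /** Builds the index-to-term lookup that a dictionary reader would load. */
    static Map<Integer, String> indexToTerm(List<Entry> entries) {
        Map<Integer, String> lookup = new HashMap<>();
        for (Entry e : entries) {
            String previous = lookup.put(e.termIdx, e.term);
            if (previous != null) {
                // Two terms sharing one vector position would make the output ambiguous.
                throw new IllegalArgumentException("duplicate term index " + e.termIdx);
            }
        }
        return lookup;
    }

    /** Renders the non-zero positions of a dense vector as "term:weight" pairs. */
    static String describe(double[] vector, Map<Integer, String> lookup) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < vector.length; i++) {
            if (vector[i] != 0.0) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                // Positions with no dictionary entry are printed as "?".
                sb.append(lookup.getOrDefault(i, "?")).append(':').append(vector[i]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Entry> entries = new ArrayList<>();
        entries.add(new Entry("foobar", 2, 7));
        entries.add(new Entry("baz", 0, 3));

        Map<Integer, String> lookup = indexToTerm(entries);
        double[] vector = {1.5, 0.0, 4.0};
        System.out.println(describe(vector, lookup));
    }
}
```

In this sketch the index is what ties a dictionary entry to a vector position, so entries can be stored in any order as long as each one carries the position it was assigned when the vectors were built.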



On Sep 23, 2009, at 4:58 PM, Jack Tanner wrote:

The TermEntry constructor is (String term, int termIdx, int docFreq). What's the point of termIdx? I see that it gets used for an assert in LDAPrintTopics.java:readDictionary(), but it seems redundant otherwise.

(Background: I'd like to generate vectors for LDA directly, bypassing Lucene. Following o.a.m.utils.vectors.lucene.Driver, I see that I need to generate a dictionary file for the "printing out top terms per topic" step. This uses TermInfo, which contains lots of TermEntry elements.)


--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:

http://www.lucidimagination.com/search
