RowId creates a matrix and a docIndex, which are <IntWritable, VectorWritable> and <IntWritable, Text> respectively. The docIndex is what maps each integer row id back to the original text key.
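In case it helps, here is a rough sketch of the join itself in plain Java. It assumes you have already read the docIndex into a Map<Integer, String> and the document/topic output into a Map<Integer, double[]> (for example with Hadoop's SequenceFile.Reader), and that the document/topic output is keyed by the same integer row ids as the matrix. The class name, method name, and sample keys are made up for illustration; none of this is Mahout API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Mahout: re-keys per-document topic
// distributions from integer row ids back to the original document keys.
class RowIdJoin {

    static Map<String, double[]> joinOnRowId(Map<Integer, String> docIndex,
                                             Map<Integer, double[]> docTopics) {
        Map<String, double[]> byOriginalKey = new LinkedHashMap<>();
        for (Map.Entry<Integer, double[]> entry : docTopics.entrySet()) {
            // docIndex holds rowId -> original key, as written by `mahout rowid`.
            String originalKey = docIndex.get(entry.getKey());
            if (originalKey == null) {
                throw new IllegalStateException(
                        "row id " + entry.getKey() + " is missing from docIndex");
            }
            byOriginalKey.put(originalKey, entry.getValue());
        }
        return byOriginalKey;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the contents of the two sequence files.
        Map<Integer, String> docIndex = new LinkedHashMap<>();
        docIndex.put(0, "/docs/first.txt");
        docIndex.put(1, "/docs/second.txt");

        Map<Integer, double[]> docTopics = new LinkedHashMap<>();
        docTopics.put(0, new double[] {0.9, 0.1});
        docTopics.put(1, new double[] {0.2, 0.8});

        // Print each original key followed by its topic probabilities.
        for (Map.Entry<String, double[]> e : joinOnRowId(docIndex, docTopics).entrySet()) {
            System.out.println(e.getKey() + " -> " + java.util.Arrays.toString(e.getValue()));
        }
    }
}
```

Failing loudly on a missing row id is deliberate: if the two files came from different runs, a silent skip would hide the mismatch.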
Have you looked at LDAPrintTopics.java?

On Thu, Apr 24, 2014 at 7:32 PM, Mohammed Omer <beancinemat...@gmail.com> wrote:

> Good evening all.
>
> This is my first time working with Mahout, and I'm really excited about
> being able to stand on the shoulders of giants, thanks to your hard work
> on the project.
>
> I'm 90% of the way there with my current Mahout project, but that last
> 10% is killing me.
>
> Code is at https://github.com/momer/mahout_difficulties if you want to
> skip my explanation and go right to the commands I ran, etc.
>
> Using a Lucene index and Mahout's robust CLI, I was able to generate
> sequence files; sparse vectors; convert those vector keys to integers;
> and as a result, run the CVB/LDA Algorithm.
>
> This worked great, and I was able to dump out the p(doc|topic) and
> p(topic|term) results; but, I'm having a tough time figuring out how to
> use the matrix generated by `mahout rowid` to map the documents and
> their respective topic-assignments/probabilities back to their original
> text vector keys.
>
> Though I'm typically a Rubyist, and having recently (last weekend)
> read/worked through the entirety of Core Java vol 1, I'm pretty
> comfortable with Java. I am falling on my face at this last step,
> though.
>
> I appreciate the eyes and help!
>
> Thank you again,
>
> Mo