The value k will dictate how many topics are output.
There should be no more and no fewer than that.
In your cvb output there should be as many (term | topic) distributions as
there are topics, and in the (document | topic) distributions there should be
as many vectors as there are documents.
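
If it helps, here is a rough sketch of that sanity check in Python. It assumes you have already dumped the two cvb outputs into NumPy arrays by some means of your own (the loading step is not shown), and the function name check_cvb_shapes and the toy sizes are made up for illustration, not part of Mahout.

```python
import numpy as np


def check_cvb_shapes(topic_term, doc_topic, k, num_docs):
    """Return a list of shape problems; an empty list means the counts match."""
    problems = []
    # One (term | topic) distribution per topic: expect exactly k rows.
    if topic_term.shape[0] != k:
        problems.append(f"expected {k} topic rows, found {topic_term.shape[0]}")
    # One (document | topic) vector per document: expect num_docs rows.
    if doc_topic.shape[0] != num_docs:
        problems.append(f"expected {num_docs} document rows, found {doc_topic.shape[0]}")
    # Each document vector spans the topics, so it should have k entries.
    if doc_topic.shape[1] != k:
        problems.append(f"expected {k} entries per document, found {doc_topic.shape[1]}")
    return problems


# Toy stand-ins for real output (hypothetical sizes, random values).
k, num_docs, vocab_size = 3, 5, 7
rng = np.random.default_rng(0)
topic_term = rng.dirichlet(np.ones(vocab_size), size=k)  # shape (k, vocab_size)
doc_topic = rng.dirichlet(np.ones(k), size=num_docs)     # shape (num_docs, k)

problems = check_cvb_shapes(topic_term, doc_topic, k, num_docs)
print(problems or "shapes look consistent")
```

If the list comes back non-empty, the count that disagrees with k or with your document count is the first thing to look at.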
i.e.
On a related note, I believe I have found a bug in the cvb implementation.
How do I go about getting it fixed?
On 31 Jan 2013, at 02:50, Andy Schlaikjer andrew.schlaik...@gmail.com wrote:
I assume you mean input *matrix* with
So the bug I found results in the document topic model being trained on a
random matrix as opposed to the final (term|topic probability) distributions.
Unless a bug fix has been released, this happens in all cases, at least for me.
The result of which is a random (document|topic) model, with more