Hi guys, I ran CVB on a set of very small text documents. They're so small (often only 1-2 terms) that I don't expect good results; CVB was too slow on my larger documents, so I just wanted to see whether I could get a run to complete at all. The same dataset gives reasonable results with kmeans and canopy/kmeans, so there are associations to capture. When I dumped the vectors at the end, they all seem to contain the same terms: the top 40 out of a dictionary of about 10,500. I used the default smoothing parameters and asked for 40 topics (is it a coincidence that 40 is also the number of features in the vectors that come out?)
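In case it helps, here is roughly what I ran. The "work/" paths are just placeholders and the flag spellings are from memory, so treat this as approximate rather than the exact commands:

  # CVB over the rowid-converted tf vectors: 40 topics, default -a/-b smoothing
  mahout cvb -i work/matrix \
    -dict work/dictionary.file-0 \
    -o work/cvb/topic-term \
    -dt work/cvb/doc-topic \
    -mt work/cvb/model-temp \
    -k 40

  # dump the resulting vectors against the dictionary
  mahout vectordump -i work/cvb/topic-term \
    -d work/dictionary.file-0 -dt sequencefile \
    -o topics.txt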
Does anyone know what might be causing this? Some possibilities I can think of: the output isn't what I think it is (top terms and weights for each topic), the smoothing parameters are too small or too large, there are too many features in the vectors, the documents/vectors are too small, or some combination of these.