Thanks for the Luke hint, I will try it out. In the meantime I noticed
something else which is very strange: I ran k-means on 23K+ docs with 50
clusters, and the top term collections of all the clusters look wrong. I would
say that for 90% of the top terms I get words which I barely recognize.
I did a quick check on one particular term, which sounded strange anyway and
which appears in the top terms of 9 of the 50 clusters, and found that it has
"doc freq" = 2 in the Solr dictionary.
How is this even possible? With 23,000 docs, a term which is mentioned only
2 times shows up as a top term in 9 clusters. I definitely did something
wrong; do you have an idea what that could be?
