Using the Reuters-21578 data set, I compared two approaches: running the cluster_reuters.sh script, versus first ingesting the data into Solr and then invoking Mahout on the Solr index (clustering on the contents of the field "text", defined as [...]) and executing a similar set of commands.

I get vastly differing results: the Lucene/k-means approach yields 20 clusters, whereas the Solr approach yields just one cluster.

I'm obviously doing something wrong. Any pointers?

Thanks