The data being clustered comes from an index built over a collection of PubMed abstracts. The index is passed through a TFDFMapper using the tf-idf weighting scheme, and a points file is generated using the LuceneIterable class. This file is the input to the KMeansDriver program. The code that does this is essentially the same as the code in the util.vectors.lucene.Driver class.
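To make Ted's point about k concrete, here is a minimal sketch in plain Java (no Mahout classes) of a single k-means assignment step in which one centroid ends up with no points. The class name EmptyClusterDemo, the sample points, and the seed centroids are all made up for illustration. Whether this is what actually triggers the exception in KMeansClusterMapper.configure is not established here.

```java
import java.util.Arrays;

// Hypothetical illustration only; not Mahout code.
class EmptyClusterDemo {

    // Assign each point to its nearest centroid and return how many
    // points each centroid received.
    static int[] assignmentCounts(double[][] points, double[][] centroids) {
        int[] counts = new int[centroids.length];
        for (double[] p : points) {
            int best = 0;
            double bestDist = Double.MAX_VALUE;
            for (int c = 0; c < centroids.length; c++) {
                double d = squaredDistance(p, centroids[c]);
                if (d < bestDist) { // strict '<': ties go to the lower index
                    bestDist = d;
                    best = c;
                }
            }
            counts[best]++;
        }
        return counts;
    }

    // Squared Euclidean distance between two vectors of equal length.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Four points forming two natural groups.
        double[][] points = { {0.0, 0.1}, {0.1, 0.0}, {5.0, 5.1}, {5.1, 5.0} };
        // k = 3: the third seed is far from every point.
        double[][] centroids = { {0.0, 0.0}, {5.0, 5.0}, {50.0, 50.0} };
        // Prints [2, 2, 0]: the third cluster is empty.
        System.out.println(Arrays.toString(assignmentCounts(points, centroids)));
    }
}
```

If the initial centroids are chosen at random (an assumption about the setup, not something verified against Mahout 0.3), the outcome would differ from run to run on the same data, which would fit the error appearing only sometimes.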

Arshad

On Thu, Apr 1, 2010 at 1:55 AM, Ted Dunning <[email protected]> wrote:

> Empty clusters are not that uncommon with k-means if you specify too large a
> value for k.
>
> Arshad, can you say more about what data you are clustering?
>
> On Wed, Mar 31, 2010 at 6:29 AM, Grant Ingersoll <[email protected]> wrote:
>
> > Can you share the parameters you used to get this? Does it happen every
> > time?
> >
> > On Mar 29, 2010, at 11:53 PM, Arshad Khan wrote:
> >
> > > Hello All
> > >
> > > While using Mahout 0.3 KMeansDriver I am encountering an exception
> > > indicating an empty cluster. This happens sometimes while re-running the
> > > clustering on the same data set. Is there a way to prevent this error? The
> > > exception trace is follows:
> > >
> > > java.lang.RuntimeException: Error in configuring object
> > >     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
> > >     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
> > >     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> > >     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
> > >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > >     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:176)
> > > Caused by: java.lang.reflect.InvocationTargetException
> > >     at sun.reflect.GeneratedMethodAccessor39.invoke(Unknown Source)
> > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > >     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
> > >     ... 5 more
> > > Caused by: java.lang.RuntimeException: Error in configuring object
> > >     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
> > >     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
> > >     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> > >     at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
> > >     ... 9 more
> > > Caused by: java.lang.reflect.InvocationTargetException
> > >     at sun.reflect.GeneratedMethodAccessor39.invoke(Unknown Source)
> > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > >     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
> > >     ... 12 more
> > > Caused by: java.lang.IllegalStateException: Cluster is empty!
> > >     at org.apache.mahout.clustering.kmeans.KMeansClusterMapper.configure(KMeansClusterMapper.java:73)
> > >     ... 16 more
> > >
> > > Thanks
> > > Arshad
> >
> > --------------------------
> > Grant Ingersoll
> > http://www.lucidimagination.com/
> >
> > Search the Lucene ecosystem using Solr/Lucene:
> > http://www.lucidimagination.com/search
