I run the KMeansDriver from Java rather than from the command line. The problem I am having is that I want to use relative paths, resolved as specified in the Hadoop configuration file, so that the output is written to HDFS rather than the local filesystem. I don't want to hard-code the full paths, because the usernames for production and development are different. The same approach works for the CanopyDriver, however. Does anyone know what could be the cause of this? Any help is much appreciated.
I'm trying to get the following to work:

    KMeansDriver kMeansDriver = new KMeansDriver();
    kMeansDriver.setConf(conf);
    kMeansDriver.run(vectorsDir, clusterIn, clusterOut,
        new TanimotoDistanceMeasure(), 0.05, 100, true, 0.05, false);

OR

    KMeansDriver.run(conf, vectorsDir, clusterIn, clusterOut,
        new TanimotoDistanceMeasure(), 0.05, 100, true, 0.05, false);

However, this works but is hard-coded:

    KMeansDriver.run(
        new Path("hdfs:localhost:9000/user/username/out/tfidf-vectors"),
        new Path("hdfs:localhost:9000/user/username/out/canopy-centroids/clusters-0-final"),
        new Path("hdfs:localhost:9000/user/username/out/clusters"),
        new TanimotoDistanceMeasure(), 0.05, 100, true, 0.05, false);

Here, conf refers to the Hadoop configuration.

This works for the CanopyDriver, though:

    CanopyDriver.run(conf, vectorsDir, canopyCentroids,
        new EuclideanDistanceMeasure(), 500, 200, false, 0.05, false);

Thanks,
Herb