Hi all,
I'm using mahout 0.7

I'm trying to use KMeansDriver 
(org.apache.mahout.clustering.kmeans.KMeansDriver) with HDFS and I'm having 
some issues.
When I use it with my local file system everything seems to be working fine.
However, as soon as I change the Configuration object to use HDFS:

        Configuration conf = new Configuration();
        conf.addResource(new Path("C:\\hdp-win\\hadoop\\hadoop-1.1.0-SNAPSHOT\\conf\\core-site.xml"));
        conf.addResource(new Path("C:\\hdp-win\\hadoop\\hadoop-1.1.0-SNAPSHOT\\conf\\hdfs-site.xml"));

I run into problems.

I was looking at the exception I get:

java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
    at java.util.ArrayList.RangeCheck(ArrayList.java:547)
    at java.util.ArrayList.get(ArrayList.java:322)
    at org.apache.mahout.clustering.classify.ClusterClassifier.readFromSeqFiles(ClusterClassifier.java:215)

I pulled up that code 
(org.apache.mahout.clustering.classify.ClusterClassifier.readFromSeqFiles, 
ClusterClassifier.java:215), and I think it is trying to read a file from the 
path I passed to the method, but with a new instance of the Configuration 
object (not the Configuration object I passed to the method, but one that 
doesn't have my HDFS settings):
205   public void readFromSeqFiles(Configuration conf, Path path) throws IOException {
206     Configuration config = new Configuration();
207     List<Cluster> clusters = Lists.newArrayList();
208     for (ClusterWritable cw : new SequenceFileDirValueIterable<ClusterWritable>(path, PathType.LIST,
209         PathFilters.logsCRCFilter(), config)) {
210       Cluster cluster = cw.getValue();
211       cluster.configure(conf);
212       clusters.add(cluster);
213     }
214     this.models = clusters;
215     modelClass = models.get(0).getClass().getName();
216     this.policy = readPolicy(path);
217   }
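
For now I'm considering working around it by registering the site files as 
default resources instead of (or in addition to) calling addResource() on my 
own conf object. This is just a sketch, assuming the Hadoop 1.x API and that 
the conf directory is on my application classpath; the class name is made up:

```java
import org.apache.hadoop.conf.Configuration;

public class HdfsConfWorkaround {
  public static void main(String[] args) {
    // Register the cluster config files as *default* resources before any
    // Mahout code runs. Every Configuration created afterwards (including
    // the one built internally by readFromSeqFiles()) will then load these
    // files from the classpath and pick up the HDFS settings.
    Configuration.addDefaultResource("core-site.xml");
    Configuration.addDefaultResource("hdfs-site.xml");

    Configuration conf = new Configuration(); // now sees the HDFS settings
    System.out.println(conf.get("fs.default.name"));
  }
}
```

But I'm not sure this is the right fix, since it depends on the conf 
directory being on the classpath rather than on the explicit paths above.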



any help would be really appreciated :)

Thanks!

Alan
