Run this code after the k-means clustering has finished.

I have arranged the code so that you can simply call the process method, passing it the path of the clusteredPoints directory inside the clustering output path, the Hadoop FileSystem and the Configuration.

Inside the loop you will find this comment:

  //use clusterId and vector here to write to a local file.

At that line the clusterId and vector of the current point are available. Use them to write to your local file.


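import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.mahout.common.iterator.sequencefile.PathFilters;

public void process(Path clusteredPoints, FileSystem fileSystem, Configuration conf)
    throws IOException, InstantiationException, IllegalAccessException {
  FileStatus[] partFiles = getAllClusteredPointPartFiles(clusteredPoints, fileSystem);
  for (FileStatus partFile : partFiles) {
    SequenceFile.Reader clusteredPointsReader =
        new SequenceFile.Reader(fileSystem, partFile.getPath(), conf);
    try {
      // Create key and value holders of whatever types the file declares.
      WritableComparable clusterIdAsKey =
          (WritableComparable) clusteredPointsReader.getKeyClass().newInstance();
      Writable vector = (Writable) clusteredPointsReader.getValueClass().newInstance();
      while (clusteredPointsReader.next(clusterIdAsKey, vector)) {
        //use clusterId and vector here to write to a local file.
      }
    } finally {
      // Close the reader even if reading a part file fails.
      clusteredPointsReader.close();
    }
  }
}

// Returns the part-* files found under the clusteredPoints directory.
private FileStatus[] getAllClusteredPointPartFiles(Path clusteredPoints, FileSystem fileSystem)
    throws IOException {
  // Glob the directory's children; the directory's own name would not pass partFilter().
  Path[] partFilePaths = FileUtil.stat2Paths(
      fileSystem.globStatus(new Path(clusteredPoints, "*"), PathFilters.partFilter()));
  return fileSystem.listStatus(partFilePaths, PathFilters.partFilter());
}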
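
For the write step marked by the comment in process, here is a minimal sketch of one way to do it. The class name ClusteredPointsLocalWriter and the tab-separated line format are my own assumptions, not part of Mahout or Hadoop. It relies only on toString() of the key and value, so it does not need to know their concrete Writable types.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not a Mahout or Hadoop class): writes one
// "clusterId<TAB>vector" line per clustered point to a local file.
final class ClusteredPointsLocalWriter implements AutoCloseable {
  private final BufferedWriter out;

  ClusteredPointsLocalWriter(File localFile) throws IOException {
    this.out = new BufferedWriter(
        new OutputStreamWriter(new FileOutputStream(localFile), StandardCharsets.UTF_8));
  }

  // Uses toString() of both arguments, so any key/value type can be passed in.
  void write(Object clusterId, Object vector) throws IOException {
    out.write(String.valueOf(clusterId));
    out.write('\t');
    out.write(String.valueOf(vector));
    out.newLine();
  }

  @Override
  public void close() throws IOException {
    out.close();
  }
}
```

You would create one writer before the loop over the part files, call write(clusterIdAsKey, vector) at the commented line, and close the writer once all part files have been read.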

Paritosh


On 25-11-2011 12:27, Rachana wrote:
Hi Ranjan,

Thank you for your response, but as I am a newbie I am a bit confused!
Where should I include this code?
Or should I run this as a separate program?


Rachana.




