To clarify:

static class TestOutputFormat implements OutputFormat<Text, Text> {

    static class TestRecordWriter implements RecordWriter<Text, Text> {
        TestOutputFormat output;

        public TestRecordWriter(TestOutputFormat output,
                                org.apache.hadoop.fs.FileSystem ignored,
                                JobConf job, String name,
                                Progressable progress) {
            this.output = output;
        }

        public void close(Reporter reporter) {}

        public void write(Text key, Text value) {
            output.addResults(value.toString());
        }
    }

    protected String results = "";

    public void checkOutputSpecs(org.apache.hadoop.fs.FileSystem ignored,
                                 JobConf job) throws IOException {}

    public RecordWriter<Text, Text> getRecordWriter(
            org.apache.hadoop.fs.FileSystem ignored, JobConf job,
            String name, Progressable progress) {
        return new TestRecordWriter(this, ignored, job, name, progress);
    }

    public void addResults(String r) { results += r + ","; }

    public String getResults() { return results; }
}
And then running the task:

public int run(String[] args) throws Exception {
    ...
    JobClient.runJob(job);

    // getOutputFormat() creates a new instance of the output format.
    // I want to get the instance of the output format that the reduce
    // function wrote to. (The RecordWriter that reduce wrote to would
    // be just as good.)
    TestOutputFormat results = (TestOutputFormat) job.getOutputFormat();

    // Always prints the empty string, not the populated results
    System.out.println("results: " + results.getResults());
    return 0;
}

Derek Shaw <[EMAIL PROTECTED]> wrote:

Date: Tue, 6 May 2008 23:26:30 -0400 (EDT)
From: Derek Shaw <[EMAIL PROTECTED]>
Subject: Collecting output not to file
To: core-user@hadoop.apache.org

Hey,

From the examples that I have seen thus far, all of the results from the
reduce function are being written to a file. Instead of writing results to
a file, I want to store them and inspect them after the job is completed.
(I think that I need to implement my own OutputCollector, but I don't know
how to tell Hadoop to use it.) How can I do this?

-Derek
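For what it's worth, the reason getResults() returns the empty string is that the reduce tasks run in separate JVMs from the job client, so anything accumulated in an OutputFormat (or RecordWriter) instance on the task side is never visible to the TestOutputFormat instance you get back in run(). One workaround is to let the job write its output normally and then read the reducers' part files back through the FileSystem API after runJob() returns. A minimal sketch along those lines, assuming a text output format; the class name CollectResults, the collect() helper, and the hard-coded output path are my own illustration, not from the original code:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class CollectResults {

    // Join every line of a reader with a trailing comma, mirroring the
    // "results += r + \",\"" accumulation in TestOutputFormat.
    static String collect(Reader reader) throws IOException {
        BufferedReader in = new BufferedReader(reader);
        StringBuilder sb = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null) {
            sb.append(line).append(",");
        }
        return sb.toString();
    }

    public int run(JobConf job) throws Exception {
        Path outDir = new Path("/tmp/job-output"); // assumed output dir
        FileOutputFormat.setOutputPath(job, outDir);
        // ... rest of job setup elided ...
        JobClient.runJob(job);

        // The reducers ran in other JVMs; read their part-NNNNN files
        // back from the (possibly distributed) file system instead of
        // expecting in-memory state to survive.
        FileSystem fs = outDir.getFileSystem(job);
        StringBuilder results = new StringBuilder();
        for (FileStatus stat : fs.listStatus(outDir)) {
            if (!stat.getPath().getName().startsWith("part-")) {
                continue; // skip _logs etc.
            }
            Reader r = new InputStreamReader(fs.open(stat.getPath()));
            results.append(collect(r));
            r.close();
        }
        System.out.println("results: " + results);
        return 0;
    }
}
```

This keeps the job itself unchanged (no custom OutputCollector needed) and confines the "collect everything into a string" step to the client side, where the results are actually inspected.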