To clarify:
 
     static class TestOutputFormat
         implements OutputFormat <Text, Text>
     {
         static class TestRecordWriter
             implements RecordWriter <Text, Text>
         {
             TestOutputFormat output;
             
             public TestRecordWriter (TestOutputFormat output, org.apache.hadoop.fs.FileSystem ignored,
                 JobConf job, String name, Progressable progress)
             {
                 this.output = output;
             }
             
             public void close (Reporter reporter)
             {}
             
             public void write (Text key, Text value)
             {
                 output.addResults (value.toString ());
             }
         }
         
         protected String results = "";
                 
         public void checkOutputSpecs (org.apache.hadoop.fs.FileSystem ignored, JobConf job)
             throws IOException
         {}
         
         public RecordWriter <Text, Text> getRecordWriter (org.apache.hadoop.fs.FileSystem ignored,
             JobConf job, String name, Progressable progress)
         {
             return new TestRecordWriter (this, ignored, job, name, progress);
         }
         
         public void addResults (String r)
         {
             results += r + ",";
         }
         
         public String getResults ()
         {
             return results;
         }
     }

 And then running the task:
     public int run (String[] args)
         throws Exception
     {
         ....
         JobClient.runJob (job);

         // getOutputFormat() creates a new instance of the output format. I want to
         // get the instance of the output format that the reduce function wrote to.
         // The RecordWriter that reduce wrote to would be just as good.
         TestOutputFormat results = (TestOutputFormat) job.getOutputFormat ();
   
         // Always prints the empty string, not the populated results
         System.out.println ("results: " + results.getResults ());
         
         return 0;
     }
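 Since the reduce tasks run in separate child JVMs, and getOutputFormat() reflectively creates a fresh instance in the driver, I don't see how the driver's TestOutputFormat could ever observe the writes. The workaround I'm considering is to let the job write its usual part files and read them back after runJob() returns. Here is a minimal sketch of that read-back step using plain java.nio (this assumes the output directory is on the local filesystem; against HDFS one would open the files through org.apache.hadoop.fs.FileSystem instead — the class and method names below are just illustrative, not Hadoop API):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CollectResults {
    // Concatenate the lines of every part-* file in the job's output
    // directory, comma-separated (mirroring addResults() above).
    public static String collectResults(Path outputDir) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Stream<Path> files = Files.list(outputDir)) {
            List<Path> parts = files
                .filter(p -> p.getFileName().toString().startsWith("part-"))
                .sorted()
                .collect(Collectors.toList());
            for (Path part : parts) {
                for (String line : Files.readAllLines(part)) {
                    sb.append(line).append(',');
                }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a real job output directory.
        Path dir = Files.createTempDirectory("job-output");
        Files.write(dir.resolve("part-00000"), List.of("foo", "bar"));
        System.out.println("results: " + collectResults(dir));  // results: foo,bar,
    }
}
```

 This at least keeps the inspection logic in the driver without needing to smuggle state out of the reduce JVMs.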

Derek Shaw <[EMAIL PROTECTED]> wrote:

Date: Tue, 6 May 2008 23:26:30 -0400 (EDT)
From: Derek Shaw <[EMAIL PROTECTED]>
Subject: Collecting output not to file
To: core-user@hadoop.apache.org

 Hey,

From the examples that I have seen thus far, all of the results from the
reduce function are being written to a file. Instead of writing results to a
file, I want to store them and inspect them after the job is completed. (I
think that I need to implement my own OutputCollector, but I don't know how
to tell Hadoop to use it.) How can I do this?

-Derek
