Thanks, Owen. This fixed my problem!

Shirley

On Sep 2, 2008, at 8:44 PM, Owen O'Malley wrote:

On Tue, Sep 2, 2008 at 10:24 AM, Shirley Cohen <[EMAIL PROTECTED]> wrote:

Hi,

I'm trying to write the output of two different map-reduce jobs into the same output directory. I'm using MultipleOutputFormats to set the filename dynamically, so there is no filename collision between the two jobs. However, I'm getting the error "output directory already exists".


You just need to define a new OutputFormat that derives from the one that you are really using for the second job. For example, if your second job is using TextOutputFormat, you could subclass it and have checkOutputSpecs always return normally, even if the directory already exists. Something like:

{code}
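import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordWriter;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.util.Progressable;

public class NoClobberTextOutputFormat<K, V> extends TextOutputFormat<K, V> {
  // Suffix each file name so the second job's files cannot collide with the first job's.
  public RecordWriter<K, V> getRecordWriter(FileSystem ignored, JobConf job,
      String name, Progressable progress) throws IOException {
    return super.getRecordWriter(ignored, job, name + "-second", progress);
  }
  // Skip the inherited check that fails when the output directory already exists.
  public void checkOutputSpecs(FileSystem fs, JobConf conf) { }
}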
{code}

-- Owen
