On Tue, Sep 2, 2008 at 10:24 AM, Shirley Cohen <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I'm trying to write the output of two different map-reduce jobs into the
> same output directory. I'm using MultipleOutputFormats to set the filename
> dynamically, so there is no filename collision between the two jobs.
> However, I'm getting the error "output directory already exists".


You just need to define a new OutputFormat that derives from the one that
you are actually using for the second job. For example, if your second job
uses TextOutputFormat, you could derive a subtype whose checkOutputSpecs
always returns normally, even if the directory already exists. Something like:

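{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.Progressable;

public class NoClobberTextOutputFormat<K, V> extends TextOutputFormat<K, V> {
  // Suffix the file name so it cannot collide with the first job's output.
  @Override
  public RecordWriter<K, V> getRecordWriter(FileSystem ignored, JobConf job,
                                            String name, Progressable progress)
      throws IOException {
    return super.getRecordWriter(ignored, job, name + "-second", progress);
  }

  // Skip the default check that fails when the output directory exists.
  @Override
  public void checkOutputSpecs(FileSystem fs, JobConf conf) { }
}
{code}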
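
If it helps to see the mechanism without a cluster, here is a minimal plain-Java sketch of the same pattern. StrictOutput and NoClobberOutput are hypothetical stand-ins invented for illustration, not Hadoop classes: the base class refuses an existing directory, and the subclass overrides that check with a no-op and suffixes the file name.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class OutputCheckSketch {
  // Hypothetical stand-in for an output format whose default check
  // rejects an output directory that already exists.
  static class StrictOutput {
    void checkOutputSpecs(Path dir) throws IOException {
      if (Files.exists(dir)) {
        throw new IOException("output directory already exists: " + dir);
      }
    }

    Path write(Path dir, String name, String data) throws IOException {
      checkOutputSpecs(dir);
      Files.createDirectories(dir);
      return Files.write(dir.resolve(name), data.getBytes(StandardCharsets.UTF_8));
    }
  }

  // Variant for the second job: no directory check, suffixed file name.
  static class NoClobberOutput extends StrictOutput {
    @Override
    void checkOutputSpecs(Path dir) { }

    @Override
    Path write(Path dir, String name, String data) throws IOException {
      return super.write(dir, name + "-second", data);
    }
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("jobs").resolve("out");
    new StrictOutput().write(dir, "part-00000", "first job\n");
    Path second = new NoClobberOutput().write(dir, "part-00000", "second job\n");
    System.out.println(second.getFileName()); // prints part-00000-second
  }
}
```

The second write succeeds only because the overridden check does nothing, and the suffix keeps it from overwriting the first job's file.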

-- Owen
