On Wed, Sep 3, 2008 at 1:24 AM, Shirley Cohen <[EMAIL PROTECTED]> wrote:

Hi,

I'm trying to write the output of two different map-reduce jobs into the same output directory. I'm using MultipleOutputFormats to set the filename dynamically, so there is no filename collision between the two jobs. However, I'm getting the error "output directory already exists".

Does the framework support this functionality? It seems silly to have to create a temp directory to store the output files from the second job and then have to copy them to the first job's output directory after the second job completes.

You basically have to work with the framework on this. So far, whenever I've had to sort, split, or combine my data, I've added another job to my pipeline to shuffle the data around and worried about efficiency later. This case could be handled by a job that reads two input directories and uses a no-op mapper such as IdentityMapper (or cat, with Hadoop Streaming).
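
A no-op mapper of the kind mentioned above could look roughly like this under Hadoop Streaming. This is only a sketch: the name identity_map is made up for illustration, and it assumes Streaming's usual convention of feeding records to the mapper one per line on stdin and collecting whatever the mapper writes to stdout. It behaves the same as cat.

```python
import sys


def identity_map(records):
    """Yield every input record unchanged (a no-op mapper).

    identity_map is a hypothetical name used for this sketch only.
    """
    for record in records:
        yield record


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Assumption: records arrive one per line on stdin and the
    # framework collects whatever is written to stdout.
    for record in identity_map(stdin):
        stdout.write(record)


if __name__ == "__main__":
    main()
```

Pointing such a job at both input directories gives one combined output directory without copying files by hand, at the cost of one extra pass over the data.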




Karl Anderson
[EMAIL PROTECTED]
http://monkey.org/~kra


