On Wed, Sep 3, 2008 at 1:24 AM, Shirley Cohen <[EMAIL PROTECTED]> wrote:
Hi,

I'm trying to write the output of two different map-reduce jobs into
the same output directory. I'm using MultipleOutputFormats to set the
filename dynamically, so there is no filename collision between the
two jobs. However, I'm getting the error "output directory already
exists".

Does the framework support this functionality? It seems silly to have
to create a temp directory to store the output files from the second
job and then have to copy them to the first job's output directory
after the second job completes.

You basically have to work with the framework. So far, when I've had
to sort, split, combine, etc. my data, I've put another job in my
pipeline to shuffle data around, then worried about efficiency later.
This one could be done with a job that takes two input directories and
uses a no-op mapper like IdentityMapper or cat.
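
To make the idea concrete, here is a rough local sketch in Python. It is not Hadoop code: it only imitates, on an ordinary filesystem, what a merge job with two input directories and a no-op mapper would do. The names identity_map and merge_job_outputs, and the single part-00000 output file, are made up for illustration.

```python
import os
import tempfile


def identity_map(lines):
    """No-op mapper: yield every input record unchanged (what cat does)."""
    for line in lines:
        yield line


def merge_job_outputs(input_dirs, output_dir):
    """Local stand-in for a merge job (hypothetical helper, not a Hadoop API).

    Runs identity_map over every regular file in each input directory and
    writes all records into one file in a fresh output directory.
    """
    # Mirror the behaviour in the question: refuse an existing output directory.
    os.makedirs(output_dir, exist_ok=False)
    out_path = os.path.join(output_dir, "part-00000")
    with open(out_path, "w") as out:
        for input_dir in input_dirs:
            for name in sorted(os.listdir(input_dir)):
                path = os.path.join(input_dir, name)
                if not os.path.isfile(path):
                    continue  # skip subdirectories such as job logs
                with open(path) as src:
                    out.writelines(identity_map(src))
    return out_path
```

Calling merge_job_outputs(["job1-out", "job2-out"], "merged") on two existing local directories would leave both untouched and put the combined records under a new "merged" directory. In a real pipeline the framework would do the reading and writing, and the mapper would be IdentityMapper or cat rather than this function.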
Karl Anderson
[EMAIL PROTECTED]
http://monkey.org/~kra