Re: Output directory already exists

Karl Anderson Wed, 03 Sep 2008 10:41:10 -0700

On Wed, Sep 3, 2008 at 1:24 AM, Shirley Cohen <[EMAIL PROTECTED]> wrote:

Hi,
I'm trying to write the output of two different map-reduce jobs into the
same output directory. I'm using MultipleOutputFormats to set the filename
dynamically, so there is no filename collision between the two jobs.
However, I'm getting the error "output directory already exists".
Does the framework support this functionality? It seems silly to have to
create a temp directory to store the output files from the second job and
then have to copy them to the first job's output directory after the second
job completes.

You basically have to work with the framework. So far, when I've had to sort, split, combine, etc. my data, I put another job in my pipeline to shuffle data around, then worry about efficiency later. This one could be done with two input directories and a nop mapper like IdentityMapper or cat.
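
As a rough sketch of that last suggestion, the helper below builds a Hadoop Streaming command line that reads two input directories and uses cat as the no-op mapper and reducer. The function name, the jar path, and the directory names are hypothetical placeholders; the streaming jar location depends on the installation. The script only assembles and prints the command, it does not launch a job. The combined output directory must still be one that does not exist yet, and the merged job writes its own part files rather than the names chosen by the two original jobs.

```python
import shlex


def build_merge_command(streaming_jar, input_dirs, output_dir):
    """Return the argv list for a streaming job that merges input_dirs.

    streaming_jar, input_dirs and output_dir are caller-supplied paths;
    nothing here checks that they exist on the cluster.
    """
    if not input_dirs:
        raise ValueError("need at least one input directory")
    argv = ["hadoop", "jar", streaming_jar]
    for directory in input_dirs:
        # -input may be given once per input directory
        argv += ["-input", directory]
    argv += [
        "-output", output_dir,  # must not already exist
        "-mapper", "cat",       # no-op mapper
        "-reducer", "cat",      # no-op reducer
    ]
    return argv


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    cmd = build_merge_command(
        "hadoop-streaming.jar",
        ["job1-out", "job2-out"],
        "combined-out",
    )
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing the command first makes it easy to inspect before running it by hand or passing the list to subprocess.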





Karl Anderson
[EMAIL PROTECTED]
http://monkey.org/~kra
