Hello Hadoop Core,
I have a brief question. Our map tasks create side-effect files in
the directory returned by FileOutputFormat.getWorkOutputPath().
This works fine for producing side-effect files that the reducers
can access.
However, as these map-generated side-effect files are only of use to the
reducers, it would be nice to have them deleted from the output
directory. We can't delete them in a reducer.close(), though, as this
would make them inaccessible to other reduce tasks (speculative
or otherwise).
Any suggestions, short of deleting them after the job completes?
Craig