On Thu, 12 May 2011 09:49:23 -0700 (PDT)
Aman aman_d...@hotmail.com wrote:
Hi,
I'm running some experiments using Hadoop streaming.
I always get an output_dir/part-0 file at the end, but I wonder:
when exactly will this filename show up? When it's completely written,
or will it already show up while the MapReduce software is still
writing to it? Is the write atomic?
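
For readers unfamiliar with the idea behind the question: a common way to make a file appear under its final name only once it is complete is to write it somewhere else first and then rename it into place. The sketch below illustrates that pattern on a local POSIX filesystem in Python. It is not Hadoop's own code; the function name write_atomically and the demo values are hypothetical, and the "_temporary" subdirectory is used only to mirror the directory name mentioned in this thread.

```python
import os
import tempfile


def write_atomically(output_dir, name, data):
    """Write data to output_dir/name so a reader never sees a partial file.

    Illustrative only: the file is first written under output_dir/_temporary
    and then renamed into output_dir once it has been closed.
    """
    tmp_dir = os.path.join(output_dir, "_temporary")
    os.makedirs(tmp_dir, exist_ok=True)

    tmp_path = os.path.join(tmp_dir, name)
    with open(tmp_path, "w") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before the rename

    final_path = os.path.join(output_dir, name)
    # os.replace is a single rename; on POSIX it is atomic when source and
    # destination are on the same filesystem.
    os.replace(tmp_path, final_path)
    return final_path


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()  # throwaway directory for the demo
    path = write_atomically(demo_dir, "part-0", "key\tvalue\n")
    print(path)
```

With this pattern, a reader that only looks at output_dir either finds no part-0 at all or finds the finished file; the half-written state exists only under the temporary directory.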
The creation of the part-n files is atomic. When you run an MR job, these
files are created in the directory output_dir/_temporary and moved to
output_dir after the files are closed for writing. This move is atomic,
so as long as you don't try to read these files from the temporary directory
(which I see