Re: MR output to a file instead of directory?

2012-03-04 Thread Arun C Murthy
I'm not sure about the use case, but if you really care you can use an
existing directory (e.g. /) by writing a bit of code to bypass the check
for output-dir existence.

By default, FileOutputFormat assumes the output-dir shouldn't exist and will
error out during job init (in checkOutputSpecs) if it does. You could
subclass it and override that check so it doesn't bother to look.

Arun

On Mar 2, 2012, at 4:38 PM, Jianhui Zhang wrote:

> Hi all,
> 
> The FileOutputFormat/FileOutputCommitter always treats an output path
> as a directory and writes files under it, even if there is only one
> Reducer. Is there any way to configure an OutputFormat to write all
> data into a file?
> 
> Thanks,
> James

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: MR output to a file instead of directory?

2012-03-03 Thread Harsh J
James,

This is _possible_, but you will need to write your own OutputFormat and
OutputCommitter pair to do the work for you, as
File{OutputFormat,OutputCommitter} work with directories. The biggest
advantage of having output directories is that they allow temporary
per-attempt directories and output committing (which is what makes
speculative execution and task failure handling safe), described at
http://wiki.apache.org/hadoop/FAQ#Can_I_write_create.2BAC8-write-to_hdfs_files_directly_from_map.2BAC8-reduce_tasks.3F
You'd need something equivalent to this for a complete solution.
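
To make the attempt-directory idea concrete, here is a rough sketch of the
pattern on a local filesystem using only java.nio. It is an analogy, not
Hadoop's FileOutputCommitter code: the class name, method names, and the
"_temporary"/"part-r-00000" layout used here are assumptions chosen for
illustration. Each attempt writes only under its own private directory, one
attempt is promoted by a rename, and the leftovers are deleted.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Local-filesystem analogy of attempt directories plus a commit step (hypothetical names). */
final class AttemptCommitSketch {

    /** A task attempt writes only under its own private directory. */
    static Path writeAttempt(Path outputDir, String attemptId, List<String> records)
            throws IOException {
        Path attemptDir = outputDir.resolve("_temporary").resolve(attemptId);
        Files.createDirectories(attemptDir);
        Files.write(attemptDir.resolve("part-r-00000"), records, StandardCharsets.UTF_8);
        return attemptDir;
    }

    /** Commit: promote exactly one attempt's file into the output directory. */
    static Path commit(Path outputDir, Path attemptDir) throws IOException {
        Path src = attemptDir.resolve("part-r-00000");
        Path dst = outputDir.resolve("part-r-00000");
        // Files.move without options fails if dst already exists, so a second
        // (speculative) attempt cannot silently overwrite a committed result.
        return Files.move(src, dst);
    }

    /** Cleanup: discard every uncommitted attempt (failed or speculative). */
    static void cleanup(Path outputDir) throws IOException {
        Path tmp = outputDir.resolve("_temporary");
        if (!Files.exists(tmp)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(tmp)) {
            // Reverse order lists children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempDirectory("mr-out");
        // Two attempts of the same task, e.g. an original and a speculative one.
        Path first = writeAttempt(out, "attempt_0", Arrays.asList("k1\tv1", "k2\tv2"));
        writeAttempt(out, "attempt_1", Arrays.asList("k1\tv1", "k2\tv2"));
        Path committed = commit(out, first);
        cleanup(out);
        System.out.println("committed: " + committed);
    }
}
```

If attempts wrote straight into a single shared output file, there would be
no private place for a second attempt to write and nothing to roll back when
an attempt died halfway, which is the part a custom single-file
OutputCommitter would have to solve some other way.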

On Sat, Mar 3, 2012 at 6:08 AM, Jianhui Zhang wrote:
> Hi all,
>
> The FileOutputFormat/FileOutputCommitter always treats an output path
> as a directory and writes files under it, even if there is only one
> Reducer. Is there any way to configure an OutputFormat to write all
> data into a file?
>
> Thanks,
> James



-- 
Harsh J


MR output to a file instead of directory?

2012-03-02 Thread Jianhui Zhang
Hi all,

The FileOutputFormat/FileOutputCommitter always treats an output path
as a directory and writes files under it, even if there is only one
Reducer. Is there any way to configure an OutputFormat to write all
data into a single file?

Thanks,
James