I forgot to mention that I use Hadoop in pseudo-distributed mode. On Wed, Apr 11, 2012 at 8:33 PM, Koert Kuipers <ko...@tresata.com> wrote:
> In case someone else ever runs into this: the issue was that in my reducer
> I used a Hadoop FileSystem which I closed after I was done with it.
> Apparently one shouldn't close these, since they are shared or singletons...
> I used it to open a file from HDFS for a parallel merge sort. I created the
> FileSystem in my configure() method and closed it in my close() method of
> the reducer. Bad idea, apparently. Removing the fs.close() call solved the issue.
>
> On Wed, Apr 11, 2012 at 1:02 PM, Koert Kuipers <ko...@tresata.com> wrote:
>
>> I have a simple map-reduce job that I test with only 2 mappers, 2
>> reducers, and very small input (10 lines of text).
>>
>> It runs fine without compression, but as soon as I turn on compression
>> (mapred.compress.map.output=true), the output files (part-00000.snappy,
>> etc.) are empty: zero records. Using logging I can see that my reducer
>> successfully calls output.collect(key, value), yet the records don't show up
>> in the file. I tried both Snappy and gzip. Do I need to do some sort of
>> flushing?
>>
>> I am on Hadoop 0.20.2
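
For anyone who hits the same problem, here is a minimal sketch of what the fixed reducer looks like on the old 0.20 mapred API (class and field names are made up, not from the original job): FileSystem.get() is called once in configure(), used during reduce(), and deliberately left open in close(), because FileSystem.get() returns a cached instance that the framework's own record writers may also be using.

    import java.io.IOException;
    import java.util.Iterator;

    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reducer;
    import org.apache.hadoop.mapred.Reporter;

    // Hypothetical reducer illustrating the fix described above.
    public class MergeSortReducer extends MapReduceBase
        implements Reducer<Text, Text, Text, Text> {

      private FileSystem fs;

      @Override
      public void configure(JobConf job) {
        try {
          // FileSystem.get() hands back a shared, cached instance keyed by
          // scheme/authority/user; the task's output writers use the same one.
          fs = FileSystem.get(job);
        } catch (IOException e) {
          throw new RuntimeException(e);
        }
      }

      @Override
      public void reduce(Text key, Iterator<Text> values,
          OutputCollector<Text, Text> output, Reporter reporter)
          throws IOException {
        // ... open side files from HDFS via fs for the merge sort, then emit ...
        while (values.hasNext()) {
          output.collect(key, values.next());
        }
      }

      @Override
      public void close() throws IOException {
        // Do NOT call fs.close() here: closing the shared FileSystem also
        // closes it underneath the framework's record writers, which can
        // silently drop the reducer's output (compressed writers in
        // particular only flush their codec stream on close).
      }
    }

If the reducer really needs a private instance it can close, FileSystem.newInstance(conf) (available in later Hadoop versions) avoids the shared cache, but on 0.20.2 the simplest fix is just not closing the shared one.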