I forgot to mention that I use Hadoop in pseudo-distributed mode. On Wed, Apr 11, 2012 at 8:33 PM, Koert Kuipers <ko...@tresata.com> wrote:
> In case someone else ever runs into this: the issue was that in my reducer
> I used a Hadoop FileSystem which I closed after I was done with it.
> Apparently one shouldn't close these, since they are shared or singletons...
> I used it to open a file from HDFS for a parallel merge sort. I created the
> FileSystem in my configure() method and closed it in my close() method of
> the reducer. Bad idea, apparently. Removing the fs.close() call solved the issue.
>
> On Wed, Apr 11, 2012 at 1:02 PM, Koert Kuipers <ko...@tresata.com> wrote:
>
>> I have a simple map-reduce job that I test with only 2 mappers, 2
>> reducers, and very small input (10 lines of text).
>>
>> It runs fine without compression, but as soon as I turn on compression
>> (mapred.compress.map.output=true), the output files (part-00000.snappy,
>> etc.) are empty: zero records. Using logging I can see that my reducer
>> successfully calls output.collect(key, value), yet the records don't show up
>> in the file. I tried both Snappy and gzip. Do I need to do some sort of
>> flushing?
>>
>> I am on Hadoop 0.20.2
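
For anyone who hits the same problem, here is a minimal sketch of what the fixed reducer looks like on the old 0.20 mapred API (class and field names are made up, not from the original job): FileSystem.get() is called once in configure(), used during reduce(), and deliberately left open in close(), because FileSystem.get() returns a cached instance that the framework's own record writers may also be using.

    import java.io.IOException;
    import java.util.Iterator;

    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reducer;
    import org.apache.hadoop.mapred.Reporter;

    // Hypothetical reducer illustrating the fix described above.
    public class MergeSortReducer extends MapReduceBase
        implements Reducer<Text, Text, Text, Text> {

      private FileSystem fs;

      @Override
      public void configure(JobConf job) {
        try {
          // FileSystem.get() hands back a shared, cached instance keyed by
          // scheme/authority/user; the task's output writers use the same one.
          fs = FileSystem.get(job);
        } catch (IOException e) {
          throw new RuntimeException(e);
        }
      }

      @Override
      public void reduce(Text key, Iterator<Text> values,
          OutputCollector<Text, Text> output, Reporter reporter)
          throws IOException {
        // ... open side files from HDFS via fs for the merge sort, then emit ...
        while (values.hasNext()) {
          output.collect(key, values.next());
        }
      }

      @Override
      public void close() throws IOException {
        // Do NOT call fs.close() here: closing the shared FileSystem also
        // closes it underneath the framework's record writers, which can
        // silently drop the reducer's output (compressed writers in
        // particular only flush their codec stream on close).
      }
    }

If the reducer really needs a private instance it can close, FileSystem.newInstance(conf) (available in later Hadoop versions) avoids the shared cache, but on 0.20.2 the simplest fix is just not closing the shared one.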