Close the zip stream in the reducer's close() method. Note that you will get an error on close if no data has been written to the stream.
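A rough sketch of the lifecycle, in case it helps. It uses only java.util.zip, so it runs without a cluster. ZipSink and the ".empty" placeholder entry are names I made up for illustration, not anything from Hadoop. In a real job the OutputStream would presumably come from something like fs.create(new Path("Output.zip")), opened in configure(); add() would be called from reduce() and close() from the reducer's close(). As far as I know the "no entries" error only shows up on older JDKs, so treat the guard as defensive.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper (not a Hadoop class): one zip stream, many entries. */
class ZipSink {
    private final ZipOutputStream zos;
    private int entries = 0;

    /** Open once, e.g. from the reducer's configure(). */
    ZipSink(OutputStream raw) {
        this.zos = new ZipOutputStream(raw);
    }

    /** Call once per value in reduce(): each value becomes one zip entry. */
    void add(String name, byte[] data) throws IOException {
        zos.putNextEntry(new ZipEntry(name));
        zos.write(data);
        zos.closeEntry();
        entries++;
    }

    /** Call from the reducer's close(). */
    void close() throws IOException {
        if (entries == 0) {
            // Assumption: some JDKs refuse to finish an archive with no
            // entries, so write a zero-length placeholder first.
            add(".empty", new byte[0]);
        }
        zos.close(); // finishes the archive and closes the underlying stream
    }

    /** Reads the entry names back, only to check the result. */
    static List<String> entryNames(byte[] zipBytes) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zis =
                 new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry e;
            while ((e = zis.getNextEntry()) != null) {
                names.add(e.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // ByteArrayOutputStream stands in for the HDFS output stream.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        ZipSink sink = new ZipSink(buf);
        sink.add("a.bin", new byte[] {1, 2, 3});
        sink.add("b.txt", "hello".getBytes(StandardCharsets.UTF_8));
        sink.close();
        System.out.println(entryNames(buf.toByteArray()));
    }
}
```

The placeholder entry is just one way to handle the empty case; you could equally skip creating the file until the first value arrives.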

On Sun, Jul 26, 2009 at 9:35 PM, Mark Kerzner <markkerz...@gmail.com> wrote:

> Now I am trying to do this:
> Open a ZipOutputStream in the static part of the Reducer, such as in
> configure(), then keep writing to this stream. I see two potential
> problems: cleanup in case of failure - I saw this discussed - and I don't
> know when to close the stream.
>
> Thank you,
> Mark
>
> On Sun, Jul 26, 2009 at 10:31 PM, Mark Kerzner <markkerz...@gmail.com> wrote:
>
> > Sorry that I still don't get it.
> >
> > I have only one reducer, so I produce one output file. In that reducer, I
> > have a standard line
> > output.collect(key, values.next());
> >
> > Each values.next() is a file, and I would like to write all of these into
> > one zip output.
> >
> > If I do as suggested
> >
> > ZipOutputStream zos = new ZipOutputStream( fs.create("Output.zip"));
> >
> > how does this zos work instead of output?
> >
> > Thank you,
> > Mark
> >
> > On Fri, Jul 24, 2009 at 9:02 AM, Jason Venner <jason.had...@gmail.com> wrote:
> >
> >> I used to write zip files in my reducer. It was very, very fast, and
> >> pulling the files out of hdfs was also very fast.
> >>
> >> In part this is because each reducer might need to write 26k individual
> >> files; by writing them as a zip file there was only 1 hdfs file.
> >> The job ran about 15x faster that way.
> >>
> >> I don't have the code handy any more but it was something on the order of
> >> ZipOutputStream zos = new ZipOutputStream( fs.create("Output.zip"));
> >> where fs is a FileSystem object.
> >>
> >> On Thu, Jul 23, 2009 at 8:48 PM, Mark Kerzner <markkerz...@gmail.com> wrote:
> >>
> >> > Thank you, MultipleOutputFormat is sufficient.
> >> > Mark
> >> >
> >> > On Thu, Jul 23, 2009 at 12:24 AM, Amogh Vasekar <am...@yahoo-inc.com> wrote:
> >> >
> >> > > Does MultipleOutputFormat suffice?
> >> > >
> >> > > Cheers!
> >> > > Amogh
> >> > >
> >> > > -----Original Message-----
> >> > > From: Mark Kerzner [mailto:markkerz...@gmail.com]
> >> > > Sent: Thursday, July 23, 2009 6:24 AM
> >> > > To: core-u...@hadoop.apache.org
> >> > > Subject: Output of a Reducer as a zip file?
> >> > >
> >> > > Hi,
> >> > > my output consists of a number of binary files, corresponding text
> >> > > files, and one descriptor file. Is there a way for my reducer to
> >> > > produce a zip of all binary files, another zip of all text ones, and
> >> > > a separate text descriptor? If not, how close to this can I get? For
> >> > > example, I could code the binary and the text into one text line of
> >> > > an output file, but then I would need some additional processing.
> >> > >
> >> > > Thank you,
> >> > > Mark
> >>
> >> --
> >> Pro Hadoop, a book to guide you from beginner to hadoop mastery,
> >> http://www.amazon.com/dp/1430219424?tag=jewlerymall
> >> www.prohadoopbook.com a community for Hadoop Professionals

--
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals