> -----Original Message-----
> From: Todd Lipcon [mailto:[email protected]]
> Sent: Friday, December 17, 2010 19:23
> To: [email protected]
> Subject: Re: Simple OOM crash?
>
> On Fri, Dec 17, 2010 at 2:32 PM, Sandy Pratt <[email protected]> wrote:
> > Todd,
> >
> > While we're on the subject, and since you seem to know LZO well, can
> > you answer a few questions that have been playing around in my mind
> > lately?
> >
> > 1) Does GZ also use the Direct Memory Buffer like LZO does?
>
> I don't know much about the gzip codec, but I believe so long as you're
> using the native one (i.e. have the Hadoop native libraries installed)
> it is very similar, yes.
That makes sense.

> > 2) What size do you run with for that buffer? I kicked it up to 512m
> > the other day and I haven't seen problems, but I wonder if that's
> > overkill.
>
> Which buffer are you referring to? I don't do any particular tuning for
> the LZO codec. I do usually set io.file.buffer.size to 128KB in Hadoop,
> but that's at a different layer.

I was referring to this guy:

-XX:MaxDirectMemorySize=100m

Which I believe is used for memory mapping with java.nio. Sounds like it
won't be an issue with a leak-free lzo lib.

> > 3) How do you think LZO memory use compares to GZ? The reason I ask is
> > because ISTR reading that GZ is very light on memory. If it's
> > significantly lighter than LZO, it might be worth my while to use GZ
> > instead, even though it's slower than LZO, and use the freed memory to
> > allocate another map slot.
>
> All the LZO buffers are pooled and pretty transient so long as there
> isn't a leak (like the bug you hit). Without a leak it should be
> responsible for <1M of memory usage, in my experience.

That makes sense. If we're not allocating hundreds of megs or anything, it
shouldn't make a difference.

Thanks again for all your help.

Sandy
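[Editor's note: the -XX:MaxDirectMemorySize flag discussed above caps the total off-heap memory allocated through java.nio direct buffers, which is the memory the LZO codec's pooled buffers live in. A minimal sketch of that behavior; the class name is hypothetical:]

```java
import java.nio.ByteBuffer;

public class DirectBufferSketch {
    public static void main(String[] args) {
        // Direct buffers are allocated outside the Java heap; their
        // aggregate size is capped by -XX:MaxDirectMemorySize.
        ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024);
        System.out.println(buf.isDirect());  // prints "true"
        System.out.println(buf.capacity());  // prints "65536"
        // A leaky codec that kept allocating direct buffers without
        // dropping references would eventually fail with
        // "OutOfMemoryError: Direct buffer memory" once this cap is hit,
        // which is consistent with the OOM crash in the subject line.
    }
}
```

[Run with e.g. `java -XX:MaxDirectMemorySize=100m DirectBufferSketch` to set the cap explicitly; with a leak-free codec the pooled buffers stay far below it.]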
