Re: problem with LZO compressor on write only loads

Todd Lipcon Mon, 03 Jan 2011 16:55:38 -0800

Fishy. Are your cells particularly large? Or have you tuned the HFile block
size at all?


-Todd

On Mon, Jan 3, 2011 at 2:15 PM, Friso van Vollenhoven <
[email protected]> wrote:

> I tried it, but it doesn't seem to help. The RS processes grow to 30Gb
> within minutes after the job starts.
>
> Any ideas?
>
>
> Friso
>
>
>
> On 3 jan 2011, at 19:18, Todd Lipcon wrote:
>
> > Hi Friso,
> >
> > Which OS are you running? Particularly, which version of glibc?
> >
> > Can you try running with the environment variable MALLOC_ARENA_MAX=1 set?
> >
> > Thanks
> > -Todd
> >
> > On Mon, Jan 3, 2011 at 8:15 AM, Friso van Vollenhoven <
> > [email protected]> wrote:
> >
> >> Hi all,
> >>
> >> I seem to run into a problem that occurs when using LZO compression on a
> >> heavy write-only load. I am using 0.90 RC1 and, thus, the LZO compressor
> >> code that supports the reinit() method (from Kevin Weil's github, version
> >> 0.4.8). There are some more Hadoop LZO incarnations, so I am pointing my
> >> question to this list.
> >>
> >> It looks like the compressor uses direct byte buffers to store the
> >> original and compressed bytes in memory, so the native code can work with
> >> them without the JVM having to copy anything around. The direct buffers
> >> are possibly reused after a reinit() call, but will often be newly created
> >> in the init() method, because the existing buffer can be the wrong size
> >> for reuse. The latter case leaves the buffers previously used by the
> >> compressor instance eligible for garbage collection. I think the problem
> >> is that this collection never occurs (in time), because the GC does not
> >> consider it necessary yet. The GC does not know about the native heap and,
> >> based on the state of the JVM heap, there is no reason to finalize these
> >> objects yet. However, direct byte buffers are only freed in the finalizer,
> >> so the native heap keeps growing. On write-only loads, a full GC will
> >> rarely happen, because the max heap will not grow far beyond the mem
> >> stores (no block cache is used). So what happens is that the machine
> >> starts using swap before the GC will ever clean up the direct byte
> >> buffers. I am guessing that without the reinit() support, the buffers were
> >> collected earlier because the referring objects would also be collected
> >> every now and then, or things would perhaps just never promote to an older
> >> generation.
> >>
> >> When I do a pmap on a running RS after it has grown to some 40Gb resident
> >> size (with a 16Gb heap), it shows a lot of near-64M anon blocks
> >> (presumably native heap). I saw this before with the 0.4.6 version of
> >> Hadoop LZO, but that was under normal load. After that I went back to an
> >> HBase version that does not require the reinit(). Now I am on 0.90 with
> >> the new LZO, but never did a heavy load like this one with that, until
> >> now...
> >>
> >> Can anyone with a better understanding of the LZO code confirm that the
> >> above could be the case? If so, would it be possible to change the LZO
> >> compressor (and decompressor) to use maybe just one fixed size buffer
> >> (they all appear near 64M anyway) or possibly reuse an existing buffer
> >> also when it is not the exact required size but just large enough to make
> >> do? Having short-lived direct byte buffers is apparently a discouraged
> >> practice. If anyone can provide some pointers on what to look out for, I
> >> could invest some time in creating a patch.
> >>
> >>
> >> Thanks,
> >> Friso
> >>
> >>
> >
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
>
>


-- 
Todd Lipcon
Software Engineer, Cloudera
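
For readers following the thread: the change Friso suggests in the quoted
message (keep an existing direct buffer when it is large enough, rather than
allocating a new one whenever the size differs) could look roughly like the
sketch below. This is not code from hadoop-lzo. The class name
ReusableDirectBuffer and its methods are hypothetical and only illustrate the
idea with the standard java.nio API.

```java
import java.nio.ByteBuffer;

/**
 * Hypothetical helper (not part of hadoop-lzo): hands out one direct buffer
 * and only allocates a new one when the current one is too small.
 */
final class ReusableDirectBuffer {
    private ByteBuffer buffer;   // current direct buffer, or null before first use
    private int allocations;     // number of native allocations performed so far

    /** Returns a direct buffer with position 0 and limit == requiredSize. */
    ByteBuffer acquire(int requiredSize) {
        if (buffer == null || buffer.capacity() < requiredSize) {
            // Only grow: a buffer that is "just large enough" is kept.
            buffer = ByteBuffer.allocateDirect(requiredSize);
            allocations++;
        }
        buffer.clear();              // position = 0, limit = capacity
        buffer.limit(requiredSize);  // expose only the requested window
        return buffer;
    }

    int allocationCount() {
        return allocations;
    }

    public static void main(String[] args) {
        ReusableDirectBuffer holder = new ReusableDirectBuffer();
        holder.acquire(64 * 1024);   // first call allocates
        holder.acquire(16 * 1024);   // smaller request reuses the same buffer
        holder.acquire(128 * 1024);  // larger request forces one new allocation
        System.out.println("allocations: " + holder.allocationCount());
    }
}
```

With this approach a replaced buffer still has to wait for the GC before its
native memory is released, but replacement happens only when a request
outgrows the current capacity, not on every size mismatch.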
