Hey Todd,

Hopefully I can get to this sometime next week. We have had our NameNode
corrupted, so we are rebuilding the prod cluster; in the meantime dev is
backing our apps, so I have no environment to give it a go. Stay tuned...

>> Yea, you're definitely on the right track. Have you considered systems
>> programming, Friso? :)


Well, at least then you get to do your own memory management most of the time...
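To make the theory concrete, here is a minimal sketch (made-up code, not the
actual hadoop-lzo source) of the pattern I suspect: a long-lived compressor
whose reinit() drops its old direct buffer and allocates a fresh one. The
native memory behind the old buffer is only released once the GC gets around
to finalizing it, and since the on-heap footprint of a ByteBuffer object is
tiny, the JVM feels little pressure to do so.

import java.nio.ByteBuffer;

public class ReinitLeakSketch {
    private ByteBuffer workBuffer; // direct; 64KB per Todd's analysis

    public void reinit() {
        // The previous buffer becomes unreachable here, but its 64KB of
        // native memory stays allocated until finalization runs.
        workBuffer = ByteBuffer.allocateDirect(64 * 1024);
    }

    public static void main(String[] args) {
        ReinitLeakSketch compressor = new ReinitLeakSketch();
        for (int i = 0; i < 100000; i++) {
            compressor.reinit(); // requests ~6GB of native memory in total
        }
        // The heap stays nearly empty, but off-heap usage balloons unless
        // the direct-memory limit forces a collection; watch the process
        // RSS or run with -XX:MaxDirectMemorySize to see the effect.
    }
}

(Below the quoted thread I have also sketched the allocation logging I
suggested earlier.)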


Friso



> Can someone who is having this issue try checking out the following git
> branch and rebuilding LZO?
> 
> https://github.com/toddlipcon/hadoop-lzo/tree/realloc
> 
> This definitely stems one leak of a 64KB direct buffer on every reinit.
> 
> -Todd
> 
> On Wed, Jan 12, 2011 at 2:12 PM, Todd Lipcon <[email protected]> wrote:
> 
>> Yea, you're definitely on the right track. Have you considered systems
>> programming, Friso? :)
>> 
>> Hopefully I'll have a candidate patch to LZO later today.
>> 
>> -Todd
>> 
>> On Wed, Jan 12, 2011 at 1:20 PM, Friso van Vollenhoven <
>> [email protected]> wrote:
>> 
>>> Hi,
>>> My guess is indeed that it has to do with using the reinit() method on
>>> compressors to make them long lived instead of throwaway, combined with
>>> the LZO implementation of reinit(), which magically causes NIO buffer
>>> objects not to be finalized and as a result not to release their native
>>> allocations. It's just a theory and I haven't had the time to properly
>>> verify it (unfortunately, I spend most of my time writing application
>>> code), but Todd said he will be looking into it further. I browsed the
>>> LZO code to see what was going on there, but with my limited knowledge
>>> of the HBase code it would be bold to say that this is for sure the
>>> case. It would be my first direction of investigation, though. I would
>>> add some logging to the LZO code where new direct byte buffers are
>>> created, recording how often that happens and what size they are, and
>>> then redo the workload that shows the leak. Together with some profiling
>>> you should be able to see how long it takes for these to get finalized.
>>> 
>>> Cheers,
>>> Friso
>>> 
>>> 
>>> 
>>> On 12 Jan 2011, at 20:08, Stack wrote:
>>> 
>>>> 2011/1/12 Friso van Vollenhoven <[email protected]>:
>>>>> No, I haven't. But the Hadoop (mapreduce) LZO compression is not the
>>>>> problem. Compressing the map output using LZO works just fine. The
>>>>> problem is HBase LZO compression. The region server process is the
>>>>> one with the memory leak...
>>>>>
>>>> 
>>>> (Sorry for dumb question Friso) But HBase is leaking because we make
>>>> use of the Compression API in a manner that produces leaks?
>>>> Thanks,
>>>> St.Ack
>>> 
>>> 
>> 
>> 
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>> 
> 
> 
> 
> -- 
> Todd Lipcon
> Software Engineer, Cloudera
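
P.S. For reference, the allocation logging I mentioned could look roughly
like this (class and method names invented for illustration; the real codec
would call its own logger rather than System.err):

import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicLong;

public final class DirectBufferAudit {
    private static final AtomicLong COUNT = new AtomicLong();
    private static final AtomicLong BYTES = new AtomicLong();

    private DirectBufferAudit() {}

    // Funnel every direct-buffer allocation in the codec through this
    // helper so the allocation rate and running total show up in the logs.
    public static ByteBuffer allocate(int capacity) {
        long n = COUNT.incrementAndGet();
        long total = BYTES.addAndGet(capacity);
        System.err.printf("direct buffer #%d: %d bytes (running total %d)%n",
                n, capacity, total);
        return ByteBuffer.allocateDirect(capacity);
    }
}

Replace the ByteBuffer.allocateDirect(...) calls in the codec with
DirectBufferAudit.allocate(...), redo the workload that shows the leak, and
compare the logged running total against the process RSS and what a profiler
reports as unfinalized buffers.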
