Re: RS memory leak?

Homer Strong Sat, 17 Dec 2011 13:29:31 -0800

@Stack, we tried your suggestion of getting off the ground with an
extra RS. We added one more identical RS and, after balancing, killed
the extra one. The cluster remained stable overnight, but this
morning all 3 of our RSs had OOMs.


In the logs we find many entries like

https://gist.github.com/eadb953fcadbeb302143

followed by the RSs aborting due to OOMs. Could this be related to
HBASE-4222?

Thanks for your help!


On Fri, Dec 16, 2011 at 3:31 PM, Homer Strong <[email protected]> wrote:
> Thanks for the response! To add to our problem's description: it
> doesn't seem to be an absolute number of regions that triggers the
> memory overuse; we've seen it happen with a wide range of region
> counts.
>
>> Just opening regions, it does this?
> Yes.
>
>> No load?
> Very low load, no requests.
>
>> No swapping?
> Swapping is disabled.
>
>
>> Bring up more xlarge instances and see if that gets you off the ground?
>> Then work on getting your number of regions down?
> We'll try this and get back in a couple minutes!
>
>
>
> On Fri, Dec 16, 2011 at 3:21 PM, Stack <[email protected]> wrote:
>> On Fri, Dec 16, 2011 at 1:57 PM, Homer Strong <[email protected]> wrote:
>>> Whenever an RS is assigned a large number of regions (more than
>>> 500-600), the heap usage grows without bound. Then the RS GCs
>>> constantly and must be killed.
>>>
>>
>> Just opening regions, it does this?
>>
>> No load?
>>
>> No swapping?
>>
>> What JVM and what args for JVM?
>>
>>
>>> This is with 2000 regions over 3 RSs, with a 10 GB heap. The RSs are
>>> on EC2 xlarges. The master is on its own large. Datanodes and
>>> namenodes are adjacent to the RSs and master, respectively.
>>>
>>> Looks like a memory leak? Any suggestions would be appreciated.
>>>
>>
>> Bring up more xlarge instances and see if that gets you off the ground?
>> Then work on getting your number of regions down?
>>
>> St.Ack
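
For readers following the numbers in this thread, here is a rough sketch
of the arithmetic behind them. It assumes a perfectly even balance and
that the 10 GB heap is per RS (the thread does not say so explicitly).
The function names are hypothetical, and the per-region figure is plain
division, not a measurement or a model of how HBase uses memory.

```python
import math


def regions_per_server(total_regions: int, servers: int) -> int:
    """Largest region count on any one server under an even balance."""
    return math.ceil(total_regions / servers)


def heap_per_region_mb(heap_gb: float, regions: int) -> float:
    """Heap divided evenly across regions, in MB (plain division only)."""
    return heap_gb * 1024 / regions


TOTAL_REGIONS = 2000  # from the thread
HEAP_GB = 10          # from the thread; assumed here to be per RS

for servers in (3, 4):
    per_rs = regions_per_server(TOTAL_REGIONS, servers)
    share = heap_per_region_mb(HEAP_GB, per_rs)
    print(f"{servers} RSs: {per_rs} regions each, ~{share:.1f} MB heap per region")
```

With 3 RSs each server holds about 667 regions, above the 500-600 range
where the problem was reported. A fourth RS brings that down to 500,
which is at the low end of that range.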
