We had a 4GB heap for the region server, on a machine with 8GB that was also
running a data node and a ZooKeeper node. We tried the incremental garbage
collector before, but had problems with a runaway heap size, resulting in
swapping. We were, and still are, running with the parallel GC. When the
session expiry problem occurred, we noticed swapping on the node just
before. Therefore, we are a bit afraid to increase the heap size further, or
to try the incremental GC again. We are not running in any virtualized
environment.

Thanks for the various responses and recommendations. I think it would be
nice to have an option to automatically restart the region server in
situations like this.

TIA,
Peter

On Tue, Mar 30, 2010 at 18:25, Patrick Hunt <ph...@apache.org> wrote:

> Are you running in a virtualized environment by chance? (ec2, vmware,
> etc...) vms, esp oversubscribed/overloaded vms, can result in significant
> io/memory related performance problems.
>
> Patrick
>
>
> Peter Falk wrote:
>
>> Thanks Jean-Daniel. I was not clear about what we have already tried, and
>> we have tried all that you recommend in the updated wiki page, including
>> uppin' the ZooKeeper session timeout. The node was heavily loaded at the
>> time and it seems the cluster was simply overloaded.
>>
>> However, would it not be possible to automatically start the region server
>> again and let it request new regions? Seems to be dangerous to let region
>> servers die under heavy load like this, and increase the load further on
>> remaining nodes...
>>
>> Sincerely,
>> Peter
>>
>> On Mon, Mar 29, 2010 at 19:38, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
>>
>>> We already had an entry in the wiki for this issue but it wasn't super
>>> explicit about what's happening, so I completely rewrote it using the
>>> logs from this thread. See
>>> http://wiki.apache.org/hadoop/Hbase/Troubleshooting#A9
>>>
>>> Also I created a jira about putting that link directly into the "We
>>> slept Xms, ..." message so that people can get some answers quickly.
>>> See https://issues.apache.org/jira/browse/HBASE-2388
>>>
>>> J-D
>>>
>>>
