Thanks for the suggestions, Chris.

Unfortunately, the memory that was exhausted was the OldGen heap area, not 
PermGen, and OldGen exhaustion doesn't show up in the Catalina log.

The heap allocation is quite hefty as this is a 64-bit environment. We need to 
get our developers to look into the application behaviour, but in the meantime 
I was looking for a way of dealing with the problem.

-----Original Message-----
From: Christopher Schultz [mailto:ch...@christopherschultz.net] 
Sent: 24 September 2009 19:59
To: Tomcat Users List
Subject: Re: Clustering Question...

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Darren,

On 9/24/2009 10:21 AM, Darren Kukulka wrote:
> In a 2-node scenario, where both nodes are configured identically and
> load balanced via Apache based on availability, how can we configure the
> cluster to deal with situations where one node has exhausted its Old Gen
> heap allocation?

Hmm... this is a situation that is difficult to detect using remote
code (like mod_jk).

> In such situations we've observed that the application being served by
> the cluster slows down considerably.  I can understand why this would be
> the case for sessions on the degraded node, but why would sessions on
> the good node suffer?

Are you using session replication? If so, the "good" Tomcat may be
slowing down attempting to replicate session changes to the "damaged"
Tomcat that is either not responding, or responding slowly, or
responding in confusing ways.

> How can we modify our configuration to deal with such occurrences more
> effectively?

After we had some trouble with OOMEs in production (legit ones,
actually: we just needed more heap), I implemented a quick-and-dirty
OOME checker. All it does is "grep OutOfMemoryError catalina.out" and,
if found, sends an email to someone.
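
A minimal sketch of that kind of checker, written in Python rather than as
a shell one-liner. The function name find_oome_lines and the marker
parameter are illustrative assumptions, not part of the original script;
the notification step is left out and would be whatever mail mechanism
you already have.

```python
from pathlib import Path


def find_oome_lines(log_path, marker="OutOfMemoryError"):
    """Return (line_number, text) for every log line containing the marker.

    find_oome_lines is a hypothetical helper name; it does the same job as
    "grep OutOfMemoryError catalina.out" but also reports line numbers.
    """
    hits = []
    # errors="replace" keeps the scan going if the log contains stray bytes.
    with Path(log_path).open("r", encoding="utf-8", errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if marker in line:
                hits.append((number, line.rstrip("\n")))
    return hits
```

Run from cron or a similar scheduler, a non-empty result would be the
trigger for sending the email.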

Instead of emailing, you could have your OOME checker actually shut down
(or forcibly terminate) the damaged Tomcat, and then the cluster should
stabilize. With only two nodes, this might be a problem, as the good
Tomcat will take over and might, under the new load of 100% of your
traffic, experience its own PermGen exhaustion and also be shut down.
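
The shut-down variant could be sketched roughly as follows, again in
Python. The function name stop_if_oome and its parameters are assumptions
made for illustration; the stop command is passed in by the caller, and
the path shown in the usage note is hypothetical.

```python
import subprocess


def stop_if_oome(log_path, stop_command, marker="OutOfMemoryError"):
    """Run stop_command once if the marker appears in the log.

    Returns True if the marker was found (and the command was run),
    False otherwise. stop_command is an argument list, not a shell string.
    """
    with open(log_path, "r", encoding="utf-8", errors="replace") as log:
        found = any(marker in line for line in log)
    if found:
        # check=False: a failing stop command does not raise here; a real
        # checker would probably inspect the return code and escalate.
        subprocess.run(stop_command, check=False)
    return found
```

For example, stop_if_oome("catalina.out", ["/opt/tomcat/bin/shutdown.sh"])
with the install path adjusted to suit. Note that this sketch rescans the
whole log each time, so unless the log is rotated after a restart it would
fire again on the old entry.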

You should consider adjusting your PermGen allocation (duh!) and perhaps
your heap allocation as well.

- -chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkq7wXcACgkQ9CaO5/Lv0PDVxACfT6X4tsFOEZ0nRWpYOIfLr7lX
XMIAoJUEs5uW3tTLqeRB5wCf1bo0oi4Q
=4LWQ
-----END PGP SIGNATURE-----
