On Mon, Mar 31, 2008 at 12:49 PM, Rainer Jung <[EMAIL PROTECTED]> wrote:
>  First to make sure: counting objects in general only makes sense after a
>  full GC. Otherwise the heap dump will contain garbage too.

Yes, I made sure the objects I was looking at had a valid GC
reference. They really were getting stuck in the queue.
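
(For anyone reproducing the analysis: on a JDK 6 JVM, jmap can dump only the
live objects, which forces a full GC first, e.g.

    jmap -dump:live,format=b,file=heap.hprof <pid>

where the file name and pid are placeholders.)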

>  Just some basic info: the LinkObject objects can be either in a
>  FastQueue, or they are used in a FastAsyncSocketSender directly after
>  removing from the FastQueue and before actually sending.
<snip>

Thank you for the detailed description of how the queues work in the cluster.
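
For the archives, my mental model of the pattern is below as a minimal Java
sketch (my own illustration, not the actual Tomcat classes): a producer
appends to an unbounded queue while a single sender thread drains it, so if
the sender stalls, the backlog grows without bound.

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    public class QueueSketch {
        // Unbounded queue: nothing ever pushes back on the producer.
        static final BlockingQueue<byte[]> queue =
                new LinkedBlockingQueue<byte[]>();

        public static void main(String[] args) throws InterruptedException {
            Thread sender = new Thread(new Runnable() {
                public void run() {
                    try {
                        while (true) {
                            queue.take();     // wait for the next message
                            Thread.sleep(50); // stand-in for a slow send/ack wait
                        }
                    } catch (InterruptedException e) {
                        // shutting down
                    }
                }
            });
            sender.start();
            // The producer outruns the sender, so the backlog piles up.
            for (int i = 0; i < 10000; i++) {
                queue.put(new byte[8192]); // stand-in for a serialized session diff
            }
            System.out.println("backlog after burst: " + queue.size());
            sender.interrupt();
        }
    }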

>  Why you had that many LinkObjects is not clear. You could first try to
>  check whether the LinkObjects actually belong to a Queue or not (if not,
>  they are already in the Sender). Have a look at your log files for errors
>  or unexpected cluster membership messages.
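
For what it's worth, a quick way to get those counts is jhat's OQL console
(the class name below is from a 5.5 source tree and the package differs in
other versions, so treat it as an assumption):

    select count(heap.objects('org.apache.catalina.cluster.util.LinkObject'))

and referrers(obj) on an individual instance shows whether it is still
hanging off a FastQueue or is already held by a sender.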

One problem I've intermittently had with clustering is that after a
Tomcat restart (we shut down one node and it immediately restarts,
generally within 30 seconds), the two nodes don't consistently sync
up: the restarted node does not receive the existing sessions from the
other node, although new sessions do get replicated over. I have to
think that this may be related to this issue.

I checked the Tomcat logs and didn't see any members dropping from the
cluster until the JVM got close to running out of memory and started
performing a lot of full GCs. When examining the dump, the vast
majority of the heap (600+ MB out of 1 GB) was taken up by byte arrays
referenced by LinkObjects.

>  In general I would suggest to not use the waitForAck feature. That's not
>  a strict rule, but if you do async replication and use session
>  stickiness for load balancing, then you usually put a strong focus on
>  the replication not influencing your webapp negatively. Activating
>  waitForAck lets you realize more reliably, if there is a replication
>  problem, but it also increases the overhead. Your mileage may vary.
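
For reference, I believe that is the waitForAck attribute on the Sender
element in server.xml. On a 5.5-style setup it would look roughly like the
snippet below (attribute set from memory, so please check the cluster docs
for your version):

    <Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
            replicationMode="fastasyncqueue"
            waitForAck="false"
            ackTimeout="15000"/>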

So what would cause the FastQueue to accumulate ClusterData even when
the cluster is apparently running properly? Is there any failsafe
(besides setting a maximum queue size) that would allow old data to be
purged? I mean, 600k ClusterData objects is a lot!
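
What I have in mind is the generic bounded-buffer failsafe, sketched below
in plain Java (my own illustration, not an actual Tomcat knob): when the
queue is full, purge the oldest entries instead of letting the backlog eat
the heap.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    public class DropOldestQueue {
        private final BlockingQueue<byte[]> queue;

        public DropOldestQueue(int capacity) {
            this.queue = new ArrayBlockingQueue<byte[]>(capacity);
        }

        /** Enqueue, evicting the oldest entry when full rather than blocking. */
        public void put(byte[] msg) {
            while (!queue.offer(msg)) { // offer() is non-blocking when full
                queue.poll();           // drop the stalest backlog entry
            }
        }
    }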

-Dave
