Hello Ronald,
Ronald Klop wrote:
Hello,
I have this on one of my cluster nodes. Is this normal?
We are running Tomcat 5.5.26 on linux 2.6.22.1. On all nodes netstat
gives full send and receive buffers on the tcp connections of the
replication.
"Cluster-FastAsyncSocketSender-6" daemon prio=1 tid=0x08d16af0 nid=0x78d7 in Object.wait() [0x766fe000..0x766ff140]
    at java.lang.Object.wait(Native Method)
    - waiting on <0x83e2a1b0> (a org.apache.catalina.cluster.util.SingleRemoveSynchronizedAddLock)
    at org.apache.catalina.cluster.util.SingleRemoveSynchronizedAddLock.lockRemove(SingleRemoveSynchronizedAddLock.java:205)
    - locked <0x83e2a1b0> (a org.apache.catalina.cluster.util.SingleRemoveSynchronizedAddLock)
    at org.apache.catalina.cluster.util.FastQueue.remove(FastQueue.java:552)
    at org.apache.catalina.cluster.tcp.FastAsyncSocketSender$FastQueueThread.getQueuedMessage(FastAsyncSocketSender.java:506)
    at org.apache.catalina.cluster.tcp.FastAsyncSocketSender$FastQueueThread.run(FastAsyncSocketSender.java:485)
netstat -n | grep 8015
tcp 78829 46336 10.0.10.91:8015  10.0.10.94:37980 ESTABLISHED
tcp 36957 95568 10.0.10.91:53867 10.0.10.87:8015  ESTABLISHED
tcp 79063 46336 10.0.10.91:8015  10.0.10.88:55031 ESTABLISHED
tcp 36912 95568 10.0.10.91:34266 10.0.10.88:8015  ESTABLISHED
tcp 33282     0 10.0.10.91:34803 10.0.10.95:8015  ESTABLISHED
tcp 78290 46336 10.0.10.91:8015  10.0.10.95:60555 ESTABLISHED
tcp 78618 46336 10.0.10.91:8015  10.0.10.87:57796 ESTABLISHED
tcp 36930 95568 10.0.10.91:40632 10.0.10.94:8015  ESTABLISHED
Any tips on how to debug/solve this?
If you haven't set a removeWaitTimeout (which you shouldn't), the lock
wait is interrupted every 30 seconds and then started again. Waiting for
the lock is normal as long as the queue is not filling up, so whether
your observation indicates a problem is best decided by looking at the
queue size. You can get the queue size e.g. from the jmxproxy of the
manager webapp (http://myserver:myport/manager/jmxproxy?qry=*:*) or from
JConsole. Look for "FastAsyncSocketSender" and its attribute queueSize.
It should be "0" most of the time.
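If you want to watch queueSize over time without clicking through
JConsole, a small script can pull it out of the jmxproxy output. The
Python below is only a sketch under assumptions: that the jmxproxy text
consists of blank-line-separated blocks of "key: value" lines, each
starting with "Name:", and that the manager webapp is protected by HTTP
basic auth. The function names fetch_jmxproxy and queue_sizes are made
up for this example.

```python
import base64
import urllib.request


def fetch_jmxproxy(url, user, password):
    """Fetch the jmxproxy text (assumes HTTP basic auth on the manager)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(url, headers={"Authorization": "Basic " + token})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def queue_sizes(jmxproxy_text):
    """Return {mbean_name: queueSize} for blocks mentioning FastAsyncSocketSender."""
    sizes = {}
    # Assumption: one MBean per block, blocks separated by blank lines.
    for block in jmxproxy_text.strip().split("\n\n"):
        attrs = {}
        for line in block.splitlines():
            key, sep, value = line.partition(": ")
            if sep:
                attrs[key.strip()] = value.strip()
        if "FastAsyncSocketSender" in block and "queueSize" in attrs:
            sizes[attrs.get("Name", "?")] = int(attrs["queueSize"])
    return sizes
```

Calling queue_sizes(fetch_jmxproxy(url, user, password)) every few
seconds and printing the result should show whether the queue drains
back to 0 or keeps growing.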
There are more statistics for the replication, so you should be able to
see whether your replication is actually stuck or whether it just can't
cope with the amount of replication needed.
If that doesn't help: do a couple of thread dumps (kill -QUIT) on one
node, with pauses of about 3 seconds in between. The results go to
catalina.out. Then have a look at which threads actually hold the lock
the FastAsyncSocketSender is waiting for. Are they changing? If not,
what are the threads holding the lock doing? You could also post the
dumps or make them available somewhere to look at.
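To avoid reading several dumps by eye, something like the following
Python sketch can list which threads hold a given monitor. It assumes
the dump format shown in the question (thread header lines starting with
a quoted name, followed by "- locked <0x...>" and "- waiting on
<0x...>" lines); the function names are made up for this example. Note
that a thread in Object.wait() shows both "waiting on" and "locked" for
the same address although it has released the monitor, so such threads
are not counted as holders.

```python
import re
from collections import defaultdict

THREAD_HEADER = re.compile(r'^"([^"]+)"')
MONITOR = re.compile(r'-\s+(locked|waiting on|waiting to lock)\s+<(0x[0-9a-fA-F]+)>')


def monitor_usage(dump_text):
    """Map lock address -> {"locked" | "waiting on" | "waiting to lock": thread names}."""
    usage = defaultdict(lambda: defaultdict(set))
    current = None
    for line in dump_text.splitlines():
        header = THREAD_HEADER.match(line.strip())
        if header:
            current = header.group(1)  # a new thread section starts here
            continue
        found = MONITOR.search(line)
        if found and current:
            usage[found.group(2).lower()][found.group(1)].add(current)
    return usage


def holders(dump_text, lock_address):
    """Threads that locked the monitor and are not merely wait()ing on it."""
    entry = monitor_usage(dump_text)[lock_address.lower()]
    return sorted(entry["locked"] - entry["waiting on"])
```

Running holders(dump, "0x83e2a1b0") on each dump taken and comparing the
results shows whether the holder changes between dumps. The address can
differ after a restart, so take it from the current dump.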
Ronald.
Regards,
Rainer