Re: [xmlblaster] Callback message queue fills up
I don't see any routing information, even though I know the message is being routed from one node to another. Do I need to turn it on in some way? I am looking at the updateQos in the callback; should the route be there?

Thanks,
David

Marcel Ruff wrote:
> Note: If a published message is forwarded to another cluster node
> you will see something like
>
>   <route>
>     <node id='B' stratum='1' timestamp='119615134316000' dirtyRead='false'/>
>     <node id='A' stratum='0' timestamp='119615134316400' dirtyRead='false'/>
>   </route>
>
> in the publishQos.
Re: [xmlblaster] Callback message queue fills up
Could it be related to the fact that the message is published by a plugin?

David

David R Robison wrote:
> I don't see any routing information even though I know that it is
> being routed from one node to another. Do I need to turn it on some
> way? I am looking at the updateQos in the callback, should the route
> be there?
Re: [xmlblaster] Callback message queue fills up
David Robison wrote:
> I think part of the problem might be that the subscriptions, even
> when you specify a domain, are not domain specific. What I mean is
> that a user connected to B subscribes to messages for a domain that
> is mastered on A. However, when the subscription is forwarded to A,
> it matches messages from all domains, even those generated on B and
> sent to A. Does this make sense? Could this be part of the problem?

It boils down to the question of whether the oid and domain are ANDed (B is slave, A is master of Sport):

  B-client: subscribe( oid=Hello domain=Sport ) - ends up in A, as A is master of Sport
  A-client: publish( oid=Hello domain= ) - is matched in A and forwarded to B and then to B-client

So, as you mentioned, the domain is not ANDed. But I still can't see this as the reason for your filled-up callback queue.

Note: If a published message is forwarded to another cluster node you will see something like

  <route>
    <node id='B' stratum='1' timestamp='119615134316000' dirtyRead='false'/>
    <node id='A' stratum='0' timestamp='119615134316400' dirtyRead='false'/>
  </route>

in the publishQos.

best regards,
Marcel
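To make the AND question concrete, here is a minimal sketch of the two matching rules applied to Marcel's Sport example. The class and method names are hypothetical and only model the idea; this is not xmlBlaster's matching code.

```java
// Hypothetical model of the two matching rules discussed above.
// Not xmlBlaster code; names are made up for illustration.
class TopicMatch {
    /** Matches on the oid alone; the domain of the subscription is ignored. */
    static boolean matchesOidOnly(String subOid, String msgOid) {
        return subOid.equals(msgOid);
    }

    /** Matches only if both the oid and the domain agree (logical AND). */
    static boolean matchesOidAndDomain(String subOid, String subDomain,
                                       String msgOid, String msgDomain) {
        return subOid.equals(msgOid) && subDomain.equals(msgDomain);
    }

    public static void main(String[] args) {
        // B-client: subscribe( oid=Hello domain=Sport )
        // A-client: publish( oid=Hello domain= )
        System.out.println(matchesOidOnly("Hello", "Hello"));                   // true
        System.out.println(matchesOidAndDomain("Hello", "Sport", "Hello", "")); // false
    }
}
```

With the oid-only rule the message published without a domain is delivered to the Sport subscriber, which is the behaviour described above; with the ANDed rule it would not be.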
Re: [xmlblaster] Callback message queue fills up
David R Robison wrote:
> Thanks, see inline...
>
> Marcel Ruff wrote:
>> Hi David,
>> do you have a jconsole to observe the two nodes?
> I don't have a jconsole, but can I get the same using the admin
> messages?

You can, but jconsole will save you (and me :-) a lot of time. Really, try to set up jconsole observation! You need a JDK 1.5 or 1.6 installed on your production nodes; then you can just fire up jconsole.

This is how I do it: if the production node runs Windows, use RDP; if it is a UNIX, use nomachine (or X or VNC). If you don't have graphical access to the production machines but you do have ssh access, you can tunnel the jconsole data over ssh and start jconsole locally on your desktop (no new security hole, just the existing ssh). For ssh I can send you an example setup (private/public key exchange etc.).

You need to configure the running xmlBlaster to allow jconsole access, see
http://www.xmlblaster.org/xmlBlaster/doc/requirements/admin.jmx.html

regards
Marcel
--
Marcel Ruff
http://www.xmlBlaster.org
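For background, jconsole is a JMX client, and the same MBeans it displays can be queried with the standard javax.management API. The sketch below lists the JVM's built-in java.lang MBeans from inside a running JVM. The class name is hypothetical, and the MBean names xmlBlaster registers are not given in this thread, so none are assumed here.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import java.util.TreeSet;
import javax.management.MBeanServer;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper; uses only the standard JMX API of the JDK.
class MBeanLister {
    /** Returns the canonical names of all MBeans matching the pattern, sorted. */
    static Set<String> list(MBeanServer server, String pattern) {
        try {
            Set<String> names = new TreeSet<>();
            for (ObjectName name : server.queryNames(new ObjectName(pattern), null)) {
                names.add(name.getCanonicalName());
            }
            return names;
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Bad ObjectName pattern: " + pattern, e);
        }
    }

    public static void main(String[] args) {
        // The platform MBean server of this JVM; jconsole browses the same beans.
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        for (String name : list(server, "java.lang:*")) {
            System.out.println(name);
        }
    }
}
```

Replacing the `java.lang:*` pattern with the domain a server registers would list its beans instead; the admin.jmx requirement linked above is the place to look for the actual names.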
Re: [xmlblaster] Callback message queue fills up
One other thought. Heartbeat messages are published on node B and subscribed to by clients on node A. Also, there are clients on node B that subscribe to messages on node A. However, it appears that the subscriptions the clients on node B are using also match the heartbeat messages from node B that have been sent to node A. Could I have some kind of circular queue? A message is posted on B, then sent to A because of a subscription by a client on A. Then it is sent back to B because of a subscription by a client on B for messages on A. Then the message gets sent back to A and the whole cycle repeats? Could this be possible?

David

David R Robison wrote:
> Thanks, see inline...
>
> Marcel Ruff wrote:
>> Hi David,
>> do you have a jconsole to observe the two nodes?
> I don't have a jconsole, but can I get the same using the admin
> messages?
>> If yes, please check the number of subscriptions the node A has
>> forwarded to node B (look into node B and check the number of
>> subscriptions of client A) during such a case.
> I will check.
>> In case the subscribeQos has set
>>   <multiSubscribe>true</multiSubscribe>
> I believe that we set all to false.
>> (which is the default) it could be that the subscriptions
>> multiplied during small connection errors and reconnects. This is
>> just a guess. If it is the case please set multiSubscribe to false.
>>
>> Is there a high CPU load during the 1001 message case?
> No
>> Are the heartbeat messages persistent messages?
> Yes, but they only live 30 seconds. At any given time there should
> only be at most 2 in the history queue.
>> Was the client connected or offline during this message overflow?
> No, the client was online.
>> Does your heartbeat have a unique id so that you can tell for sure
>> if the same published message is cloned many times (try a peek on
>> the callback queue with jconsole)?
> No, but the content of the message has a timestamp, so I knew they
> were duplicates. Can this be done with the admin messages?
>> A final option is to use the current svn xmlBlaster and switch on
>> the checkpoint logging to get a better idea what is going on.
> We will try this in house; unfortunately, the problem nodes are in a
> production environment.
>> And finally it could be a problem with your client not taking the
>> callback messages.
> Could be, but what I don't see is the queue gradually growing.
> Instead, it all of a sudden appears to be full.
>> Another idea: The callback queue contains only a reference on the
>> message. If it expires the message-'meat' is destroyed but the
>> reference remains in the queue until it is looked at during
>> delivery (and then thrown to garbage), Michele, could this be?
>>
>> thanks
>> Marcel
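Marcel's last idea in the quoted reply (references to expired messages staying in the callback queue until delivery looks at them) can be modelled in a few lines. This is a hypothetical sketch of that idea only, not xmlBlaster's queue implementation, and Marcel raised it as an open question rather than as confirmed behaviour. The 15 second interval, 30 second lifetime and 1001 entries are taken from the original report.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model: a queue of references that are only checked for
// expiry when delivery looks at them. Not xmlBlaster code.
class LazyExpiryQueue {
    /** Queue entry: the message content plus its absolute expiry time. */
    private static final class Ref {
        final String content;
        final long expiresAtMillis;

        Ref(String content, long expiresAtMillis) {
            this.content = content;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Deque<Ref> refs = new ArrayDeque<>();

    void put(String content, long nowMillis, long lifeTimeMillis) {
        refs.addLast(new Ref(content, nowMillis + lifeTimeMillis));
    }

    /** Number of entries, including references to already expired messages. */
    int size() {
        return refs.size();
    }

    /** Drains the queue: expired references are dropped, live messages returned. */
    List<String> deliver(long nowMillis) {
        List<String> delivered = new ArrayList<>();
        Ref ref;
        while ((ref = refs.pollFirst()) != null) {
            if (ref.expiresAtMillis > nowMillis) {
                delivered.add(ref.content);
            }
        }
        return delivered;
    }

    public static void main(String[] args) {
        LazyExpiryQueue queue = new LazyExpiryQueue();
        long now = 0;
        for (int i = 0; i < 1001; i++) {
            now = i * 15_000L;                         // one heartbeat every 15 s
            queue.put("heartbeat " + i, now, 30_000L); // each lives 30 s
        }
        System.out.println("entries in queue: " + queue.size());
        System.out.println("still alive on delivery: " + queue.deliver(now).size());
        System.out.println("entries after delivery: " + queue.size());
    }
}
```

In this model the queue reports 1001 entries although only 2 of them are still alive, and the count drops to 0 once delivery has walked the queue. It only produces a large count if nothing is delivered for a long time, so by itself it does not explain a queue that fills up suddenly while the client is online.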
Re: [xmlblaster] Callback message queue fills up
David R Robison wrote:
> One other thought. Heartbeat messages are published on node B and
> subscribed to by clients on node A. Also, there are clients on node
> B that subscribe to messages on node A. However, it appears that the
> subscriptions the clients on node B are using are also matching the
> heartbeat messages from node B that have been sent to node A. Could
> I have some kind of circular queue? A message is posted on B then
> sent to A because a subscription by a client on A. Then sent back to
> B because of a subscription by a client on B for messages on A. Then
> the message gets sent back to A and the whole cycle repeats?

Could be, usually the cluster should prevent this ...

The messages contain in their QoS the nodes traversed:

  <qos>
    <sender>joe</sender>
    <route>
      <node id='bilbo' stratum='2' timestamp='34460239640'/>
      <node id='frodo' stratum='1' timestamp='34460239661'/>
      <node id='heron' stratum='0' timestamp='34460239590'/>
    </route>
  </qos>

It would be nice to see the dump of such messages. Use jconsole or the logging output from your receiving client, or use the message sniffer, e.g.:

  java javaclients.simplereader.SimpleReaderGui -xpath //key -session.name simpleReader -passwd secret -protocol SOCKET -dispatch/connection/plugin/socket/hostname 192.168.1.25 -dumpToFile true

or peek the callback queue with administrative messages as described in one of your last posts.

thanks
Marcel

> Could this be possible?
>
> David

--
Marcel Ruff
http://www.xmlBlaster.org
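Once a QoS dump is at hand, the route can be checked mechanically. The following is a minimal sketch that reads the node ids from a route element shaped like the example in this message, using only the JDK's XML parser. The class and method names are hypothetical, and it assumes the QoS is available as a well-formed XML string.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper built on the JDK XML parser; not part of xmlBlaster.
class RouteInspector {
    /** Returns the node ids found under <route>, in document order. */
    static List<String> routeNodeIds(String qosXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(qosXml)));
            List<String> ids = new ArrayList<>();
            NodeList routes = doc.getElementsByTagName("route");
            for (int r = 0; r < routes.getLength(); r++) {
                NodeList nodes = ((Element) routes.item(r)).getElementsByTagName("node");
                for (int i = 0; i < nodes.getLength(); i++) {
                    ids.add(((Element) nodes.item(i)).getAttribute("id"));
                }
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalArgumentException("QoS is not well-formed XML", e);
        }
    }

    /** True if the given cluster node already appears in the route. */
    static boolean alreadyVisited(String qosXml, String nodeId) {
        return routeNodeIds(qosXml).contains(nodeId);
    }

    public static void main(String[] args) {
        String qos = "<qos><sender>joe</sender><route>"
                + "<node id='bilbo' stratum='2' timestamp='34460239640'/>"
                + "<node id='frodo' stratum='1' timestamp='34460239661'/>"
                + "<node id='heron' stratum='0' timestamp='34460239590'/>"
                + "</route></qos>";
        System.out.println(routeNodeIds(qos));            // [bilbo, frodo, heron]
        System.out.println(alreadyVisited(qos, "frodo")); // true
    }
}
```

An empty list means the QoS carries no route at all. A node id that is already in the list before the message is forwarded to that node again would point to the kind of cycle David suspects.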
Re: [xmlblaster] Callback message queue fills up
Here is a dump of one of the messages:

  <MsgUnit index='0'>
    <key oid='DomainHeartbeat-Albemarle911' contentMime='text/xml' contentMimeExtended='1.0' domain='Albemarle911'/>
    <content size='46'>Domain Albemarle911 ALIVE at 11/21/07 09:48:43</content>
    <qos>
      <subscribable/>
      <sender>/node/Albemarle911/client/A-NATIVE-CLIENT-PLUGIN/-3</sender>
      <priority>MAX</priority>
      <subscribe id='__subId:StauntonSTC-XPATH119562846332900'/>
      <expiration lifeTime='3' remainingLife='22703' forceDestroy='true'/>
      <rcvTimestamp nanos='119565948261302'/>
      <queue index='0' size='1'/>
      <persistent>false</persistent>
      <isUpdate/>
    </qos>
  </MsgUnit>

The message was created on node B and sent to node A because of a subscription on node A. But it is now in the callback queue on A to go back to B. Also, I have never seen the route data in the messages. Is there a way to turn this on?

David

Marcel Ruff wrote:
> it would be nice to see the dump of such messages
Re: [xmlblaster] Callback message queue fills up
I think part of the problem might be that the subscriptions, even when you specify a domain, are not domain specific. What I mean is that a user connected to B subscribes to messages for a domain that is mastered on A. However, when the subscription is forwarded to A, it matches messages from all domains, even those generated on B and sent to A. Does this make sense? Could this be part of the problem? David _ From: David R Robison [mailto:[EMAIL PROTECTED] To: xmlblaster@server.xmlBlaster.org Sent: Wed, 21 Nov 2007 10:41:10 -0500 Subject: Re: [xmlblaster] Callback message queue fills up Here is a dunp of one of the messages: MsgUnit index='0' key oid='DomainHeartbeat-Albemarle911' contentMime='text/xml' contentMimeExtended='1.0' domain='Albemarle911'/ content size='46'Domain Albemarle911 ALIVE at 11/21/07 09:48:43/content qos subscribable/ sender/node/Albemarle911/client/A-NATIVE-CLIENT-PLUGIN/-3/sender priorityMAX/priority subscribe id='__subId:StauntonSTC-XPATH119562846332900'/ expiration lifeTime='3' remainingLife='22703' forceDestroy='true'/ rcvTimestamp nanos='119565948261302'/ queue index='0' size='1'/ persistentfalse/persistent isUpdate/ /qos /MsgUnit The message was created on node B and sent to node A because of a subscription on node A. But it is now in the callback queue on A to go back to B. Also, I have never seen the route data in the messages. Is there a way to turn this on? David Marcel Ruff wrote: David R Robison wrote: One other thought. Heartbeat messages are published on node B and subscribed to by clients on node A. Also, there are clients on node B that subscribe to messages on node A. However, it appears that the subscriptions the clients on node B are using are also matching the heartbeat messages from node B that have been sent to node A. Could I have some kind of circular queue? A message is posted on B then sent to A because a subscription by a client on A. Then sent back to B because of a subscription by a client on B for messages on A. 
Then the message gets sent back to A and the whole cycle repeats? Could be, usually the cluster should prevent this ... The messages contain in their QoS the nodes traversed: qos senderjoe/sender route node id='bilbo' stratum='2' timestamp='34460239640'/ node id='frodo' stratum='1' timestamp='34460239661'/ node id='heron' stratum='0' timestamp='34460239590'/ /route /qos it would be nice to see the dump of such messages, Use the jconsole or logging output from your receiving client or use the message sniffer, e.g.: java javaclients.simplereader.SimpleReaderGui -xpath //key -session.name simpleReader -passwd secret -protocol SOCKET -dispatch/connection/plugin/socket/hostname 192.168.1.25 -dumpToFile true or peek the callback queue with administrative messages as described in one of your last posts, thanks Marcel Could this be possible? David David R Robison wrote: Thanks, See in line... Marcel Ruff wrote: Hi David, do you have a jconsole to observe the two nodes? I don't have a jconsole, but can I get the same using the admin messages? If yes, please check the number of subscriptions the node A has forwarded to node B (look into node B and check the number of subscriptions of client A) during such a case. In case the subscribeQos has set I will check. multiSubscribetrue/multiSubscribe I believe that we set all to false. (which is the default) it could be that the subscriptions multiplied during small connection errors and reconnects. This is just a guess. If it is the case please set multiSubscribe to false. Is there a high CPU load during the 1001 message case? No Are the hearbeat messages persistent messages? Yes, but the only live 30 seconds. At any given time there should only be at most 2 in the history queue Was the client connected or offline during this message overflow? 
No, the client was online.

Does your heartbeat have a unique id so that you can tell for sure if the same published message is cloned many times (try a peek on the callback queue with jconsole)?

No, but the content of the message has a timestamp so I knew they were duplicates. Can this be done with the admin messages?

A final option is to use the current svn xmlBlaster and switch on the checkpoint logging to get a better idea what is going on.

We will try this in house; unfortunately, the problem nodes are in a production environment.

And finally it could be a problem with your client not taking the callback messages.

Could be, but what I don't see is the queue gradually growing. Instead, it all-of-a-sudden appears to be full.

Another idea
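To make the circular-forwarding question above concrete, here is a minimal sketch of how a receiver could read the route element out of a QoS and tell whether a message has already passed through a given node. It uses only Python's standard XML parser on the example QoS Marcel posted; the helper names `route_node_ids` and `would_loop` are made up for illustration and say nothing about how xmlBlaster implements its own loop prevention.

```python
import xml.etree.ElementTree as ET

# QoS fragment in the shape quoted in this thread (node names from Marcel's example).
QOS_XML = """
<qos>
  <sender>joe</sender>
  <route>
    <node id='bilbo' stratum='2' timestamp='34460239640'/>
    <node id='frodo' stratum='1' timestamp='34460239661'/>
    <node id='heron' stratum='0' timestamp='34460239590'/>
  </route>
</qos>
"""


def route_node_ids(qos_xml):
    """Return the ids of the <node> elements under <route>, in document order.

    Hypothetical helper: returns an empty list when the QoS has no <route>.
    """
    root = ET.fromstring(qos_xml.strip())
    return [node.get("id") for node in root.findall("./route/node")]


def would_loop(qos_xml, next_node_id):
    """True if forwarding to next_node_id would revisit a node already in the route."""
    return next_node_id in route_node_ids(qos_xml)


if __name__ == "__main__":
    print(route_node_ids(QOS_XML))
    print(would_loop(QOS_XML, "frodo"))
```

If the dumped messages carry no route element at all, as David reports, `route_node_ids` returns an empty list and a check like this cannot detect the cycle, which is why seeing a real dump matters.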
[xmlblaster] Callback message queue fills up
We are experiencing something strange in xmlBlaster 1.6.1. We have two nodes, node A subscribes to messages from node B. These are heartbeat messages and are generated every 15 seconds with a lifetime of 30 seconds. A client connects to node A and subscribes to the messages, node A then passes the subscription on to node B. Watching the callback message queue, everything seems to run well, at most 1 message in the queue waiting to be sent. It can run like this for days. Then, unexpectedly, the callback queue will show as being full (in this case 1001 messages). The queue contains many duplicated messages with different timestamps. From there, the server struggles to deliver the messages and keep the queue empty. The reader never seems to read enough messages to get the queue back down to zero. If I stop the client and reconnect, it will recreate its queue and be back to normal.

I know this is a bit sketchy, but it is becoming a real problem for us. Any thoughts on what might be the problem? Any idea of where to start looking?

One more note: when the client is subscribing to heartbeats that are generated on node A, the client never fails in this manner, only when it is subscribing to node A for a message generated on node B.

Thanks, in advance,
David Robison

--
David R Robison
Open Roads Consulting, Inc.
708 S. Battlefield Blvd., Chesapeake, VA 23322
phone: (757) 546-3401
e-mail: [EMAIL PROTECTED]
web: http://openroadsconsulting.com
blog: http://therobe.blogspot.com
book: http://www.xulonpress.com/book_detail.php?id=2579
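As a rough way to quantify the symptom, the sketch below takes a list of queue entries (content plus a receive timestamp, as one might copy them out of a queue peek) and reports which contents occur more than once. It also computes the steady-state bound implied by a 15 second interval and a 30 second lifetime. The function names and the sample timestamps are invented for illustration; only the heartbeat content format comes from this thread.

```python
import math
from collections import Counter


def expected_max_live(interval_s, lifetime_s):
    """Upper bound on simultaneously live heartbeats, assuming regular publishing."""
    return math.ceil(lifetime_s / interval_s)


def duplicate_contents(entries):
    """Map each content string that occurs more than once to its count.

    entries is an iterable of (content, rcv_nanos) pairs; rcv_nanos is ignored
    here because the duplicates in question differ only in their timestamps.
    """
    counts = Counter(content for content, _rcv_nanos in entries)
    return {content: n for content, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Hypothetical snapshot: two copies of one heartbeat, one copy of the next.
    snapshot = [
        ("Domain Albemarle911 ALIVE at 11/21/07 09:48:43", 1),
        ("Domain Albemarle911 ALIVE at 11/21/07 09:48:43", 2),
        ("Domain Albemarle911 ALIVE at 11/21/07 09:48:58", 3),
    ]
    print(expected_max_live(15, 30))
    print(duplicate_contents(snapshot))
```

A queue holding far more entries than `expected_max_live` suggests either duplication or entries that are no longer live, which are the two cases the replies in this thread try to tell apart.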
Re: [xmlblaster] Callback message queue fills up
Hi David,

do you have a jconsole to observe the two nodes? If yes, please check the number of subscriptions the node A has forwarded to node B (look into node B and check the number of subscriptions of client A) during such a case.

In case the subscribeQos has set <multiSubscribe>true</multiSubscribe> (which is the default) it could be that the subscriptions multiplied during small connection errors and reconnects. This is just a guess. If it is the case please set multiSubscribe to false.

Is there a high CPU load during the 1001 message case?

Are the heartbeat messages persistent messages?

Was the client connected or offline during this message overflow?

Does your heartbeat have a unique id so that you can tell for sure if the same published message is cloned many times (try a peek on the callback queue with jconsole)?

A final option is to use the current svn xmlBlaster and switch on the checkpoint logging to get a better idea what is going on.

And finally it could be a problem with your client not taking the callback messages.

Another idea: The callback queue contains only a reference to the message. If it expires the message-'meat' is destroyed but the reference remains in the queue until it is looked at during delivery (and then thrown to garbage). Michele, could this be?

thanks
Marcel

David R Robison wrote:

We are experiencing something strange in xmlBlaster 1.6.1. We have two nodes, node A subscribes to messages from node B. These are heartbeat messages and are generated every 15 seconds with a lifetime of 30 seconds. A client connects to node A and subscribes to the messages, node A then passes the subscription on to node B. Watching the callback message queue, everything seems to run well, at most 1 message in the queue waiting to be sent. It can run like this for days. Then, unexpectedly, the callback queue will show as being full (in this case 1001 messages). The queue contains many duplicated messages with different timestamps.
From there, the server struggles to deliver the messages and keep the queue empty. The reader never seems to read enough messages to get the queue back down to zero. If I stop the client and reconnect, it will recreate its queue and be back to normal.

I know this is a bit sketchy, but it is becoming a real problem for us. Any thoughts on what might be the problem? Any idea of where to start looking?

One more note: when the client is subscribing to heartbeats that are generated on node A, the client never fails in this manner, only when it is subscribing to node A for a message generated on node B.

Thanks, in advance,
David Robison

--
Marcel Ruff
http://www.xmlBlaster.org
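Marcel's "another idea" can be illustrated with a toy model: a queue that stores only references, where expiry destroys the payload but leaves the reference in place until delivery looks at it. The class below is a sketch under that assumption only. `ReferenceQueue` and its methods are hypothetical names and do not mirror xmlBlaster's queue classes.

```python
from collections import deque


class ReferenceQueue:
    """Toy callback queue holding message ids; payloads live in a separate store."""

    def __init__(self):
        self._store = {}      # msg_id -> (payload, expires_at)
        self._refs = deque()  # msg_ids awaiting delivery, oldest first

    def put(self, msg_id, payload, expires_at):
        self._store[msg_id] = (payload, expires_at)
        self._refs.append(msg_id)

    def expire(self, now):
        """Destroy payloads whose lifetime has passed; their references stay queued."""
        expired = [m for m, (_payload, exp) in self._store.items() if exp <= now]
        for msg_id in expired:
            del self._store[msg_id]

    def __len__(self):
        # The reported queue size counts references, including dangling ones.
        return len(self._refs)

    def deliver_next(self):
        """Pop references until one still has a payload; return it, or None."""
        while self._refs:
            msg_id = self._refs.popleft()
            entry = self._store.pop(msg_id, None)
            if entry is not None:
                return entry[0]
        return None


if __name__ == "__main__":
    q = ReferenceQueue()
    q.put("hb-1", "ALIVE at 09:48:43", expires_at=30)
    q.put("hb-2", "ALIVE at 09:48:58", expires_at=45)
    q.put("hb-3", "ALIVE at 09:49:13", expires_at=60)
    q.expire(now=50)
    print(len(q), q.deliver_next(), len(q))
```

In this model the queue size can stay high while delivery is stalled even though most payloads are gone, so a full queue does not by itself prove that 1001 live messages exist.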
Re: [xmlblaster] Callback message queue fills up
Thanks, See in line...

Marcel Ruff wrote:

Hi David, do you have a jconsole to observe the two nodes?

I don't have a jconsole, but can I get the same using the admin messages?

If yes, please check the number of subscriptions the node A has forwarded to node B (look into node B and check the number of subscriptions of client A) during such a case.

I will check.

In case the subscribeQos has set <multiSubscribe>true</multiSubscribe> (which is the default) it could be that the subscriptions multiplied during small connection errors and reconnects. This is just a guess. If it is the case please set multiSubscribe to false.

I believe that we set all to false.

Is there a high CPU load during the 1001 message case?

No

Are the heartbeat messages persistent messages?

Yes, but they only live 30 seconds. At any given time there should only be at most 2 in the history queue.

Was the client connected or offline during this message overflow?

No, the client was online.

Does your heartbeat have a unique id so that you can tell for sure if the same published message is cloned many times (try a peek on the callback queue with jconsole)?

No, but the content of the message has a timestamp so I knew they were duplicates. Can this be done with the admin messages?

A final option is to use the current svn xmlBlaster and switch on the checkpoint logging to get a better idea what is going on.

We will try this in house; unfortunately, the problem nodes are in a production environment.

And finally it could be a problem with your client not taking the callback messages.

Could be, but what I don't see is the queue gradually growing. Instead, it all-of-a-sudden appears to be full.

Another idea: The callback queue contains only a reference to the message. If it expires the message-'meat' is destroyed but the reference remains in the queue until it is looked at during delivery (and then thrown to garbage). Michele, could this be?
thanks
Marcel

David R Robison wrote:

We are experiencing something strange in xmlBlaster 1.6.1. We have two nodes, node A subscribes to messages from node B. These are heartbeat messages and are generated every 15 seconds with a lifetime of 30 seconds. A client connects to node A and subscribes to the messages, node A then passes the subscription on to node B. Watching the callback message queue, everything seems to run well, at most 1 message in the queue waiting to be sent. It can run like this for days. Then, unexpectedly, the callback queue will show as being full (in this case 1001 messages). The queue contains many duplicated messages with different timestamps. From there, the server struggles to deliver the messages and keep the queue empty. The reader never seems to read enough messages to get the queue back down to zero. If I stop the client and reconnect, it will recreate its queue and be back to normal.

I know this is a bit sketchy, but it is becoming a real problem for us. Any thoughts on what might be the problem? Any idea of where to start looking?

One more note: when the client is subscribing to heartbeats that are generated on node A, the client never fails in this manner, only when it is subscribing to node A for a message generated on node B.

Thanks, in advance,
David Robison

--
David R Robison
Open Roads Consulting, Inc.
708 S. Battlefield Blvd., Chesapeake, VA 23322
phone: (757) 546-3401
e-mail: [EMAIL PROTECTED]
web: http://openroadsconsulting.com
blog: http://therobe.blogspot.com
book: http://www.xulonpress.com/book_detail.php?id=2579
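The multiSubscribe guess discussed in this message can be pictured with a small model: if every reconnect re-issues the same subscription and each one is registered separately, a single published heartbeat is delivered once per registration. The sketch below is a toy registry written only to show that effect. `SubscriptionRegistry`, its methods, and the session name are hypothetical and are not the xmlBlaster API.

```python
class SubscriptionRegistry:
    """Toy model of a server-side subscription table."""

    def __init__(self):
        self._subs = []  # one (session, query) pair per registered subscription

    def subscribe(self, session, query, multi_subscribe=True):
        """Register a subscription; with multi_subscribe=False an identical one is reused."""
        if not multi_subscribe and (session, query) in self._subs:
            return
        self._subs.append((session, query))

    def deliveries_per_publish(self, session, query):
        """How many copies of one matching message this session would receive."""
        return self._subs.count((session, query))


if __name__ == "__main__":
    multiplied = SubscriptionRegistry()
    deduplicated = SubscriptionRegistry()
    for _reconnect in range(3):  # same subscription re-issued after each reconnect
        multiplied.subscribe("nodeA", "//key", multi_subscribe=True)
        deduplicated.subscribe("nodeA", "//key", multi_subscribe=False)
    print(multiplied.deliveries_per_publish("nodeA", "//key"))
    print(deduplicated.deliveries_per_publish("nodeA", "//key"))
```

Since David believes multiSubscribe is already set to false on his clients, counting the subscriptions that node A holds on node B, as Marcel asks, is the way to confirm or rule this out.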