Re: [xmlblaster] Callback message queue fills up

2007-11-28 Thread David R Robison
I don't see any routing information even though I know that it is being 
routed from one node to another. Do I need to turn it on some way? I am 
looking at the updateQos in the callback, should the route be there?

Thanks, David

Marcel Ruff wrote:

David Robison wrote:
I think part of the problem might be that the subscriptions, even 
when you specify a domain, are not domain specific. What I mean is 
that a user connected to B subscribes to messages for a domain that 
is mastered on A. However, when the subscription is forwarded to A, 
it matches messages from all domains, even those generated on B and 
sent to A. Does this make sense? Could this be part of the problem?

It boils down to the question if the oid and domain are ANDed?
(B is slave, A is master of Sport)

B-client: subscribe( oid=Hello domain=Sport )
- ends up in A as A is master of Sport

A-client: publish( oid=Hello domain= )
- is matched in A and forwarded to B and then to B-client

So, as you mentioned, the domain is not ANDed.
But I still can't see this as the reason for your filled-up callback 
queue.



Note:
If a published message is forwarded to another cluster node you will
see something like
<route>
  <node id='B' stratum='1' timestamp='119615134316000' dirtyRead='false'/>
  <node id='A' stratum='0' timestamp='119615134316400' dirtyRead='false'/>
</route>
in the publishQos.

best regards,
Marcel



Re: [xmlblaster] Callback message queue fills up

2007-11-28 Thread David R Robison
Could it be related to the fact that the message is published by a 
plugin? David


David R Robison wrote:
I don't see any routing information even though I know that it is 
being routed from one node to another. Do I need to turn it on some 
way? I am looking at the updateQos in the callback, should the route 
be there?

Thanks, David


Re: [xmlblaster] Callback message queue fills up

2007-11-27 Thread Marcel Ruff

David Robison wrote:
I think part of the problem might be that the subscriptions, even when 
you specify a domain, are not domain specific. What I mean is that a 
user connected to B subscribes to messages for a domain that is 
mastered on A. However, when the subscription is forwarded to A, it 
matches messages from all domains, even those generated on B and sent 
to A. Does this make sense? Could this be part of the problem?

It boils down to the question if the oid and domain are ANDed?
(B is slave, A is master of Sport)

B-client: subscribe( oid=Hello domain=Sport )
- ends up in A as A is master of Sport

A-client: publish( oid=Hello domain= )
- is matched in A and forwarded to B and then to B-client

So, as you mentioned, the domain is not ANDed.
But I still can't see this as the reason for your filled-up callback queue.
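
To make the ANDed-versus-not question concrete, here is a minimal Python sketch. It is not xmlBlaster's matching code: `Subscription`, `Message`, `matches_oid_only` and `matches_oid_and_domain` are hypothetical names that only contrast the two behaviours discussed above, using the oid=Hello / domain=Sport example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscription:
    oid: str
    domain: str


@dataclass(frozen=True)
class Message:
    oid: str
    domain: str  # empty string means no domain was set


def matches_oid_only(sub: Subscription, msg: Message) -> bool:
    """Behaviour described in the thread: the domain decides which node
    receives the subscription, but it does not filter messages."""
    return sub.oid == msg.oid


def matches_oid_and_domain(sub: Subscription, msg: Message) -> bool:
    """Hypothetical ANDed behaviour: oid and domain must both agree."""
    return sub.oid == msg.oid and sub.domain == msg.domain


sub = Subscription(oid="Hello", domain="Sport")  # B-client subscribes
msg = Message(oid="Hello", domain="")            # A-client publishes

print(matches_oid_only(sub, msg))        # True: forwarded to B-client
print(matches_oid_and_domain(sub, msg))  # False: would be filtered out
```

With oid-only matching, any message with the same oid reaches the subscriber regardless of the domain it was published under.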


Note:
If a published message is forwarded to another cluster node you will
see something like
<route>
  <node id='B' stratum='1' timestamp='119615134316000' dirtyRead='false'/>
  <node id='A' stratum='0' timestamp='119615134316400' dirtyRead='false'/>
</route>
in the publishQos.

best regards,
Marcel



Re: [xmlblaster] Callback message queue fills up

2007-11-21 Thread Marcel Ruff

David R Robison wrote:

Thanks, See in line...

Marcel Ruff wrote:

Hi David,

do you have a jconsole to observe the two nodes?
I don't have a jconsole, but can I get the same using the admin messages? 
You can, but jconsole will save you (and me :-) a lot of time. Really, 
try to set up jconsole observation!

You need JDK 1.5 or 1.6 installed on your production nodes;
then you can just fire up jconsole.
This is how I do it: if the production node runs Windows, use RDP; if it 
runs UNIX, use nomachine (or X or VNC).

If you don't have graphical access to the production machines but you 
have ssh access, you can tunnel the jconsole data over ssh and start
jconsole locally on your desktop (no new security hole, just the 
existing ssh).

For ssh I can send you an example setup (private/public key exchange etc).

You need to configure the running xmlBlaster to allow jconsole access, see
http://www.xmlblaster.org/xmlBlaster/doc/requirements/admin.jmx.html

regards
Marcel


--
Marcel Ruff
http://www.xmlBlaster.org



Re: [xmlblaster] Callback message queue fills up

2007-11-21 Thread David R Robison
One other thought. Heartbeat messages are published on node B and 
subscribed to by clients on node A. Also, there are clients on node B 
that subscribe to messages on node A. However, it appears that the 
subscriptions the clients on node B are using are also matching the 
heartbeat messages from node B that have been sent to node A. Could I 
have some kind of circular queue? A message is posted on B, then sent to 
A because of a subscription by a client on A. Then it is sent back to B 
because of a subscription by a client on B for messages on A. Then the 
message gets sent back to A and the whole cycle repeats?


Could this be possible? David
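
The suspected ping-pong can be sketched in a few lines of Python. This is only an illustration of the idea, not how the xmlBlaster cluster is implemented: `forward` and `simulate` are hypothetical helpers, and the "route" is just a list of node names a message has already visited.

```python
def forward(route, next_node):
    """Return the extended route, or None if next_node was already visited."""
    if next_node in route:
        return None  # loop detected: do not forward again
    return route + [next_node]


def simulate(path, check_route):
    """Walk a message along `path` and count the hops actually taken."""
    route = [path[0]]
    hops = 0
    for node in path[1:]:
        if check_route:
            extended = forward(route, node)
            if extended is None:
                break
            route = extended
        else:
            route.append(node)
        hops += 1
    return hops, route


# Heartbeat published on B, forwarded to A, back to B, back to A, ...
ping_pong = ["B", "A", "B", "A", "B", "A"]

hops_unchecked, _ = simulate(ping_pong, check_route=False)
hops_checked, final_route = simulate(ping_pong, check_route=True)
print(hops_unchecked)             # 5 hops: the message keeps bouncing
print(hops_checked, final_route)  # 1 hop: stops once B would be revisited
```

If the nodes traversed are recorded with each message, a node can refuse to forward a message to a node it has already passed through, which is what breaks the cycle in this sketch.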

David R Robison wrote:

Thanks, See in line...

Marcel Ruff wrote:

Hi David,

do you have a jconsole to observe the two nodes?

I don't have a jconsole, but can I get the same using the admin messages?


If yes, please check the number of subscriptions the node A has 
forwarded to node B
(look into node B and check the number of subscriptions of client A) 
during such a case.

I will check.

In case the subscribeQos has set <multiSubscribe>true</multiSubscribe>
(which is the default) it could be that the subscriptions multiplied
during small connection errors and reconnects.
This is just a guess.
If it is the case please set multiSubscribe to false.

I believe that we set all to false.

Is there a high CPU load during the 1001 message case?

No

Are the heartbeat messages persistent messages?
Yes, but they only live 30 seconds. At any given time there should only 
be at most 2 in the history queue.

Was the client connected or offline during this message overflow?

No, the client was online.

Does your heartbeat have a unique id so that you can tell for sure if 
the same published message is cloned many times (try a peek on the 
callback queue with jconsole)?
No, but the content of the message has a timestamp so I knew they were 
duplicates. Can this be done with the admin messages?


A final option is to use the current svn xmlBlaster and switch on the 
checkpoint logging to get a better idea what is going on.
We will try this in house; unfortunately, the problem nodes are in a 
production environment.


And finally it could be a problem with your client not taking the 
callback messages.
Could be, but what I don't see is the queue gradually growing. 
Instead, it all-of-a-sudden appears to be full.


Another idea: The callback queue contains only a reference to the 
message.
If it expires, the message 'meat' is destroyed but the reference 
remains in the queue until it is looked at during delivery (and then 
thrown to garbage). Michele, could this be?
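
A toy model of that idea, in Python, shows how a queue of references can look full even though most message bodies have already expired. It is a sketch under assumptions and not xmlBlaster's queue code: `CallbackQueue`, `put`, `expire` and `deliver` are hypothetical names.

```python
from collections import deque


class CallbackQueue:
    def __init__(self):
        self.store = {}      # message id -> (content, expiry time in seconds)
        self.refs = deque()  # the queue itself holds only message ids

    def put(self, msg_id, content, expires_at):
        self.store[msg_id] = (content, expires_at)
        self.refs.append(msg_id)

    def expire(self, now):
        """Destroy expired message bodies; their references stay queued."""
        expired = [m for m, (_, exp) in self.store.items() if exp <= now]
        for msg_id in expired:
            del self.store[msg_id]

    def deliver(self):
        """Pop references until a live message is found; drop stale ones."""
        while self.refs:
            msg_id = self.refs.popleft()
            if msg_id in self.store:
                return self.store.pop(msg_id)[0]
        return None


q = CallbackQueue()
for i in range(3):
    # published 15 s apart, each with a 30 s lifetime
    q.put(i, f"heartbeat {i}", expires_at=30 + 15 * i)

q.expire(now=50)             # bodies 0 and 1 are destroyed
size_before = len(q.refs)    # 3: the queue still reports three entries
delivered = q.deliver()      # "heartbeat 2"
size_after = len(q.refs)     # 0: stale references were discarded on delivery
print(size_before, delivered, size_after)
```

In this model the reported queue size only shrinks when delivery actually walks the queue, so a stalled delivery would leave stale references counted as entries.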


thanks
Marcel


David R Robison wrote:
We are experiencing something strange in xmlBlaster 1.6.1. We have 
two nodes, node A subscribes to messages from node B. These are 
heartbeat messages and are generated every 15 seconds with a 
lifetime of 30 seconds. A client connects to node A and subscribes 
to the messages, node A then passes the subscription onto node B. 
Watching the callback message queue, everything seems to run well, 
at most 1 message in the queue waiting to be sent. It can run like 
this for days. Then, unexpectedly, the callback queue will show as 
being full (in this case 1001 messages). The queue contains many 
duplicated messages with different timestamps. From there, the 
server struggles to deliver the messages and keep the queue empty. 
The reader never seems to read enough messages to get the queue back 
down to zero. If I stop the client and reconnect, it will recreate 
its queue and be back to normal. I know this is a bit sketchy, but 
it is becoming a real problem for us.
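
As a rough sanity check on those numbers, the following Python sketch computes how many heartbeats can be unexpired at once. It assumes one publish every 15 seconds, a 30 second lifetime, and expiry exactly at the end of the lifetime; `max_live_messages` is a hypothetical helper, not an xmlBlaster function.

```python
import math


def max_live_messages(period_s, lifetime_s):
    """Upper bound on unexpired messages when one is published every
    period_s seconds and each one lives for lifetime_s seconds."""
    return math.ceil(lifetime_s / period_s)


live = max_live_messages(period_s=15, lifetime_s=30)
excess = 1001 - live  # queue entries not explained by live heartbeats
print(live)    # 2
print(excess)  # 999
```

Under these assumptions nearly all of the 1001 entries must be duplicates or references to expired messages rather than distinct live heartbeats.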


Any thoughts on what might be the problem? Any idea of where to 
start looking?


One more note: when the client is subscribing to heartbeats that are 
generated on node A, the client never fails in this manner, only when 
it is subscribing to node A for a message generated on node B.


Thanks, in advance,
David Robison







--

David R Robison
Open Roads Consulting, Inc.
708 S. Battlefield Blvd., Chesapeake, VA 23322
phone: (757) 546-3401
e-mail: [EMAIL PROTECTED]
web: http://openroadsconsulting.com
blog: http://therobe.blogspot.com
book: http://www.xulonpress.com/book_detail.php?id=2579






Re: [xmlblaster] Callback message queue fills up

2007-11-21 Thread Marcel Ruff

David R Robison wrote:
One other thought. Heartbeat messages are published on node B and 
subscribed to by clients on node A. Also, there are clients on node B 
that subscribe to messages on node A. However, it appears that the 
subscriptions the clients on node B are using are also matching the 
heartbeat messages from node B that have been sent to node A. Could I 
have some kind of circular queue? A message is posted on B then sent 
to A because a subscription by a client on A. Then sent back to B 
because of a subscription by a client on B for messages on A. Then the 
message gets sent back to A and the whole cycle repeats?

Could be, usually the cluster should prevent this  ...
The messages contain in their QoS the nodes traversed:

<qos>
  <sender>joe</sender>
  <route>
    <node id='bilbo' stratum='2' timestamp='34460239640'/>
    <node id='frodo' stratum='1' timestamp='34460239661'/>
    <node id='heron' stratum='0' timestamp='34460239590'/>
  </route>
</qos>

it would be nice to see the dump of such messages,
Use the jconsole or logging output from your receiving client or use the
message sniffer, e.g.:
java javaclients.simplereader.SimpleReaderGui -xpath //key 
-session.name simpleReader -passwd secret -protocol SOCKET 
-dispatch/connection/plugin/socket/hostname 192.168.1.25 -dumpToFile true
or peek the callback queue with administrative messages as described in 
one of your last posts,
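
Once a message dump is available, the route can be checked mechanically. The Python sketch below parses a QoS shaped like the example above with the standard library; `route_nodes` and `has_loop` are hypothetical helper names, and the XML layout is assumed to match the example rather than taken from a schema.

```python
import xml.etree.ElementTree as ET

QOS = """
<qos>
  <sender>joe</sender>
  <route>
    <node id='bilbo' stratum='2' timestamp='34460239640'/>
    <node id='frodo' stratum='1' timestamp='34460239661'/>
    <node id='heron' stratum='0' timestamp='34460239590'/>
  </route>
</qos>
"""


def route_nodes(qos_xml):
    """Return the node ids listed under <route>, or [] if there is none."""
    root = ET.fromstring(qos_xml.strip())
    return [node.get("id") for node in root.findall("./route/node")]


def has_loop(node_ids):
    """A node id that appears twice means the message revisited a node."""
    return len(node_ids) != len(set(node_ids))


nodes = route_nodes(QOS)
print(nodes)            # ['bilbo', 'frodo', 'heron']
print(has_loop(nodes))  # False
print(route_nodes("<qos><sender>joe</sender></qos>"))  # [] (no route info)
```

An empty list corresponds to a dump without any route information.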


thanks
Marcel



Could this be possible? David


--
Marcel Ruff
http://www.xmlBlaster.org



Re: [xmlblaster] Callback message queue fills up

2007-11-21 Thread David R Robison

Here is a dump of one of the messages:

<MsgUnit index='0'>
  <key oid='DomainHeartbeat-Albemarle911' contentMime='text/xml' 
       contentMimeExtended='1.0' domain='Albemarle911'/>
  <content size='46'>Domain Albemarle911 ALIVE at 11/21/07 09:48:43</content>
  <qos>
    <subscribable/>
    <sender>/node/Albemarle911/client/A-NATIVE-CLIENT-PLUGIN/-3</sender>
    <priority>MAX</priority>
    <subscribe id='__subId:StauntonSTC-XPATH119562846332900'/>
    <expiration lifeTime='3' remainingLife='22703' forceDestroy='true'/>
    <rcvTimestamp nanos='119565948261302'/>
    <queue index='0' size='1'/>
    <persistent>false</persistent>
    <isUpdate/>
  </qos>
</MsgUnit>

The message was created on node B and sent to node A because of a 
subscription on node A. But it is now in the callback queue on A to go 
back to B. Also, I have never seen the route data in the messages. Is 
there a way to turn this on?


David


Re: [xmlblaster] Callback message queue fills up

2007-11-21 Thread David Robison
I think part of the problem might be that the subscriptions, even when you 
specify a domain, are not domain specific. What I mean is that a user connected 
to B subscribes to messages for a domain that is mastered on A. However, when 
the subscription is forwarded to A, it matches messages from all domains, even 
those generated on B and sent to A. Does this make sense? Could this be part of 
the problem?

David
  _  

From: David R Robison [mailto:[EMAIL PROTECTED]
To: xmlblaster@server.xmlBlaster.org
Sent: Wed, 21 Nov 2007 10:41:10 -0500
Subject: Re: [xmlblaster] Callback message queue fills up

Here is a dunp of one of the messages:
  
  MsgUnit index='0'
   
   key oid='DomainHeartbeat-Albemarle911' contentMime='text/xml' 
  contentMimeExtended='1.0' domain='Albemarle911'/
content size='46'Domain Albemarle911 ALIVE at 11/21/07 
  09:48:43/content
   
   qos
subscribable/
sender/node/Albemarle911/client/A-NATIVE-CLIENT-PLUGIN/-3/sender
priorityMAX/priority
subscribe id='__subId:StauntonSTC-XPATH119562846332900'/
expiration lifeTime='3' remainingLife='22703' forceDestroy='true'/
rcvTimestamp nanos='119565948261302'/
queue index='0' size='1'/
persistentfalse/persistent
isUpdate/
   /qos
  /MsgUnit
  
  The message was created on node B and sent to node A because of a 
  subscription on node A. But it is now in the callback queue on A to go 
  back to B. Also, I have never seen the route data in the messages. Is 
  there a way to turn this on?
  
  David
  
  Marcel Ruff wrote:
   David R Robison wrote:
   One other thought. Heartbeat messages are published on node B and 
   subscribed to by clients on node A. Also, there are clients on node B 
   that subscribe to messages on node A. However, it appears that the 
   subscriptions the clients on node B are using are also matching the 
   heartbeat messages from node B that have been sent to node A. Could I 
   have some kind of circular queue? A message is posted on B then sent 
   to A because a subscription by a client on A. Then sent back to B 
   because of a subscription by a client on B for messages on A. Then 
   the message gets sent back to A and the whole cycle repeats?
   Could be, usually the cluster should prevent this  ...
   The messages contain in their QoS the nodes traversed:
  
   qos
 senderjoe/sender
 route
node id='bilbo' stratum='2' timestamp='34460239640'/
node id='frodo' stratum='1' timestamp='34460239661'/
node id='heron' stratum='0' timestamp='34460239590'/
 /route
   /qos
  
   it would be nice to see the dump of such messages,
   Use the jconsole or logging output from your receiving client or use the
   message sniffer, e.g.:
   java javaclients.simplereader.SimpleReaderGui -xpath //key 
   -session.name simpleReader -passwd secret -protocol SOCKET 
   -dispatch/connection/plugin/socket/hostname 192.168.1.25 -dumpToFile true
   or peek the callback queue with administrative messages as described 
   in one of your last posts,
  
   thanks
   Marcel
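Once a message dump is available, the route element described above can be checked mechanically. The following is a minimal sketch using only the JDK's XML parser; it assumes the QoS is available as a well-formed XML string shaped like the example, and the class and method names are hypothetical, not part of xmlBlaster.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RouteCheck {
    /** Returns the ids of the node elements inside route, in document order. */
    public static List<String> routeNodeIds(String qosXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(qosXml)));
            List<String> ids = new ArrayList<>();
            NodeList routes = doc.getElementsByTagName("route");
            for (int r = 0; r < routes.getLength(); r++) {
                NodeList nodes = ((Element) routes.item(r)).getElementsByTagName("node");
                for (int i = 0; i < nodes.getLength(); i++) {
                    ids.add(((Element) nodes.item(i)).getAttribute("id"));
                }
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalArgumentException("QoS is not well-formed XML", e);
        }
    }

    /** True if the message has already passed through the given node. */
    public static boolean alreadyVisited(String qosXml, String nodeId) {
        return routeNodeIds(qosXml).contains(nodeId);
    }

    public static void main(String[] args) {
        String qos = "<qos><sender>joe</sender><route>"
                + "<node id='bilbo' stratum='2' timestamp='34460239640'/>"
                + "<node id='frodo' stratum='1' timestamp='34460239661'/>"
                + "<node id='heron' stratum='0' timestamp='34460239590'/>"
                + "</route></qos>";
        System.out.println(routeNodeIds(qos));            // [bilbo, frodo, heron]
        System.out.println(alreadyVisited(qos, "frodo")); // true
    }
}
```

A QoS with no route element yields an empty list. That says nothing by itself about whether the message was forwarded, but it matches the symptom that no route data shows up in the dumps.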
  
  
   Could this be possible? David
  
   David R Robison wrote:
   Thanks, See in line...
  
   Marcel Ruff wrote:
   Hi David,
  
   do you have a jconsole to observe the two nodes?
   I don't have a jconsole, but can I get the same using the admin 
   messages?
  
   If yes, please check the number of subscriptions the node A has 
   forwarded to node B
   (look into node B and check the number of subscriptions of client 
   A) during such a case.
   In case the subscribeQos has set
   I will check.
  
   <multiSubscribe>true</multiSubscribe>
   I believe that we set all to false.
  
   (which is the default) it could be that the subscriptions multiplied
   during small connection errors and reconnects.
   This is just a guess.
   If it is the case please set multiSubscribe to false.
  
   Is there a high CPU load during the 1001 message case?
   No
   Are the heartbeat messages persistent messages?
   Yes, but they only live 30 seconds. At any given time there should 
   only be at most 2 in the history queue
   Was the client connected or offline during this message overflow?
   No, the client was online
   Does your heartbeat have a unique id so that you can tell for sure 
   if the same
   No, but the content of the message has a timestamp so I knew they 
   were duplicates
   published message is cloned many times (try a peek on the callback 
   queue with jconsole)?
   Can this be done with the admin messages?
  
   A final option is to use the current svn xmlBlaster and switch on 
   the checkpoint logging
   to get a better idea what is going on.
   We will try this in house, unfortunately, the problem nodes are in a 
   production environment.
  
   And finally it could be a problem with your client not taking the 
   callback messages.
   Could be, but what I don't see is the queue gradually growing. 
   Instead, it all-of-a-sudden appears to be full.
  
   Another idea

[xmlblaster] Callback message queue fills up

2007-11-20 Thread David R Robison
We are experiencing something strange in xmlBlaster 1.6.1. We have two 
nodes, node A subscribes to messages from node B. These are heartbeat 
messages and are generated every 15 seconds with a lifetime of 30 
seconds. A client connects to node A and subscribes to the messages, 
node A then passes the subscription onto node B. Watching the callback 
message queue, everything seems to run well, at most 1 message in the 
queue waiting to be sent. It can run like this for days. Then, 
unexpectedly, the callback queue will show as being full (in this case 
1001 messages). The queue contains many duplicated messages with 
different timestamps. From there, the server struggles to deliver the 
messages and keep the queue empty. The reader never seems to read enough 
messages to get the queue back down to zero. If I stop the client and 
reconnect, it will recreate its queue and be back to normal. I know this 
is a bit sketchy, but it is becoming a real problem for us.


Any thoughts on what might be the problem? Any idea of where to start 
looking?


One more note, when the client is subscribing to heartbeats that are 
generated on Node A, the client never fails in this manner, only when it 
is subscribing to node A for a message generated on node B.


Thanks, in advance,
David Robison

--

David R Robison
Open Roads Consulting, Inc.
708 S. Battlefield Blvd., Chesapeake, VA 23322
phone: (757) 546-3401
e-mail: [EMAIL PROTECTED]
web: http://openroadsconsulting.com
blog: http://therobe.blogspot.com
book: http://www.xulonpress.com/book_detail.php?id=2579






Re: [xmlblaster] Callback message queue fills up

2007-11-20 Thread Marcel Ruff

Hi David,

do you have a jconsole to observe the two nodes?

If yes, please check the number of subscriptions the node A has 
forwarded to node B
(look into node B and check the number of subscriptions of client A) 
during such a case.

In case the subscribeQos has set

<multiSubscribe>true</multiSubscribe>

(which is the default) it could be that the subscriptions multiplied
during small connection errors and reconnects.
This is just a guess.
If it is the case please set multiSubscribe to false.

Is there a high CPU load during the 1001 message case?
Are the heartbeat messages persistent messages?
Was the client connected or offline during this message overflow?
Does your heartbeat have a unique id so that you can tell for sure if the same
published message is cloned many times (try a peek on the callback queue with 
jconsole)?

A final option is to use the current svn xmlBlaster and switch on the 
checkpoint logging
to get a better idea what is going on.

And finally it could be a problem with your client not taking the callback 
messages.

Another idea: The callback queue contains only a reference to the message.
If it expires the message-'meat' is destroyed but the reference remains in the 
queue
until it is looked at during delivery (and then thrown to garbage), Michele, 
could this be?

thanks
Marcel
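The effect suggested in the last paragraph can be illustrated with a small model. This is only a sketch of the general idea (expired entries linger until delivery inspects them), not xmlBlaster's queue implementation; the class name, method names, and message ids are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class LazyExpiryQueue {
    /** A queue entry holds only a reference (an id) plus its expiry time. */
    private static final class Ref {
        final String id;
        final long expiresAtMillis;

        Ref(String id, long expiresAtMillis) {
            this.id = id;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Deque<Ref> refs = new ArrayDeque<>();

    public void put(String id, long expiresAtMillis) {
        refs.add(new Ref(id, expiresAtMillis));
    }

    /** Counts every reference, including those whose message has expired. */
    public int size() {
        return refs.size();
    }

    /**
     * Returns the next live id, or null if none is left.
     * Expired references are discarded only here, at delivery time.
     */
    public String take(long nowMillis) {
        Ref r;
        while ((r = refs.poll()) != null) {
            if (r.expiresAtMillis > nowMillis) {
                return r.id;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        LazyExpiryQueue q = new LazyExpiryQueue();
        // Heartbeats every 15 s with a 30 s lifetime, as in the report.
        q.put("hb-1", 30_000L);
        q.put("hb-2", 45_000L);
        q.put("hb-3", 60_000L);
        System.out.println(q.size());         // 3, although all have expired by t = 100 s
        System.out.println(q.take(100_000L)); // null
        System.out.println(q.size());         // 0
    }
}
```

In this model the reported queue size can be much larger than the number of deliverable messages whenever delivery stalls for a while, which is one way a queue could look full without gradually growing live content.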



--
Marcel Ruff
http://www.xmlBlaster.org



Re: [xmlblaster] Callback message queue fills up

2007-11-20 Thread David R Robison

Thanks, See in line...

Marcel Ruff wrote:

Hi David,

do you have a jconsole to observe the two nodes?

I don't have a jconsole, but can I get the same using the admin messages?


If yes, please check the number of subscriptions the node A has 
forwarded to node B
(look into node B and check the number of subscriptions of client A) 
during such a case.

In case the subscribeQos has set

I will check.


<multiSubscribe>true</multiSubscribe>

I believe that we set all to false.


(which is the default) it could be that the subscriptions multiplied
during small connection errors and reconnects.
This is just a guess.
If it is the case please set multiSubscribe to false.

Is there a high CPU load during the 1001 message case?

No

Are the heartbeat messages persistent messages?
Yes, but they only live 30 seconds. At any given time there should only 
be at most 2 in the history queue

Was the client connected or offline during this message overflow?

No, the client was online
Does your heartbeat have a unique id so that you can tell for sure if 
the same
No, but the content of the message has a timestamp so I knew they were 
duplicates
published message is cloned many times (try a peek on the callback 
queue with jconsole)?

Can this be done with the admin messages?


A final option is to use the current svn xmlBlaster and switch on the 
checkpoint logging

to get a better idea what is going on.
We will try this in house, unfortunately, the problem nodes are in a 
production environment.


And finally it could be a problem with your client not taking the 
callback messages.
Could be, but what I don't see is the queue gradually growing. Instead, 
it all-of-a-sudden appears to be full.


Another idea: The callback queue contains only a reference to the 
message.
If it expires the message-'meat' is destroyed but the reference 
remains in the queue
until it is looked at during delivery (and then thrown to garbage), 
Michele, could this be?


thanks
Marcel






