[jira] Updated: (AMQ-2102) Master/slave out of sync with multiple consumers
[ https://issues.apache.org/activemq/browse/AMQ-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Tully updated AMQ-2102:

    Attachment: AMQ2102.12-03.patch

Yes Ying, I came to the same conclusion. The message needs to be pulled directly from the store when the pageSize is full and all subscriptions are also full. This patch is a tidier version of the original. Can you validate it on your side? It does not need as many changes to the MasterBroker async sends, as they will impede performance. Also, there is a suppression of dispatchNotifications when a subscription is removed.

> Master/slave out of sync with multiple consumers
>
> Key: AMQ-2102
> URL: https://issues.apache.org/activemq/browse/AMQ-2102
> Project: ActiveMQ
> Issue Type: Bug
> Components: Broker
> Affects Versions: 5.2.0
> Reporter: Dan James
> Assignee: Gary Tully
> Attachments: AMQ-2102-03102009.patch, AMQ2102.12-03.patch, master.xml, MasterSlaveBug.java, MasterSlavePatch.patch, slave.xml, slaveDispatchOnNotification.patch
>
> I'm seeing exceptions like this in a simple master/slave setup:
> ERROR Service- Async error occurred:
> javax.jms.JMSException: Slave broker out of sync with master: Dispatched message (ID:DUL1SJAMES-L2-1231-1233929569359-0:4:1:1:207) was not in the pending list for MasterSlaveBug
> javax.jms.JMSException: Slave broker out of sync with master: Dispatched message (ID:DUL1SJAMES-L2-1231-1233929569359-0:4:1:1:207) was not in the pending list for MasterSlaveBug
>
> The problem only happens when there are multiple consumers listening to the queue, and is more likely to occur as there are more consumers listening.
>
> I've written a test program that demonstrates the problem.
> I start the master and slave with an empty data directory and let them both start up and settle. Then I start the test program. The test program creates a specified number of consumers, and then starts queuing 256 messages. The consumers process each message by sending a reply. The producer counts the replies. Both consumers and the producer see all the messages, but with multiple consumers it is very likely that the error above will occur and several of the messages will still be queued on the slave.
>
> While debugging through the activemq code, I noticed that both the master and the slave dispatch the message to a consumer's pending list independently. In other words, it is possible that the master will add the message to consumer A's pending list and the slave will add the message to consumer B's pending list. Once the message has been processed by consumer A, the master sends a message to the slave which specifies consumer A so that the slave can remove the message. The slave looks on its copy of consumer A's pending list and cannot find the message. As a result, it throws this exception and the message stays stuck on consumer B's pending list on the slave.
>
> Master and slave configurations along with MasterSlaveBug.java are attached to this issue.
>
> Start master and slave brokers:
>     activemq xbean:master.xml
>     activemq xbean:slave.xml
> Run with (only one consumer, the bug does not appear):
>     java -classpath .:activemq-all-5.2.0.jar MasterSlaveBug 1
> Run with (sixteen consumers, the bug does appear):
>     java -classpath .:activemq-all-5.2.0.jar MasterSlaveBug 16

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
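The divergence described in the issue can be reproduced in miniature. The following is only a sketch under stated assumptions: `ToyBroker`, its `pending` map, and the message id `msg-207` are hypothetical stand-ins invented for illustration and are not ActiveMQ classes. It shows a slave that has made its own dispatch choice and then receives a dispatch notification naming the consumer the master chose.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class DivergentDispatchSketch {

    /** Hypothetical toy broker: one pending list per consumer. Not ActiveMQ code. */
    static final class ToyBroker {
        final Map<String, Set<String>> pending = new LinkedHashMap<>();

        ToyBroker(String... consumers) {
            for (String c : consumers) {
                pending.put(c, new LinkedHashSet<>());
            }
        }

        /** The broker's own, independent dispatch decision. */
        void dispatch(String messageId, String consumer) {
            pending.get(consumer).add(messageId);
        }

        /** Slave side: the master reports which consumer it dispatched to. */
        void onDispatchNotification(String messageId, String consumer) {
            if (!pending.get(consumer).remove(messageId)) {
                throw new IllegalStateException(
                        "Slave broker out of sync with master: Dispatched message ("
                        + messageId + ") was not in the pending list for " + consumer);
            }
        }
    }

    /** Returns the error text if the slave went out of sync, or null if it stayed in sync. */
    static String run(String masterChoice, String slaveChoice) {
        ToyBroker slave = new ToyBroker("A", "B");
        slave.dispatch("msg-207", slaveChoice);                    // slave picks on its own
        try {
            slave.onDispatchNotification("msg-207", masterChoice); // master's pick arrives
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        String same = run("A", "A");
        String different = run("A", "B");
        System.out.println(same == null ? "in sync" : same);
        System.out.println(different == null ? "in sync" : different);
    }
}
```

With a single consumer both brokers can only make the same choice, which is consistent with the report that the bug does not appear with one consumer.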
[jira] Updated: (AMQ-2102) Master/slave out of sync with multiple consumers
[ https://issues.apache.org/activemq/browse/AMQ-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ying updated AMQ-2102:

    Attachment: AMQ-2102-03102009.patch

The attached AMQ-2102-03102009.patch seems to fix all the slave-broker out-of-sync issues. This patch was created against the tagged source of the official ActiveMQ 5.2.0 release. Sorry, I am on a tight schedule and don't have time to patch trunk. One thing missing from this patch is:
{{{
public MessageReference getMessage(MessageId id) throws IOException {
    Message msg = this.store.getMessage(id);
    return msg;
}
}}}
in org.apache.activemq.broker.region.cursors.QueueStorePrefetch.java

Here is some explanation:
1. The MasterBroker.java changes make sure that each command arrives on the slave side in the same order. This is necessary because, for each message, the message itself, the dispatch notification, and the message ack have to arrive in exactly that order; otherwise the slave goes out of sync.
2. Queue.java: if a dispatch notification arrives and the message is not yet in the pending list or pagedIn, it goes to the store to fetch the message, adds it to pending, and then does processMessageDispatchNotification. This is necessary because multiple consumers are involved (e.g. the dispatch notification for message 201 could arrive first while only 200 messages on the slave side are paged in or in the pending list).

Please review it. I urgently need a review of this to make sure the changes are fine.

A remaining issue, which might or might not be related to this fix: I see consumers stop consuming messages when a large number of messages is produced and consumed, and I have to restart the broker pair; otherwise the brokers seem to hang. Do you have any insight into what might go wrong?
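Point 2 of ying's explanation (falling back to the store when a dispatch notification names a message that is not yet paged in) can be sketched roughly as follows. This is an illustration under assumptions, not the patch itself: `ToySlaveQueue`, its `store`, `pagedIn` and `dispatched` collections, and the page size used in `main` are all hypothetical names and values chosen to mirror the "message 201 with only 200 paged in" example.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class StoreFallbackSketch {

    /** Hypothetical toy slave queue; names are illustrative, not ActiveMQ's. */
    static final class ToySlaveQueue {
        private final int pageSize;
        private final Map<String, String> store = new LinkedHashMap<>();   // every persisted message
        private final Map<String, String> pagedIn = new LinkedHashMap<>(); // subset held in memory
        private final Set<String> dispatched = new LinkedHashSet<>();

        ToySlaveQueue(int pageSize) {
            this.pageSize = pageSize;
        }

        /** A message replicated from the master: always stored, paged in only if there is room. */
        void send(String id, String body) {
            store.put(id, body);
            if (pagedIn.size() < pageSize) {
                pagedIn.put(id, body);
            }
        }

        boolean isPagedIn(String id) {
            return pagedIn.containsKey(id);
        }

        /** Returns true if the notification could be honoured, false if the message is unknown. */
        boolean onDispatchNotification(String id) {
            String body = pagedIn.remove(id);
            if (body == null) {
                body = store.get(id);   // fallback: fetch straight from the store
            }
            if (body == null) {
                return false;           // not even in the store: genuinely out of sync
            }
            return dispatched.add(id);  // false if this id was already dispatched
        }
    }

    public static void main(String[] args) {
        ToySlaveQueue queue = new ToySlaveQueue(200);
        for (int i = 1; i <= 201; i++) {
            queue.send("msg-" + i, "body-" + i);
        }
        System.out.println("msg-201 paged in: " + queue.isPagedIn("msg-201"));
        System.out.println("notification honoured: " + queue.onDispatchNotification("msg-201"));
    }
}
```

The sketch leaves out acknowledgements and removal from the store; it only shows why a lookup limited to the paged-in set would reject a notification that a store lookup can satisfy.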
[jira] Updated: (AMQ-2102) Master/slave out of sync with multiple consumers
[ https://issues.apache.org/activemq/browse/AMQ-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Tully updated AMQ-2102:

    Attachment: slaveDispatchOnNotification.patch

This patch defers dispatch on a slave until the dispatch notification arrives, so that the slave can honor the dispatch decision made on the master. However, if the master and slave are kept in sync with respect to consumer additions, it is not needed. I am adding it here as a placeholder in case we still need to refactor in this manner.
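The idea of deferring dispatch on the slave until the master's notification arrives might look roughly like the sketch below. It is not taken from slaveDispatchOnNotification.patch; `ToySlave`, `undispatched`, `assignedTo` and the method names are hypothetical, chosen only to show that the slave never picks a consumer itself.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class DeferredDispatchSketch {

    /** Hypothetical toy slave that makes no dispatch decisions of its own. */
    static final class ToySlave {
        private final Set<String> undispatched = new LinkedHashSet<>();
        private final Map<String, String> assignedTo = new LinkedHashMap<>(); // messageId -> consumer

        /** A replicated message is only queued; no consumer is chosen here. */
        void onMessage(String messageId) {
            undispatched.add(messageId);
        }

        /** The master's decision is applied as-is. */
        void onDispatchNotification(String messageId, String consumer) {
            if (!undispatched.remove(messageId)) {
                throw new IllegalStateException("Unknown or already dispatched: " + messageId);
            }
            assignedTo.put(messageId, consumer);
        }

        /** An ack removes the message from the consumer the master named. */
        void onAck(String messageId, String consumer) {
            if (!consumer.equals(assignedTo.get(messageId))) {
                throw new IllegalStateException(messageId + " is not pending for " + consumer);
            }
            assignedTo.remove(messageId);
        }

        /** Returns the consumer the message is pending for, or null if none. */
        String consumerFor(String messageId) {
            return assignedTo.get(messageId);
        }
    }

    public static void main(String[] args) {
        ToySlave slave = new ToySlave();
        slave.onMessage("msg-1");
        slave.onDispatchNotification("msg-1", "A"); // master chose consumer A
        System.out.println("pending for: " + slave.consumerFor("msg-1"));
        slave.onAck("msg-1", "A");
        System.out.println("pending for: " + slave.consumerFor("msg-1"));
    }
}
```

Because the slave only mirrors the master's choice, the mismatch between consumer A and consumer B described in the issue cannot arise in this toy model, at the cost of the slave holding messages undispatched until it is told where they went.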
[jira] Updated: (AMQ-2102) Master/slave out of sync with multiple consumers
[ https://issues.apache.org/activemq/browse/AMQ-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ying updated AMQ-2102:

    Attachment: MasterSlavePatch.patch
[jira] Updated: (AMQ-2102) Master/slave out of sync with multiple consumers
[ https://issues.apache.org/activemq/browse/AMQ-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dan James updated AMQ-2102:

    Attachment: master.xml
[jira] Updated: (AMQ-2102) Master/slave out of sync with multiple consumers
[ https://issues.apache.org/activemq/browse/AMQ-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dan James updated AMQ-2102:

    Attachment: slave.xml
[jira] Updated: (AMQ-2102) Master/slave out of sync with multiple consumers
[ https://issues.apache.org/activemq/browse/AMQ-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dan James updated AMQ-2102:

    Description: the earlier text is unchanged; the following was appended to it:

Master and slave configurations along with MasterSlaveBug.java are attached to this issue.

Start master and slave brokers:
    activemq xbean:master.xml
    activemq xbean:slave.xml
Run with (only one consumer, the bug does not appear):
    java -classpath .:activemq-all-5.2.0.jar MasterSlaveBug 1
Run with (sixteen consumers, the bug does appear):
    java -classpath .:activemq-all-5.2.0.jar MasterSlaveBug 16
[jira] Updated: (AMQ-2102) Master/slave out of sync with multiple consumers
[ https://issues.apache.org/activemq/browse/AMQ-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dan James updated AMQ-2102:

    Attachment: MasterSlaveBug.java