[ https://issues.apache.org/activemq/browse/AMQ-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_40553 ]
Manish Bellani commented on AMQ-1350:
-------------------------------------

How about using DistributedLock from JGroups to make the master/slave work, or something similar?

> JDBC master/slave does not work properly with datasources that can reconnect
> to the database
> --------------------------------------------------------------------------------------------
>
>                 Key: AMQ-1350
>                 URL: https://issues.apache.org/activemq/browse/AMQ-1350
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.2.0
>         Environment: Linux x86_64, Sun JDK 1.6, PostgreSQL 8.2.4, c3p0 or
>                      other pooling datasources
>            Reporter: Eric Anderson
>             Fix For: 5.2.0
>
>         Attachments: activemq-master-slave.patch
>
>
> This problem involves the JDBC master/slave configuration when the db server
> is restarted, or when the brokers temporarily lose their JDBC connections for
> whatever reason, and when a datasource is in use that can re-establish stale
> connections before handing them to the broker.
>
> The problem lies with the JDBC locking strategy used to determine which
> broker is the master and which are slaves. Say there are two brokers, a
> master and a slave, and they have successfully initialized. If you restart
> the database server, the slave will throw an exception because it has just
> caught an exception while blocked attempting to get the lock. The slave will
> then *retry* the process of getting the lock over and over again. Now, since
> the database was bounced, the *master* will have lost its lock in the
> activemq_lock table. However, with the current 4.x-5.x code, it will never
> "know" that it has lost the lock: there is no mechanism to check the lock
> state. So it will continue to think that it is the master and will leave all
> of its network connectors active.
>
> When the slave now tries to acquire the lock, it will succeed if the
> datasource has restored connections to the restarted database server.
> The slave will come up as master, and there will be two masters active
> concurrently. Both masters should at this point be fully functional, as both
> will have datasources that can talk to the database server once again.
>
> I have tested this with c3p0 and verified that I get two masters after
> bouncing the database server. If, at that point, I kill the original slave
> broker, the original master still appears to function normally. If, instead,
> I kill the original master broker, messages are still delivered via the
> original slave (now co-master). It does not seem to matter which broker the
> clients connect to - both work.
>
> There is no workaround that I can think of that would function correctly
> across multiple database bounces. If a slave's datasource cannot do database
> reconnects, then after the first database server restart it will never be
> able to establish a connection to the db server in order to attempt to
> acquire the lock. This, combined with the fact that the JDBC master/slave
> topology has no favored brokers -- all can be masters or slaves depending on
> start-up order and the failures that have occurred over time -- means that a
> datasource that can do reconnects is required on all brokers. It would
> therefore seem that in the JDBC master/slave topology, a database restart or
> temporary loss of database connectivity will always result in multiple
> masters.

-- 
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
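To make the missing mechanism concrete: the failure mode described above is that the master acquires the lock once and never re-verifies it, so a database bounce silently invalidates its claim. Below is a minimal sketch of a lease-style lock with the kind of "do I still hold it?" check the reporter says is absent. This is an illustration only, in Python with sqlite3 for self-containment; the single-row table, the `broker_name` column, and the update-based claim are all assumptions for the sketch, not ActiveMQ's actual locking scheme (which holds a row lock inside a long-lived JDBC transaction).

```python
import os
import sqlite3
import tempfile

# Hypothetical lock table loosely modeled on ACTIVEMQ_LOCK (assumption:
# a single row whose broker_name column records the current lock holder).
db_path = os.path.join(tempfile.mkdtemp(), "lock.db")
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE activemq_lock (id INTEGER PRIMARY KEY, broker_name TEXT)")
conn.execute("INSERT INTO activemq_lock VALUES (1, NULL)")
conn.commit()

def try_acquire(conn, broker):
    """Claim the lock row only if no broker currently holds it."""
    cur = conn.execute(
        "UPDATE activemq_lock SET broker_name = ? "
        "WHERE id = 1 AND broker_name IS NULL",
        (broker,))
    conn.commit()
    return cur.rowcount == 1  # 1 row updated means we won the lock

def still_holds(conn, broker):
    """The periodic keep-alive check the bug report says is missing:
    re-read the lock row and confirm it still names us as holder."""
    row = conn.execute(
        "SELECT broker_name FROM activemq_lock WHERE id = 1").fetchone()
    return row[0] == broker

assert try_acquire(conn, "brokerA")        # master wins the lock
assert not try_acquire(conn, "brokerB")    # slave fails and keeps retrying

# Simulate the database bounce wiping in-flight lock state.
conn.execute("UPDATE activemq_lock SET broker_name = NULL WHERE id = 1")
conn.commit()

assert try_acquire(conn, "brokerB")        # the old slave becomes master
assert not still_holds(conn, "brokerA")    # this check would demote brokerA;
                                           # without it, two masters coexist
```

Run on a schedule, a check like `still_holds` would let the stale master shut down its transport connectors instead of running alongside the promoted slave, which is essentially the direction the attached patch takes.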