[jira] Commented: (AMQ-1350) JDBC master/slave does not work properly with datasources that can reconnect to the database

Mario Siegenthaler (JIRA) Sat, 28 Jul 2007 04:47:11 -0700

    [ 
https://issues.apache.org/activemq/browse/AMQ-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_39780
 ]


Mario Siegenthaler commented on AMQ-1350:
-----------------------------------------

We've also experienced this behavior on a 4.1.1 master/slave configuration 
using SQL Server. The master somehow lost the lock during a database 
maintenance operation (we suspect a DB admin killed the lock in order to 
back up the database) and we ended up with two masters.

> JDBC master/slave does not work properly with datasources that can reconnect 
> to the database
> --------------------------------------------------------------------------------------------
>
>                 Key: AMQ-1350
>                 URL: https://issues.apache.org/activemq/browse/AMQ-1350
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.x
>         Environment: Linux x86_64, Sun jdk 1.6, Postgresql 8.2.4, c3p0 or 
> other pooling datasources
>            Reporter: Eric Anderson
>
> This problem involves the JDBC master/slave configuration when the db server 
> is restarted, or when the brokers lose their JDBC connections for whatever 
> reason temporarily, and when a datasource is in use that can re-establish 
> stale connections prior to providing them to the broker.
> The problem lies with the JDBC locking strategy used to determine which 
> broker is master and which are slaves.  Let's say there are two brokers, a 
> master and a slave, and they've successfully initialized.  If you restart the 
> database server, the slave will throw an exception because it's just caught 
> an exception while blocked attempting to get the lock.  The slave will then 
> *retry* the process of getting a lock over and over again.  Now, since the 
> database was bounced, the *master* will have lost its lock in the 
> activemq_lock table.  However, with the current 4.x-5.x code, it will never 
> "know" that it has lost the lock.  There is no mechanism to check the lock 
> state.  So it will continue to think that it is the master and will leave all 
> of its network connectors active.
> When the slave tries to acquire the lock now, if the datasource has restored 
> connections to the now-restarted database server, it will succeed.  The slave 
> will come up as master, and there will be two masters active concurrently.  
> Both masters should at this point be fully-functional, as both will have 
> datasources that can talk to the database server once again.
> I have tested this with c3p0 and verified that I get two masters after 
> bouncing the database server.  If, at that point, I kill the original slave 
> broker, the original master still appears to be functioning normally.  If, 
> instead, I kill the original master broker, messages are still delivered via 
> the original slave (now co-master).  It does not seem to matter which broker 
> the clients connect to - both work.
> There is no workaround that I can think of that would function correctly 
> across multiple database bounces.  If a slave's datasource does not have the 
> functionality to do database reconnects, then, after the first database 
> server restart, it will never be able to establish a connection to the db 
> server in order to attempt to acquire the lock.  This, combined with the fact 
> that the JDBC master/slave topology does not have any favored brokers -- all 
> can be masters or slaves depending on start-up order and the failures that 
> have occurred over time, means that a datasource that can do reconnects is 
> required on all brokers.  Therefore it would seem that in the JDBC 
> master/slave topology a database restart or temporary loss of database 
> connectivity will always result in multiple masters.
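
The description above says the master has no mechanism to check the lock 
state. A minimal sketch of what such a check could look like follows. It is 
not ActiveMQ's locker API: the class name LockKeepAlive, the lockStillHeld 
probe and the onLockLost callback are all hypothetical. The probe is assumed 
to run on the same connection that originally took the lock, since a fresh 
pooled connection says nothing about the old lock.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of a master-side lock check; not ActiveMQ code. */
final class LockKeepAlive {
    // Assumed probe: returns true only if the lock taken at start-up is
    // still held, e.g. by touching the activemq_lock row on the original
    // locking connection.
    private final BooleanSupplier lockStillHeld;
    // Assumed callback: whatever demotes the broker, e.g. stopping its
    // transport and network connectors.
    private final Runnable onLockLost;
    private final AtomicBoolean master = new AtomicBoolean(true);

    LockKeepAlive(BooleanSupplier lockStillHeld, Runnable onLockLost) {
        this.lockStillHeld = lockStillHeld;
        this.onLockLost = onLockLost;
    }

    /** Runs one check; returns true if the broker may keep acting as master. */
    boolean checkOnce() {
        if (!master.get()) {
            return false; // already demoted; never silently re-promote
        }
        boolean held;
        try {
            held = lockStillHeld.getAsBoolean();
        } catch (RuntimeException e) {
            held = false; // a failed probe is treated as a lost lock
        }
        if (!held && master.compareAndSet(true, false)) {
            onLockLost.run(); // runs at most once
        }
        return held;
    }

    boolean isMaster() {
        return master.get();
    }
}
```

A broker would call checkOnce() periodically, for example from a 
ScheduledExecutorService. Once the lock is seen as lost the broker stays 
demoted even if a later probe succeeds, so it would have to go back through 
the normal lock acquisition to become master again.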

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
