[
https://issues.apache.org/activemq/browse/AMQ-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_39783
]
Mario Siegenthaler commented on AMQ-1350:
-----------------------------------------
Note that the patch will not try to reacquire the lock; it'll just check
whether anybody else holds the lock and shut down in that case. We could also
try to check whether we still hold the lock and update it if necessary.
However, I fear that doing a SELECT FOR UPDATE every x seconds will kill/slow
down the database, because it'll result in thousands of locks. Or does the
database realize that we have already locked this row, making the statement a
no-op locking-wise? Also, is there a portable way to check for an existing
lock without being blocked in that case?
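
For illustration, here is a minimal sketch of such a keep-alive check (the
class and table names are my own assumptions; this is not the patch). It
re-runs the locking statement on the same connection/transaction that
originally acquired the row lock; inside that transaction the statement takes
no new lock, so it should not pile up locks on the database. If the
connection died (e.g. after a database bounce), the statement throws and we
know the lock is gone:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    // Hypothetical keep-alive: checks whether the lock acquired earlier on
    // lockConnection is still held, without taking any additional locks.
    public class LockKeepAlive {
        private final Connection lockConnection; // holds the original lock

        public LockKeepAlive(Connection lockConnection) {
            this.lockConnection = lockConnection;
        }

        // Returns true if we still hold the lock, false if it was lost.
        public boolean stillHoldsLock() {
            PreparedStatement ps = null;
            try {
                // Re-issuing SELECT ... FOR UPDATE inside the transaction
                // that already owns the row lock is a no-op locking-wise.
                ps = lockConnection.prepareStatement(
                        "SELECT ID FROM ACTIVEMQ_LOCK WHERE ID = 1 FOR UPDATE");
                ResultSet rs = ps.executeQuery();
                boolean held = rs.next();
                rs.close();
                return held;
            } catch (SQLException e) {
                // Connection is dead (e.g. database restart): the transaction
                // and its row lock are gone, so we no longer hold the lock.
                return false;
            } finally {
                if (ps != null) {
                    try { ps.close(); } catch (SQLException ignore) {}
                }
            }
        }
    }

On the portability question: SELECT ... FOR UPDATE NOWAIT returns immediately
instead of blocking on PostgreSQL and Oracle, but NOWAIT is not standard SQL,
so I don't think there is a fully portable non-blocking check.
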
> JDBC master/slave does not work properly with datasources that can reconnect
> to the database
> --------------------------------------------------------------------------------------------
>
> Key: AMQ-1350
> URL: https://issues.apache.org/activemq/browse/AMQ-1350
> Project: ActiveMQ
> Issue Type: Bug
> Components: Message Store
> Affects Versions: 5.x
> Environment: Linux x86_64, Sun jdk 1.6, Postgresql 8.2.4, c3p0 or
> other pooling datasources
> Reporter: Eric Anderson
> Attachments: activemq-master-slave.patch
>
>
> This problem involves the JDBC master/slave configuration when the db server
> is restarted, or when the brokers lose their JDBC connections for whatever
> reason temporarily, and when a datasource is in use that can re-establish
> stale connections prior to providing them to the broker.
> The problem lies with the JDBC locking strategy used to determine which
> broker is master and which are slaves (a sketch follows this description).
> Let's say there are two brokers, a master and a slave, and they've
> successfully initialized. If you restart the database server, the slave
> will catch an exception, because it was blocked attempting to get the lock
> when its connection dropped. The slave will then *retry* the process of
> getting the lock over and over again. Now, since the
> database was bounced, the *master* will have lost its lock in the
> activemq_lock table. However, with the current 4.x-5.x code, it will never
> "know" that it has lost the lock. There is no mechanism to check the lock
> state. So it will continue to think that it is the master and will leave all
> of its network connectors active.
> When the slave tries to acquire the lock now, if the datasource has restored
> connections to the now-restarted database server, it will succeed. The slave
> will come up as master, and there will be two masters active concurrently.
> Both masters should at this point be fully functional, as both will have
> datasources that can talk to the database server once again.
> I have tested this with c3p0 and verified that I get two masters after
> bouncing the database server. If, at that point, I kill the original slave
> broker, the original master still appears to be functioning normally. If,
> instead, I kill the original master broker, messages are still delivered via
> the original slave (now co-master). It does not seem to matter which broker
> the clients connect to - both work.
> There is no workaround that I can think of that would function correctly
> across multiple database bounces. If a slave's datasource cannot reconnect
> to the database, then after the first database server restart it will never
> be able to establish a connection to the db server in order to attempt to
> acquire the lock. This, combined with the fact that the JDBC master/slave
> topology has no favored brokers (all can be masters or slaves depending on
> start-up order and the failures that have occurred over time), means that a
> datasource that can do reconnects is required on all brokers. Therefore it
> would seem that in the JDBC master/slave topology a database restart or
> temporary loss of database connectivity will always result in multiple
> masters.
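
For illustration, the blocking lock-acquisition loop described above looks
roughly like this (an illustrative sketch, not ActiveMQ's actual code; the
class name and SQL are assumptions). Whichever broker's transaction holds the
row lock is master; every other broker blocks inside executeQuery(), and a
dropped connection surfaces as an SQLException that sends it back around the
retry loop:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import javax.sql.DataSource;

    // Hypothetical acquire-the-master-lock loop.
    public class MasterLockLoop {
        private final DataSource dataSource;

        public MasterLockLoop(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        // Blocks until this broker becomes master; retries on failure.
        public Connection acquireMasterLock() throws InterruptedException {
            while (true) {
                Connection c = null;
                try {
                    c = dataSource.getConnection();
                    c.setAutoCommit(false); // keep the tx (and lock) open
                    PreparedStatement ps = c.prepareStatement(
                        "SELECT ID FROM ACTIVEMQ_LOCK WHERE ID = 1 FOR UPDATE");
                    ps.executeQuery(); // blocks while another broker is master
                    ps.close();
                    return c; // row lock held: this broker is now master
                } catch (SQLException e) {
                    // e.g. the database was restarted while we were blocked;
                    // clean up and retry, as the report describes.
                    if (c != null) {
                        try { c.close(); } catch (SQLException ignore) {}
                    }
                    Thread.sleep(1000);
                }
            }
        }
    }

Note the asymmetry the report points out: the slave's failure path is
exercised (it loops and retries), but once acquireMasterLock() has returned,
nothing ever checks the connection again, so the master never notices that a
database restart has released its lock.
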
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.