I recently came across a bug report about configuring a ZEO client storage to list multiple ZEO servers (https://bugs.launchpad.net/zope2/+bug/143843). I had not realized that was possible, so I tried it by creating a second ZEO server instance with a copy of the Data.fs from the first instance. I then added a second storage server like so:

    ...
    <zeoclient>
      server localhost:9997
      server localhost:9998
      ...
To my amazement, the client initially connected to localhost:9997, and when I shut down that server, the client almost instantly connected to localhost:9998. I could keep switching them off and on, and the client switched back and forth. I immediately realized that hot failover might be a lot easier than I expected.

However, with more testing I ran into an issue in zodb.ZEO.ClientStorage.ClientStorage.verify_cache when there are transactions recorded in the client cache that were not synced to the secondary ZEO server:

    elif server_tid < cache_tid:
        message = ("%s Client has seen newer transactions than server!"
                   % self.__name__)
        logger.critical(message)
        raise ClientStorageError(message)

Would it be so bad to do something like the following?

    elif server_tid < cache_tid:
        message = ("%s Client has seen newer transactions than server!"
                   % self.__name__)
        logger.critical(message)
        self._cache.clear()
        raise ClientStorageError(message)

That way an error is still raised and logged, but the cache is cleared so that the client reconnects on the second try.

My rationale for this change is that if you're doing a hot failover, one of two things is true:

a) Something bad has happened to the main server, and the recovery of those transactions probably won't happen anyway.

b) It happened during a maintenance window, or the failover is happening for convenience, and any differences between the two servers are probably minor, such as session data.

With b), the server admin would be in a position to restore the main server anyway.

Here is where the change took place:

http://svn.zope.org/ZODB/trunk/src/ZEO/ClientStorage.py?rev=93195&view=rev

I noticed that the rationale was to handle 'an odd edge case', and in the tests the comment is that 'It is bad if a client has newer data than the server'. If the edge case makes this proposed change a bad idea, would it be reasonable to have the self._cache.clear call as an optional, configurable feature?
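To make the optional variant concrete, here is a minimal standalone sketch of the idea. It does not use ZEO at all: FakeCache, the clear_on_newer_cache flag, the integer transaction ids, and the returned status strings are all hypothetical stand-ins invented for illustration, and the real verify_cache does considerably more than this.

```python
import logging

logger = logging.getLogger("zeo_failover_sketch")


class ClientStorageError(Exception):
    """Local stand-in for ZEO's ClientStorageError (not the real class)."""


class FakeCache:
    """Hypothetical in-memory stand-in for the ZEO client cache."""

    def __init__(self, last_tid=None, entries=None):
        self.last_tid = last_tid          # last transaction id seen by the client
        self.entries = dict(entries or {})

    def clear(self):
        # Forget all cached data, including the last transaction id.
        self.entries.clear()
        self.last_tid = None


def verify_cache(name, cache, server_tid, clear_on_newer_cache=False):
    """Compare the cache's last tid with the server's last tid.

    clear_on_newer_cache is the hypothetical opt-in switch: when the
    client has seen newer transactions than the server, the error is
    still logged and raised, but the cache is cleared first so that a
    later attempt starts from an empty cache.
    """
    cache_tid = cache.last_tid
    if cache_tid is None:
        return "empty cache"
    if server_tid < cache_tid:
        message = "%s Client has seen newer transactions than server!" % name
        logger.critical(message)
        if clear_on_newer_cache:
            cache.clear()
        raise ClientStorageError(message)
    if server_tid == cache_tid:
        return "in sync"
    return "server ahead"
```

With the flag left at its default, the behaviour matches the current code: the error is raised on every attempt for as long as the cache is ahead of the server. With the flag set, the first attempt still fails loudly, but the next one sees an empty cache and can proceed.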
_______________________________________________
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list - ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev