I'm thinking about error handling for HA backups and am interested in opinions on the right way to go.
Presently, if a backup encounters an exception while replicating, it shuts down. That avoids possibly spinning on reconnect attempts and allows a new, hopefully error-free replica to be started. Is there a softer option? In theory a broker could just reset and restart replication on the queue that failed. Is that desirable, or does it just mask the fact that something has gone wrong? A replica doing such a restart is no longer "ready": there are messages in the restarted queue that have not been delayed, so failing over to this backup before it catches up could lose messages.
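
To make the two options concrete, here is a rough sketch in Python. It is only a toy model: `Policy`, `Backup`, `caught_up` and the method names are all made up for illustration and are not the broker's actual classes or API. It just shows how the softer option would have to drop the "ready" state until the restarted queue catches up.

```python
from enum import Enum


class Policy(Enum):
    SHUTDOWN = "shutdown"        # current behaviour: stop on any replication error
    RESET_QUEUE = "reset-queue"  # softer option: restart only the failed queue


class Backup:
    """Toy model of a backup broker (hypothetical, not the real implementation)."""

    def __init__(self, queues, policy):
        self.policy = policy
        self.running = True
        # Per-queue flag: True once the replica has caught up with the primary.
        self.caught_up = {q: True for q in queues}

    @property
    def ready(self):
        # Only safe to fail over to while running and every queue has caught up.
        return self.running and all(self.caught_up.values())

    def on_replication_error(self, queue):
        if self.policy is Policy.SHUTDOWN:
            # Stop entirely so a fresh replica can be started in its place.
            self.running = False
        else:
            # Restart replication for this queue only; not ready until it catches up.
            self.caught_up[queue] = False

    def on_caught_up(self, queue):
        self.caught_up[queue] = True
```

With `Policy.RESET_QUEUE` the backup stays up after an error but reports `ready` as false until `on_caught_up` is called for the failed queue, which is exactly the window in which a fail-over could lose messages.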
