On a related note, I suggest having the timeout fire at a random time within a 2-3 minute window (e.g. at 2 minutes there is a 25% chance of a reconnect that keeps increasing, and at 3 minutes it jumps to 75%). This should scatter reconnects during periods of high load, so that connections opened at the same moment don't all time out and reconnect at the same moment.
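To make the idea concrete, here is a minimal sketch of a jittered timeout. It makes a simplifying assumption: each connection draws its own deadline uniformly from the 2-3 minute window, rather than using the stepped 25%/75% probabilities mentioned above. The names `jittered_timeout` and `JitteredConnection` are hypothetical helpers for illustration only and are not part of SQLAlchemy.

```python
import random
import time


def jittered_timeout(low=120.0, high=180.0, rng=random):
    """Pick a timeout (in seconds) uniformly at random from [low, high]."""
    if low > high:
        raise ValueError("low must not exceed high")
    return rng.uniform(low, high)


class JitteredConnection:
    """Hypothetical wrapper that tracks one connection's age against
    its own randomly chosen deadline."""

    def __init__(self, low=120.0, high=180.0, clock=time.monotonic, rng=random):
        self._clock = clock
        self._low = low
        self._high = high
        self._rng = rng
        self.reset()

    def reset(self):
        # Call after (re)connecting: restart the age and draw a fresh deadline,
        # so connections that reconnect together drift apart again.
        self._started = self._clock()
        self._deadline = jittered_timeout(self._low, self._high, self._rng)

    def is_expired(self):
        # True once this connection has lived past its own deadline.
        return self._clock() - self._started >= self._deadline
```

Because every connection gets a different deadline, a batch of connections opened in the same second will expire spread across the whole minute instead of all at once.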
A property I managed once had an issue where a link wasn't fronted through the CDN as intended and hit the origin servers directly (1-3 million requests within 30 minutes to a cluster of 4 nodes sharing a MySQL master). There was a perfect storm of timing in which many simultaneous timeouts and reconnects ended up DDoSing the database and crashing MySQL; the barrage of reconnects also kept MySQL from restarting correctly. Variants of this scenario are amazingly common in publishing, and a dozen major sites get blackouts from it each year. Two sites I know of also had links to their bare origin in server/CDN error pages, so causing an error brought even more damage.