On a related note, I suggest having the timeout fire at a random time 
within a 2-3 minute window rather than at a fixed time (e.g. starting at 
2 minutes there is a 25% chance of a reconnect, which increases until it 
reaches 75% at 3 minutes).  This should scatter reconnects during periods 
of high load, so they don't all land at the same moment.
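
To make the idea concrete, here is a rough sketch in plain Python.  It is 
not SQLAlchemy's API and it doesn't reproduce the exact percentages above; 
it takes the simpler route of giving each connection its own lifetime, 
drawn uniformly from the 2-3 minute window.  The names (pick_lifetime, 
should_reconnect, MIN_LIFETIME, MAX_LIFETIME) are made up for illustration.

```python
import random

# Assumed window, taken from the 2-3 minute suggestion above (in seconds).
MIN_LIFETIME = 120.0
MAX_LIFETIME = 180.0


def pick_lifetime(rng=random):
    """Draw a per-connection lifetime uniformly from the window.

    Each connection gets its own value when it is created, so connections
    opened at the same moment do not all expire at the same moment.
    """
    return rng.uniform(MIN_LIFETIME, MAX_LIFETIME)


def should_reconnect(age, lifetime):
    """Return True once a connection's age has reached its own lifetime."""
    return age >= lifetime


if __name__ == "__main__":
    # Ten connections opened together end up with ten different deadlines.
    lifetimes = sorted(pick_lifetime() for _ in range(10))
    print(["%.1f" % t for t in lifetimes])
```

With a fixed timeout, every connection opened during a traffic spike 
expires together; with a per-connection draw like this, the reconnects 
are spread over the whole minute.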

A property I managed once had an issue where a link wasn't fronted by 
the CDN as intended and hit the origin servers directly [1-3 million 
requests within 30 minutes to a cluster of 4 nodes sharing a MySQL master]. 
 In a perfect storm of timing, many simultaneous timeouts and reconnects 
ended up DDoS'ing the database and crashing MySQL; the barrage of 
reconnects also kept MySQL from restarting correctly.  Variants of this 
scenario are amazingly common in publishing, and a dozen major sites get 
blackouts from it each year.  Two sites I know of also had links to their 
bare origin in their server/CDN error pages, so causing an error brought 
even more damage.
