Thanks Michael, we will redo the test using psycopg2 directly and see what behavior we get there. Will report back here when done.
On Fri, Oct 17, 2014 at 6:03 PM, Michael Bayer <mike...@zzzcomputing.com> wrote: > > On Oct 17, 2014, at 4:47 AM, Jonas Oscarsson <jonas....@rsson.nu> wrote: > > Hi, > > We are running SQLAlchemy with a Multi-AZ Postgres RDS database on AWS and > are having trouble with database connections that hangs when failover > occurs. According to Amazon documentation for Multi-AZ RDS: > > When operations such as DB Instance scaling or system upgrades like OS >> patching are initiated for Multi-AZ deployments, for enhanced availability, >> they are applied first on the standby prior to an automatic failover. As a >> result, your availability impact is limited only to the time required for >> automatic failover to complete. > > > and > > When failing over, Amazon RDS simply flips the canonical name record >> (CNAME) for your DB Instance to point at the standby, which is in turn >> promoted to become the new primary. We encourage you to follow best >> practices and implement database connection retry at the application layer. > > > What we are wondering is what "best practices and implement database > connection retry" means when using SQLAlchemy? To test what happens during > a Mutli-AZ RDS failover, we created a new Multi-AZ Postgres instance and a > simple script that we run on a server on AWS (connecting through internal > network to the AWS RDS instance). The script queries a few rows and insert > a new row in a while-loop. > > Running this script and then rebooting the RDS (with flag "Reboot With > Failover?" ticked) *causes the script to hang* on session.commit(). When > rebooting the RDS instance **without** the "Reboot With Failover?" ticked, > the script outputs "could not connect to server: Connection refused" for a > few seconds, and then resumes as normal. > > We have tried: > 1. Using the QueuePool instead of the NullPool (we switched to NullPool > just to isolate the error). > 2. Setting socket.setdefaulttimeout() to a low value. > 3. Passing PGOPTIONS="-c statement_timeout=1000" to make the pgdriver > timeout. > 4. Using autocommmit=True on sessionmaker(). > > If we run lsof -i on the application server, we can see that the > connection closes after each iteration in the loop. When the script hangs, > we can see that there is one connection open which does not close, so the > issue seems to be a missing timeout somewhere. > > Do anyone know what could be causing this? > > > I’d note that “autocommit=True” on sessionmaker() does not change much at > all in the way the DBAPI is used. If you’re looking to use psycopg2’s > “autocommit” feature, I’d suggest experimenting first with raw psycopg2 so > that you can familiarize with how this behavior impacts what you’re doing > with AWS; the SQLAlchemy integration with this feature is documented at > http://docs.sqlalchemy.org/en/rel_0_9/dialects/postgresql.html#psycopg2-transaction-isolation-level > . > > Since you’re dealing with a system that hangs, it’s essential that you get > a handle on exactly what connectivity and transactional pattern you want to > see first at the DBAPI level, then we can talk about SQLAlchemy APIs to > make sure these patterns are used. > > > > -- > You received this message because you are subscribed to a topic in the > Google Groups "sqlalchemy" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/sqlalchemy/NANPH1QWgMM/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > sqlalchemy+unsubscr...@googlegroups.com. > To post to this group, send email to sqlalchemy@googlegroups.com. > Visit this group at http://groups.google.com/group/sqlalchemy. > For more options, visit https://groups.google.com/d/optout. > -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/d/optout.