Ok, so we had a retry decorator around a group of ORM calls. This seems to be the issue. If we wrap each call instead, things are much better.
I wonder if the transaction rollback is causing issues. Karim On Monday, 2 March 2020 10:28:41 UTC-8, Karim Gillani wrote: > > Sadly we are still working on the ORM calls for the connect timeout. > Basic calls are working, of course ours are lot more complex so trying to > get one of the complex ones in the test script. > > At the same time, I am trying the pg8000 driver instead of psycopg2. That > driver seems to be much more promising. > > Very strange and the mystery continues.. > > Karim > > On Thursday, 27 February 2020 14:30:32 UTC-8, Karim Gillani wrote: >> >> Latest update: >> >> 1. Psycopyg2 works with connect_timeout - Direct SQL call >> 2. sqlalchemy works with connect_timeout (Direct SQL call no ORM) >> 3. Flask-sqlalchemy works with connect_timeout (Direct SQL call, no ORM) >> >> Next up on our troubleshooting tree.. >> >> Test sqlalchemy with ORM calls with Connect_timeout.. >> >> Weird... >> >> If anyone wants to look at our code to see what our app is doing: >> https://github.com/bcgov/queue-management/api >> >> Karim >> >> >> On Thursday, 27 February 2020 10:24:25 UTC-8, Mike Bayer wrote: >>> >>> OK so that script I gave you, the exact SQLAlchemy equivalent is: >>> >>> >>> from sqlalchemy import create_engine >>> >>> >>> e = create_engine("postgresql+psycopg2://scott:tiger@localhost/test", >>> connect_args={"connect_timeout": 2}) >>> c = e.connect() >>> >>> >>> it also will accept it as: >>> >>> e = >>> create_engine("postgresql+psycopg2://scott:tiger@localhost/test?connect_timeout=2") >>> >>> however, in the above case the timeout is passed as a string "2" and not >>> an integer, which may interfere with it working, im not sure. the >>> separate connect_args approach is safer. >>> >>> >>> >>> On Thu, Feb 27, 2020, at 12:57 PM, Karim Gillani wrote: >>> >>> Thank you for the great idea. I did not see the forest through the >>> trees. Haha. >>> >>> We have the pscopyg2 script running and things seems to be working fine >>> with that. We can see without the connect_timeout that the delay is about >>> 2 minutes and with the connect_timeout set, the delay is only 2 seconds. >>> >>> Next, we are going to wrap the script (if possible) with Alchemy only >>> and test. >>> Then after that, we will wrap flask-alchemy. >>> >>> Thanks >>> Karim >>> >>> >>> >>> On Thursday, 27 February 2020 06:48:00 UTC-8, Mike Bayer wrote: >>> >>> Can you write a script that uses psycopg2 directly and see if you can >>> replicate the issue in that way? this would elimiante SQLAlchemy as part >>> of the issue and you can interact with the psycopg2 developers directly: >>> >>> import psycopg2 >>> >>> connection = psycopg2.connect( >>> user="scott", >>> password="tiger", >>> host="localhost", >>> database="test", >>> connect_timeout=2, >>> ) >>> >>> >>> There's no need to do pdb inside of psycopg2 itself. Within SQLAlchemy >>> you could do this at lib/sqlalchemy/dialects/postgresql/psycopg2.py however >>> only if using raw psycopg2 above proves that it solves your connection >>> issue. >>> >>> >>> On Wed, Feb 26, 2020, at 11:19 AM, Karim Gillani wrote: >>> >>> Thanks. I don't believe this is a Red Hat issue as much as a >>> configuration issue. This is because I can replicate the issue using PSQL >>> and using the connect_timeout argument in PSQL, I can fix it. With the >>> complexity of using flask-sqlalchemy which uses sqlalchemy to pass >>> arguments to psycopg2 to libpq, it is difficult to see where the issue >>> actually is. Based on the links you provided, I am passing the >>> connect_timeout parameter correctly. I am not sure what to do now. I am >>> guessing I will need to somehow figure out if the parameter is being >>> passed. I think I will play with getting connection.get_dsn_parameters() >>> function to display the parameters. >>> >>> Karim >>> >>> On Wed, 26 Feb 2020 at 07:30, Mike Bayer <mik...@zzzcomputing.com> >>> wrote: >>> >>> >>> This would be a psycopg2 level setting which are documented at >>> https://www.psycopg.org/docs/module.html#psycopg2.connect and the >>> available values are ultimately at >>> https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-PARAMKEYWORDS >>> >>> . On the SQLAlchemy side you pass these to create_engine using >>> connect_args: >>> https://docs.sqlalchemy.org/en/13/core/engines.html?highlight=connect_args#custom-dbapi-args >>> >>> >>> >>> On Wed, Feb 26, 2020, at 12:11 AM, Karim Gillani wrote: >>> >>> We are using openshift which uses HAProxy. What I am looking for is why >>> the connect_timeout may not being applied so that we can retry the call >>> after re-connect to the other database pod. >>> >>> It is taking *up to two minutes* before I get >>> the psycopg2.OperationalError. It's very strange. The code will >>> automatically retry and work after this error shows up. I am guessing but >>> I don't think my setup is passing the connect_timeout to libpq. >>> >>> Thanks >>> >>> On Tuesday, 25 February 2020 10:37:19 UTC-8, Mike Bayer wrote: >>> >>> "could not connect" means your server is unreachable from a network >>> standpoint. You want to make sure the server is running and that it is >>> routable over the network you are trying to connect from. >>> >>> in an HA scenario, depending on what software is in use and how it does >>> switchover, there can be long network lags, such as if it is moving a VIP >>> from one machine to another. You'd want to be using a proxy server such as >>> HAProxy or PGBouncer rather than relying upon changing network routing. >>> >>> >>> >>> On Tue, Feb 25, 2020, at 12:07 PM, Karim Gillani wrote: >>> >>> Please note I am using a Pool. And some other connections seem to >>> recover fine. After two minutes I get this message and things start >>> working again: >>> >>> ==> AE K:11oh392; T:1; E:(psycopg2.OperationalError) could not connect >>> to server: Connection timed out> Is the server running on host >>> "patroni-mastertheq" (172.50.46.180) and accepting> TCP/IP connections on >>> port 5432?>>(Background on this error at: http://sqlalche.me/e/e3q8 ) >>> Thanks >>> Karim >>> >>> On Tuesday, 25 February 2020 08:49:19 UTC-8, Karim Gillani wrote: >>> >>> I am having an issue where when my HA postgres database drops and is >>> automatically moved to a backup database, the timeout doesn't seem to be >>> working. >>> >>> My config.py shows the following: >>> >>> SQLALCHEMY_DATABASE_URI = >>> '{engine}://{user}:{password}@{host}:{port}/{name}'.format( >>> >>> engine=DB_ENGINE, >>> >>> user=DB_USER, >>> >>> password=DB_PASSWORD, >>> >>> host=DB_HOST, >>> >>> port=DB_PORT, >>> >>> name=DB_NAME, >>> >>> ) >>> SQLALCHEMY_ENGINE_OPTIONS >>> = { 'pool_size' : pool_size, 'max_overflow' : max_overflow, >>> 'pool_pre_ping' : True, 'pool_timeout': DB_POOL_TIMEOUT, 'pool_recycle': >>> 3600, 'connect_args': { 'connect_timeout': 5, 'options' : '-c >>> statement_timeout=5000' } } >>> >>> Is there something I set wrong? My timeout seems to be about 2 minutes >>> instead of 5 secs. >>> >>> Any help would be appreciated, >>> >>> Thanks >>> Karim >>> >>> >>> -- >>> SQLAlchemy - >>> The Python SQL Toolkit and Object Relational Mapper >>> >>> http://www.sqlalchemy.org/ >>> >>> To post example code, please provide an MCVE: Minimal, Complete, and >>> Verifiable Example. See http://stackoverflow.com/help/mcve for a full >>> description. >>> --- >>> You received this message because you are subscribed to the Google >>> Groups "sqlalchemy" group. >>> To unsubscribe from this group and stop receiving emails from it, send >>> an email to sqlal...@googlegroups.com. >>> To view this discussion on the web visit >>> https://groups.google.com/d/msgid/sqlalchemy/ab3e5681-d9be-468f-8160-97bddcca57fb%40googlegroups.com >>> >>> <https://groups.google.com/d/msgid/sqlalchemy/ab3e5681-d9be-468f-8160-97bddcca57fb%40googlegroups.com?utm_medium=email&utm_source=footer> >>> . >>> >>> >>> >>> -- >>> SQLAlchemy - >>> The Python SQL Toolkit and Object Relational Mapper >>> >>> http://www.sqlalchemy.org/ >>> >>> To post example code, please provide an MCVE: Minimal, Complete, and >>> Verifiable Example. See http://stackoverflow.com/help/mcve for a full >>> description. >>> --- >>> You received this message because you are subscribed to the Google >>> Groups "sqlalchemy" group. >>> To unsubscribe from this group and stop receiving emails from it, send >>> an email to sqlal...@googlegroups.com. >>> To view this discussion on the web visit >>> https://groups.google.com/d/msgid/sqlalchemy/06c8e91e-4f36-4192-90d0-acbe840478ef%40googlegroups.com >>> >>> <https://groups.google.com/d/msgid/sqlalchemy/06c8e91e-4f36-4192-90d0-acbe840478ef%40googlegroups.com?utm_medium=email&utm_source=footer> >>> . >>> >>> >>> >>> -- >>> SQLAlchemy - >>> The Python SQL Toolkit and Object Relational Mapper >>> >>> http://www.sqlalchemy.org/ >>> >>> To post example code, please provide an MCVE: Minimal, Complete, and >>> Verifiable Example. See http://stackoverflow.com/help/mcve for a full >>> description. >>> --- >>> You received this message because you are subscribed to the Google >>> Groups "sqlalchemy" group. >>> To unsubscribe from this group and stop receiving emails from it, send >>> an email to sqlal...@googlegroups.com. >>> To view this discussion on the web visit >>> https://groups.google.com/d/msgid/sqlalchemy/82280db0-b63e-4e31-9657-66eb8bcd79b7%40www.fastmail.com >>> >>> <https://groups.google.com/d/msgid/sqlalchemy/82280db0-b63e-4e31-9657-66eb8bcd79b7%40www.fastmail.com?utm_medium=email&utm_source=footer> >>> . >>> >>> >>> -- >>> SQLAlchemy - >>> The Python SQL Toolkit and Object Relational Mapper >>> >>> http://www.sqlalchemy.org/ >>> >>> To post example code, please provide an MCVE: Minimal, Complete, and >>> Verifiable Example. See http://stackoverflow.com/help/mcve for a full >>> description. >>> --- >>> You received this message because you are subscribed to the Google >>> Groups "sqlalchemy" group. >>> To unsubscribe from this group and stop receiving emails from it, send >>> an email to sqlal...@googlegroups.com. >>> To view this discussion on the web visit >>> https://groups.google.com/d/msgid/sqlalchemy/CAOpL_VFw3U7vZhiLnQ8gB7-zUe%3Dc14QU019w9bQ25B4jPs6MvA%40mail.gmail.com >>> >>> <https://groups.google.com/d/msgid/sqlalchemy/CAOpL_VFw3U7vZhiLnQ8gB7-zUe%3Dc14QU019w9bQ25B4jPs6MvA%40mail.gmail.com?utm_medium=email&utm_source=footer> >>> . >>> >>> >>> >>> -- >>> SQLAlchemy - >>> The Python SQL Toolkit and Object Relational Mapper >>> >>> http://www.sqlalchemy.org/ >>> >>> To post example code, please provide an MCVE: Minimal, Complete, and >>> Verifiable Example. See http://stackoverflow.com/help/mcve for a full >>> description. >>> --- >>> You received this message because you are subscribed to the Google >>> Groups "sqlalchemy" group. >>> To unsubscribe from this group and stop receiving emails from it, send >>> an email to sqlal...@googlegroups.com. >>> To view this discussion on the web visit >>> https://groups.google.com/d/msgid/sqlalchemy/c9d76a8f-a4cb-4219-8994-9594b25a0084%40googlegroups.com >>> >>> <https://groups.google.com/d/msgid/sqlalchemy/c9d76a8f-a4cb-4219-8994-9594b25a0084%40googlegroups.com?utm_medium=email&utm_source=footer> >>> . >>> >>> >>> -- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper http://www.sqlalchemy.org/ To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description. --- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/sqlalchemy/d7a59c9f-28aa-4707-91e7-3b8fd35cb00f%40googlegroups.com.