Sorry for reply on this topic. We recently get the same problem in our production environment.
I found a patch in other lib [1], and they added conn.close() in exception psycopg2.ProgrammingError. Shall we do the same in [2] ? [1] https://github.com/aio-libs/aiopg/pull/415/files?diff=split [2] https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/engine/base.py#L1289 在 2017年11月13日星期一 UTC+8上午10:44:31,JinRong Cai写道: > > Hi Michael , > > I am using openstack with postgresql which sqlalchemy and oslo_db module > were used. > And there are some problems after my pg database switched over. > Here is my switch over process: > 1. nova-conductor(python application) is running with DB connection > strings point to vip , which is in primary site(A) of pg. > 2. switch VIP from primary(A) to new primary(B) > 3. switch over pg: shutdown primary(A), promopt standby(B) to new primary. > 4. nova-conductor is running in the whole process. > > After some seconds, I found some nova-conductor processes are hang with > status futex_wait_queue_me, and the status of the query in DB is "idle in > transaction", the transaction was not commited or rollbacked! > I think disconnection was handled in the oslo_db, which will send a > ping(select 1) to DB. > > If DB was switchd over, the connection in the pool would be set with > status invalid, and reconnect after next check out. > > ###error messages from nova-conductor > localhost nova-conductor ERROR [pid:36365] [MainThread] [tid:122397712] > [exc_filters.py:330 _raise_for_remaining_DBAPIError] > [req-2bd8a290-e17b-4178-80a6-4b36d5793d85] DBAPIError exception wrapped > from (psycopg2.ProgrammingError) execute cannot be used while an > asynchronous query is underway [SQL: 'SELECT 1'] > 36365 ERROR oslo_db.sqlalchemy.exc_filters Traceback (most recent call > last): > 36365 ERROR oslo_db.sqlalchemy.exc_filters File > "/python2.7/site-packages/sqlalchemy/engine/base.py", line 1139, in > _execute_context > 36365 ERROR oslo_db.sqlalchemy.exc_filters context) > 36365 ERROR oslo_db.sqlalchemy.exc_filters File > "/python2.7/site-packages/sqlalchemy/engine/default.py", line 450, in > do_execute > 36365 ERROR oslo_db.sqlalchemy.exc_filters cursor.execute(statement, > parameters) > 36365 ERROR oslo_db.sqlalchemy.exc_filters ProgrammingError: execute > cannot be used while an asynchronous query is underway > 36365 ERROR oslo_db.sqlalchemy.exc_filters > localhost nova-conductor ERROR [pid:36365] [MainThread] [tid:122397712] > [log.py:122 error] [req-2bd8a290-e17b-4178-80a6-4b36d5793d85] Error closing > cursor > 36365 ERROR sqlalchemy.pool.QueuePool Traceback (most recent call last): > 36365 ERROR sqlalchemy.pool.QueuePool File > "/python2.7/site-packages/sqlalchemy/engine/base.py", line 1226, in > _safe_close_cursor > 36365 ERROR sqlalchemy.pool.QueuePool cursor.close() > 36365 ERROR sqlalchemy.pool.QueuePool ProgrammingError: close cannot be > used while an asynchronous query is underway > 36365 ERROR sqlalchemy.pool.QueuePool > > ###ps status of nova-conductor > POD6-Mongodb03:/var/log/uvp-getosstat/statistics20171106101500log # cat > /proc/33316/stack > [<ffffffff810e4c24>] futex_wait_queue_me+0xc4/0x120 > [<ffffffff810e5799>] futex_wait+0x179/0x280 > [<ffffffff810e782e>] do_futex+0xfe/0x5b0 > [<ffffffff810e7d60>] SyS_futex+0x80/0x180 > [<ffffffff81654e09>] system_call_fastpath+0x16/0x1b > [<ffffffffffffffff>] 0xffffffffffffffff > > ### stack of the nova-conductor process > POD6-Mongodb03:/tmp # pstack 33316 > #0 0x00002b8449e35f4d in __lll_lock_wait () from /lib64/libpthread.so.0 > #1 0x00002b8449e31d02 in _L_lock_791 () from /lib64/libpthread.so.0 > #2 0x00002b8449e31c08 in pthread_mutex_lock () from /lib64/libpthread.so.0 > #3 0x00002b84554c44ab in pq_abort () from > /python2.7/site-packages/psycopg2/_psycopg.so > #4 0x00002b84554c955e in psyco_conn_rollback () from > /python2.7/site-packages/psycopg2/_psycopg.so > #5 0x00002b8449b42b50 in PyEval_EvalFrameEx () from > /lib64/libpython2.7.so.1.0 > #6 0x00002b8449b42ad0 in PyEval_EvalFrameEx () from > /lib64/libpython2.7.so.1.0 > The psycopg2 was trying to close the cursor, and try to get the mutex lock > "pthread_mutex_lock", but it seems that the cursor was used by other > session. > > > Questions: > > 1. What the error "ProgrammingError: close cannot be used while an > asynchronous query is underway" mean? > AFAIK, these caused by psycopg2, which means a asynchronous query was > executed in one connection. > But the I think the sqlalchemy was thread safe since it was patched by > eventlet, see details in eventlet/support/psycopg2_patcher.py > And we can see different green thread number in the log, as: > [pid:36365] [MainThread] [tid:122397712] > [pid:36365] [MainThread] [tid:122397815] > So, I guess the connection pool in one process is safe. > > 2. The nova-conductor was a multi-thread python client, which forked > several child process. > ps -elf|grep -i nova-conductor > 30878 1 pool_s /usr/bin/nova-conductor > 36364 1 ep_pol /usr/bin/nova-conductor > 36365 1 futex_ /usr/bin/nova-conductor > 36366 1 ep_pol /usr/bin/nova-conductor > 36367 1 ep_pol /usr/bin/nova-conductor > 36368 1 ep_pol /usr/bin/nova-conductor > > If the nova-conductor was started with only one child, the problem was not > happen. > Does this mean the connection/engine CAN NOT shared in these child > processes? > > Thanks. > -- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper http://www.sqlalchemy.org/ To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description. --- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at https://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/d/optout.