Looks like forking from a thread causes other issues. I think I've resolved the hang in psycopg2 by creating a brand new engine in the forked subprocess, but now I'm getting occasional hangs here:
#1 Waiting for a lock (e.g. GIL)
#2 <built-in method acquire of thread.lock object at remote 0x7fc061c17b58>
#4 file '/usr/lib64/python2.6/site-packages/sqlalchemy/util/langhelpers.py', in '_next'
#8 file '/usr/lib64/python2.6/site-packages/sqlalchemy/orm/loading.py', in 'instances'
#16 file '/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py', in '__getitem__'
#24 file '/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py', in 'first'

At this point:

    def counter():
        """Return a threadsafe counter function."""

        lock = compat.threading.Lock()
        counter = itertools.count(1)

        # avoid the 2to3 "next" transformation...
        def _next():
            lock.acquire()
            try:
                return next(counter)
            finally:
                lock.release()

        return _next

I'm not sure a process pool will solve the issue, since I can't necessarily tell the volume of concurrent queries beforehand. I'm wondering if there's a way to clear the lock manually in the forked process, or to somehow protect this section when forking. Either one is probably a monkey patch...

On Wednesday, February 17, 2016 at 10:41:00 AM UTC-8, Mike Bayer wrote:
>
> On 02/17/2016 11:33 AM, Uri Okrent wrote:
> > Maybe this is a psycopg question and if so please say so.
> >
> > I have a multi-threaded server which maintains a thread pool (and a
> > corresponding connection pool) for servicing requests. In order to
> > mitigate Python's high-water-mark memory usage behavior for large
> > queries, I'm attempting to handle queries in particular using a
> > subprocess forked from the request thread.
> >
> > I'm using the connection invalidation recipe described here (the
> > second one, which adds listeners to the Pool):
> > http://docs.sqlalchemy.org/en/latest/core/pooling.html#using-connection-pools-with-multiprocessing
> >
> > It seems to be working correctly -- that is, I can see that the child
> > process is indeed creating a new connection. However, I'm still
> > experiencing intermittent hangs in the child process during connection
> > creation.
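On the idea of protecting the counter lock across fork(): newer Pythons (3.7+, so not the 2.6 in the traces above) expose os.register_at_fork, which can hold a known lock across the fork so the child never inherits it in the acquired state. A minimal sketch of the idea -- `shared_lock` here is just a stand-in for a lock you can actually get a reference to (the langhelpers one is closure-private, so reaching it would still mean patching):

```python
import os
import threading

# Stand-in for any lock that fork() must never capture in the "held"
# state (e.g. the counter lock, or psycopg2's connect-time primitive).
shared_lock = threading.Lock()

# Hold the lock across fork(): the parent acquires it just before
# forking, so the child's inherited copy is in a known state, and each
# side releases its own copy afterward.
os.register_at_fork(
    before=shared_lock.acquire,
    after_in_parent=shared_lock.release,
    after_in_child=shared_lock.release,
)
```

On 2.6 the equivalent would be a wrapper lock of your own taken around both connection creation and the fork() call, which is the "protect this section" option rather than the "clear the lock" one.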
> > I've gotten a stack trace using gdb, and I think I understand what is
> > going on, but I'm not sure how to protect the critical section.
> >
> > It looks like threads creating connections in the parent process
> > acquire some threading synchronization primitive inside psycopg's
> > _connect function (that's in C, so I didn't see the actual source).
> > This apparently occasionally happens at the same time as the fork, so
> > that the child process never sees the primitive released in the parent
> > process and hangs forever. Interestingly, the hangs stop after the
> > server has been running for a while, presumably because the parent
> > process is warmed up, has a full connection pool, and is no longer
> > creating connections.
> >
> > Here is my stack on a hung process:
> > #17 <built-in function _connect>
> > #19 file '/usr/lib64/python2.6/site-packages/psycopg2/__init__.py', in 'connect'
> > #24 file '/usr/lib64/python2.6/site-packages/sqlalchemy/engine/default.py', in 'connect'
> > #29 file '/usr/lib64/python2.6/site-packages/sqlalchemy/engine/strategies.py', in 'connect'
> > #33 file '/usr/lib64/python2.6/site-packages/sqlalchemy/pool.py', in '__connect'
> > #36 file '/usr/lib64/python2.6/site-packages/sqlalchemy/pool.py', in 'get_connection'
> > #39 file '/usr/lib64/python2.6/site-packages/sqlalchemy/pool.py', in 'checkout'
> > #43 file '/usr/lib64/python2.6/site-packages/sqlalchemy/pool.py', in '_checkout'
> > #47 file '/usr/lib64/python2.6/site-packages/sqlalchemy/pool.py', in 'connect'
> > #50 file '/usr/lib64/python2.6/site-packages/sqlalchemy/engine/base.py', in '_wrap_pool_connect'
> > #54 file '/usr/lib64/python2.6/site-packages/sqlalchemy/engine/base.py', in 'contextual_connect'
> > #58 file '/usr/lib64/python2.6/site-packages/sqlalchemy/orm/session.py', in '_connection_for_bind'
> > #61 file '/usr/lib64/python2.6/site-packages/sqlalchemy/orm/session.py', in '_connection_for_bind'
> > #65 file '/usr/lib64/python2.6/site-packages/sqlalchemy/orm/session.py', in 'connection'
> > #70 file '/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py', in '_connection_from_session'
> > #74 file '/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py', in '_execute_and_instances'
> > #77 file '/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py', in '__iter__'
> > #91 file '/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py', in '__getitem__'
> > #99 file '/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py', in 'first'
> >
> > I'm using SQLAlchemy 1.0.12 and psycopg2 2.5.3.
> >
> > My quick and dirty fix would be to fill the connection pool in the
> > parent process by force before servicing requests,

> Well, I'd not want to transfer a psycopg2 connection from a parent to a
> child fork, because now that same file handle is in both processes and
> you'll get unsafe concurrent access on it.
>
> I've used multiprocessing with psycopg2 for years in a wide variety of
> scenarios and I've never seen it hang on the actual psycopg2.connect
> call. But perhaps that's because I've never called fork() from inside
> a thread that is not the main thread -- if that is what's triggering it
> here, I'd use a pattern such as a process pool or similar, where the
> forking is done from the main thread ahead of time.

> > but that is a hack, and in the case of an invalidated connection the
> > server would be susceptible to the issue again while recreating the
> > invalid connection in the parent process. I apparently need to
> > synchronize my fork in one thread with connections being created in
> > others, but I'm not sure how to do that. Any pointers would be great.
> >
> > TIA,
> > Uri
--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.