Hi Support,

   I have been using pgagent for over 2 years. More than 30 jobs are run by pgagent.
Recently I found that pgagent sometimes hangs for unknown reasons.
   From the stack information, it looks like pgagent is hitting a deadlock in
its code.
   The stack shows many threads stopped in the function "__lll_lock_wait".
   If you need more information, please let me know. I suspect this is a bug.

I have collected the pgagent trace logs and the stack information in the attachment, which contains:

pgagent trace logs
pg_agent_11_24.log
pg_agent_11_26.log
pgagent process stack
other information



version:
pgagent_10-3.4.0-10.rhel6.x86_64
PG 10.5

Typical stack information:

[postgres@sltfjfrauxq pgagent_pd]$ cat 23389.stark.1
Thread 7 (Thread 0x7ff745f5c700 (LWP 906)):
#0  0x00007ff74b003334 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007ff74affe5d8 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00007ff74affe4a7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007ff74ba979c9 in wxMutexInternal::Lock() () from /usr/lib64/libwx_baseu-2.8.so.0
#4  0x00007ff74c15b819 in DBconn::Return() ()
#5  0x00007ff74c161217 in Job::Execute() ()
#6  0x00007ff74c162899 in JobThread::Entry() ()
#7  0x00007ff74ba99021 in wxThreadInternal::PthreadStart(wxThread*) () from /usr/lib64/libwx_baseu-2.8.so.0
#8  0x00007ff74affcaa1 in start_thread () from /lib64/libpthread.so.0
#9  0x00007ff74ad49c4d in clone () from /lib64/libc.so.6
Thread 6 (Thread 0x7ff72ffff700 (LWP 908)):
#0  0x00007ff74b003334 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007ff74affe5d8 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00007ff74affe4a7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007ff74ba979c9 in wxMutexInternal::Lock() () from /usr/lib64/libwx_baseu-2.8.so.0
#4  0x00007ff74c15b819 in DBconn::Return() ()
#5  0x00007ff74c161217 in Job::Execute() ()
#6  0x00007ff74c162899 in JobThread::Entry() ()
#7  0x00007ff74ba99021 in wxThreadInternal::PthreadStart(wxThread*) () from /usr/lib64/libwx_baseu-2.8.so.0
#8  0x00007ff74affcaa1 in start_thread () from /lib64/libpthread.so.0
#9  0x00007ff74ad49c4d in clone () from /lib64/libc.so.6
Thread 5 (Thread 0x7ff74695d700 (LWP 910)):
#0  0x00007ff74b003334 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007ff74affe5d8 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00007ff74affe4a7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007ff74ba979c9 in wxMutexInternal::Lock() () from /usr/lib64/libwx_baseu-2.8.so.0
#4  0x00007ff74c15b819 in DBconn::Return() ()
#5  0x00007ff74c161217 in Job::Execute() ()
#6  0x00007ff74c162899 in JobThread::Entry() ()
#7  0x00007ff74ba99021 in wxThreadInternal::PthreadStart(wxThread*) () from /usr/lib64/libwx_baseu-2.8.so.0
#8  0x00007ff74affcaa1 in start_thread () from /lib64/libpthread.so.0
#9  0x00007ff74ad49c4d in clone () from /lib64/libc.so.6
Thread 4 (Thread 0x7ff74735e700 (LWP 1565)):
#0  0x00007ff74b003334 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007ff74affe5d8 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00007ff74affe4a7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007ff74ba979c9 in wxMutexInternal::Lock() () from /usr/lib64/libwx_baseu-2.8.so.0
#4  0x00007ff74c15b819 in DBconn::Return() ()
#5  0x00007ff74c161217 in Job::Execute() ()
#6  0x00007ff74c162899 in JobThread::Entry() ()
#7  0x00007ff74ba99021 in wxThreadInternal::PthreadStart(wxThread*) () from /usr/lib64/libwx_baseu-2.8.so.0
#8  0x00007ff74affcaa1 in start_thread () from /lib64/libpthread.so.0
#9  0x00007ff74ad49c4d in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x7ff74555b700 (LWP 1567)):
#0  0x00007ff74ad40403 in poll () from /lib64/libc.so.6
#1  0x00007ff74bd1c28f in ?? () from /usr/lib64/libpq.so.5
#2  0x00007ff74bd1c310 in ?? () from /usr/lib64/libpq.so.5
#3  0x00007ff74bd178e2 in ?? () from /usr/lib64/libpq.so.5
#4  0x00007ff74bd1865f in PQconnectdb () from /usr/lib64/libpq.so.5
#5  0x00007ff74c15ad71 in DBconn::Connect(wxString const&) ()
#6  0x00007ff74c15af73 in DBconn::DBconn(wxString const&, wxString const&) ()
#7  0x00007ff74c15bfe8 in DBconn::Get(wxString const&, wxString const&) ()
#8  0x00007ff74c16108f in Job::Execute() ()
#9  0x00007ff74c162899 in JobThread::Entry() ()
#10 0x00007ff74ba99021 in wxThreadInternal::PthreadStart(wxThread*) () from /usr/lib64/libwx_baseu-2.8.so.0
#11 0x00007ff74affcaa1 in start_thread () from /lib64/libpthread.so.0
#12 0x00007ff74ad49c4d in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x7ff744b5a700 (LWP 1569)):
#0  0x00007ff74b003334 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007ff74affe5d8 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00007ff74affe4a7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007ff74ba979c9 in wxMutexInternal::Lock() () from /usr/lib64/libwx_baseu-2.8.so.0
#4  0x00007ff74c15bf6b in DBconn::Get(wxString const&, wxString const&) ()
#5  0x00007ff74c16108f in Job::Execute() ()
#6  0x00007ff74c162899 in JobThread::Entry() ()
#7  0x00007ff74ba99021 in wxThreadInternal::PthreadStart(wxThread*) () from /usr/lib64/libwx_baseu-2.8.so.0
#8  0x00007ff74affcaa1 in start_thread () from /lib64/libpthread.so.0
#9  0x00007ff74ad49c4d in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7ff74c3507e0 (LWP 23389)):
#0  0x00007ff74b003334 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007ff74affe5d8 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00007ff74affe4a7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007ff74ba979c9 in wxMutexInternal::Lock() () from /usr/lib64/libwx_baseu-2.8.so.0
#4  0x00007ff74c15a99d in DBconn::ClearConnections(bool) ()
#5  0x00007ff74c15e908 in MainRestartLoop(DBconn*) ()
#6  0x00007ff74c15f2a3 in MainLoop() ()
#7  0x00007ff74c15e016 in main ()
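
To make the dump above easier to read, here is a rough sketch of a script that groups the threads by the first frame that does not come from a shared library. It is only an illustration: the names summarize_stack and blocked_on_mutex are hypothetical helpers, not part of pgagent or gdb, and the rule "a frame without 'from /...' belongs to pgagent" is an assumption based on how this particular dump is printed.

```python
import re
from collections import Counter

THREAD_RE = re.compile(r"^Thread (\d+) \(")
# Frame lines look like "#4  0x... in DBconn::Return() ()"; capture the function name.
FRAME_RE = re.compile(r"^#\d+\s+(?:0x[0-9a-f]+ in )?(\S+?) ?\(")


def summarize_stack(text):
    """Return {thread_number: {"top": innermost frame, "app": first non-library frame}}."""
    # Rejoin frames that were wrapped after "from" (as happens in mail clients).
    text = re.sub(r" from\s*\n(?=/)", " from ", text)
    threads = {}
    current = None
    for line in text.splitlines():
        m = THREAD_RE.match(line)
        if m:
            current = int(m.group(1))
            threads[current] = {"top": None, "app": None}
            continue
        m = FRAME_RE.match(line)
        if not m or current is None:
            continue
        func = m.group(1)
        info = threads[current]
        if info["top"] is None:
            info["top"] = func
        # Assumption: frames printed without "from /path" are in the pgagent binary.
        if info["app"] is None and " from /" not in line and func != "??":
            info["app"] = func
    return threads


def blocked_on_mutex(threads):
    """Count threads whose innermost frame is __lll_lock_wait, keyed by pgagent frame."""
    return Counter(
        info["app"] for info in threads.values() if info["top"] == "__lll_lock_wait"
    )


if __name__ == "__main__":
    import sys

    summary = summarize_stack(sys.stdin.read())
    for func, count in blocked_on_mutex(summary).most_common():
        print(f"{count} thread(s) waiting for a mutex in {func}")
```

Run as, for example, `python summarize.py < 23389.stark.1` (the script file name is arbitrary). On the dump above it should report four threads waiting in DBconn::Return, one in DBconn::Get and one in DBconn::ClearConnections, while Thread 3 is not waiting on a mutex but is inside PQconnectdb, called from DBconn::Get. Whether Thread 3 is the one holding the mutex cannot be confirmed from the stack alone.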


徐志宇(Jack)
Database Engineer

DB Team,ITS. Lenovo China
Phone: 86-18910860709
Email:xuz...@lenovo.com
No.6 Shangdi West Road, Haidian District Beijing, China, 100085

Attachment: pgagent_pd.tar.gz
Description: pgagent_pd.tar.gz
