Thanks for your reply. I'll do this the next time this happens (which will likely be within a few days based on history).
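For the archives, here is a rough sketch of how I might collect the backtraces non-interactively rather than typing bt/cont by hand. It is a variation on the interactive session suggested below, not the same procedure: it re-attaches gdb several times using gdb's -p, -batch and -ex options, and each batch run detaches on exit so the process keeps running between samples. The function name sample_backtraces, the DRY_RUN switch, and the default count and interval are my own placeholders, not anything from pgpool.

```shell
#!/bin/sh
# Sketch only: sample stack traces from a spinning process by attaching gdb
# repeatedly in batch mode. Must be run as the process owner or as root.
#
# Usage: sample_backtraces PID [COUNT] [INTERVAL_SECONDS]
# Set DRY_RUN=1 to print the gdb commands instead of running them
# (DRY_RUN is a hypothetical switch added for this sketch).
sample_backtraces() {
    pid=$1
    count=${2:-5}      # number of samples to take (assumed default)
    interval=${3:-2}   # seconds to wait between samples (assumed default)
    i=1
    while [ "$i" -le "$count" ]; do
        echo "=== sample $i of $count for PID $pid ==="
        if [ -n "${DRY_RUN:-}" ]; then
            echo "gdb -p $pid -batch -ex bt"
        else
            # -p attaches to the running process, -ex bt prints a backtrace,
            # -batch makes gdb exit afterwards, which detaches from the process.
            gdb -p "$pid" -batch -ex bt 2>&1
        fi
        i=$((i + 1))
        sleep "$interval"
    done
}
```

With the PID from the top output below, something like `sample_backtraces 29191 5 2 > pgpool-bt.txt` should give a handful of backtraces to compare; frames that show up in every sample are the likely location of the loop.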

On Wed, Sep 14, 2011 at 3:57 PM, Tatsuo Ishii <is...@sraoss.co.jp> wrote:
> Please use gdb. For example,
>
> become postgres user (or root user)
> gdb pgpool 29191
> bt
> cont
> bt
> cont
> :
> :
> :
>
> This will give us an idea where it's looping.
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
>
>> This problem has returned yet again:
>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>> 29191 postgres  20   0 80192  14m 1544 R 89.8  0.2  51:15.91 pgpool
>>
>> postgres 29191  3.4  0.1  80192 14728 ?        R    Sep13  51:40
>> pgpool: lfriedman nightly 10.31.96.84(61698) idle
>>
>>
>> I'd really appreciate some input on how to debug this.
>>
>>
>> On Fri, Sep 9, 2011 at 8:11 AM, Lonni J Friedman <netll...@gmail.com> wrote:
>>> No one else has experienced this or has suggestions how to debug it?
>>>
>>> On Wed, Sep 7, 2011 at 12:49 PM, Lonni J Friedman <netll...@gmail.com>
>>> wrote:
>>>> Greetings,
>>>> I'm running pgpool-3.0.4 on a Linux-x86_64 server serving as a load
>>>> balancer for a three server postgresql-9.0.4 cluster (1 master, 2
>>>> standby). I'm seeing strange behavior where a single pgpool process
>>>> seems to hang after some period of time, and then consume 100% of the
>>>> CPU. I've seen this behavior happen twice since last Friday (when
>>>> pgpool was brought online in my production environment). At the
>>>> moment the current hung process looks like this in 'ps auxww' output:
>>>>
>>>> postgres 19838 98.7  0.0  68856  2904 ?        R    Sep06 1027:36
>>>> pgpool: lfriedman nightly 10.31.45.20(58277) idle
>>>>
>>>>
>>>> In top, I see:
>>>>   PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
>>>> 19838 postgres  20   0 68856 2904 1072 R 100.0  0.0  1027:29 pgpool
>>>>
>>>>
>>>> When I connect to the process with strace, there is no output, so I'm
>>>> guessing the process is stuck spinning somewhere:
>>>> # strace -p 19838
>>>> Process 19838 attached - interrupt to quit
>>>> ...
>>>> ^CProcess 19838 detached
>>>>
>>>> One thing that I'm certain of is that the client IP (10.31.45.20)
>>>> associated with the hung process has rebooted at least once since that
>>>> process was spawned. So pgpool seems to be in some confused state, as
>>>> the client definitely severed the connection already. I checked the
>>>> pgpool log and there are no explicit references to PID 19838. I'm at
>>>> a loss how to debug this further, but clearly something is wrong
>>>> somewhere, and this isn't normal/expected behavior.
_______________________________________________
Pgpool-general mailing list
Pgpool-general@pgfoundry.org
http://pgfoundry.org/mailman/listinfo/pgpool-general