Re: Pg stuck at 100% cpu, for multiple days

2021-08-31 Thread hubert depesz lubaczewski
On Tue, Aug 31, 2021 at 04:00:14PM +0530, Amit Kapila wrote:
> One possibility could be there are quite a few DDLs happening in this
> application at some particular point in time which can lead to high
While not impossible, I'd rather say it's not very likely. We don't use temporary tables, and

Re: Pg stuck at 100% cpu, for multiple days

2021-08-31 Thread Amit Kapila
On Tue, Aug 31, 2021 at 11:41 AM hubert depesz lubaczewski wrote:
>
> On Mon, Aug 30, 2021 at 09:09:20PM +0200, Laurenz Albe wrote:
> > On Mon, 2021-08-30 at 17:18 +0200, hubert depesz lubaczewski wrote:
> > > The thing is - I can't close it with pg_terminate_backend(), and I'd
> > > rather not

Re: Pg stuck at 100% cpu, for multiple days

2021-08-31 Thread hubert depesz lubaczewski
On Mon, Aug 30, 2021 at 08:15:24PM -0400, Joe Conway wrote:
> It would be interesting to step through a few times to see if it is really
> stuck in that loop. That would be consistent with 100% CPU and not checking
> for interrupts I think.
If the problem will happen again, will do my best to get

Re: Pg stuck at 100% cpu, for multiple days

2021-08-31 Thread hubert depesz lubaczewski
On Mon, Aug 30, 2021 at 09:09:20PM +0200, Laurenz Albe wrote:
> On Mon, 2021-08-30 at 17:18 +0200, hubert depesz lubaczewski wrote:
> > The thing is - I can't close it with pg_terminate_backend(), and I'd
> > rather not kill -9, as it will, I think, close all other connections,
> > and this is

Re: Pg stuck at 100% cpu, for multiple days

2021-08-30 Thread Michael Paquier
On Mon, Aug 30, 2021 at 09:16:51PM -0400, Joe Conway wrote:
> On 8/30/21 8:22 PM, Tom Lane wrote:
>> Yeah, this single data point is not enough justification to blame
>> dynahash.c (which is *extremely* battle-tested code, you'll recall).
>> I'm inclined to guess that the looping is happening a

Re: Pg stuck at 100% cpu, for multiple days

2021-08-30 Thread Joe Conway
On 8/30/21 8:22 PM, Tom Lane wrote:
> Joe Conway writes:
>> It would be interesting to step through a few times to see if it is
>> really stuck in that loop.
> Yeah, this single data point is not enough justification to blame
> dynahash.c (which is *extremely* battle-tested code, you'll recall).
> I'm

Re: Pg stuck at 100% cpu, for multiple days

2021-08-30 Thread Tom Lane
Joe Conway writes:
> It would be interesting to step through a few times to see if it is
> really stuck in that loop.
Yeah, this single data point is not enough justification to blame dynahash.c (which is *extremely* battle-tested code, you'll recall). I'm inclined to guess that the looping is

Re: Pg stuck at 100% cpu, for multiple days

2021-08-30 Thread Joe Conway
On 8/30/21 3:34 PM, Justin Pryzby wrote:
> On Mon, Aug 30, 2021 at 09:09:20PM +0200, Laurenz Albe wrote:
>> On Mon, 2021-08-30 at 17:18 +0200, hubert depesz lubaczewski wrote:
>>> The thing is - I can't close it with pg_terminate_backend(), and I'd
>>> rather not kill -9, as it will, I think, close all

Re: Pg stuck at 100% cpu, for multiple days

2021-08-30 Thread Justin Pryzby
On Mon, Aug 30, 2021 at 09:09:20PM +0200, Laurenz Albe wrote:
> On Mon, 2021-08-30 at 17:18 +0200, hubert depesz lubaczewski wrote:
> > The thing is - I can't close it with pg_terminate_backend(), and I'd
> > rather not kill -9, as it will, I think, close all other connections,
> > and this is

Re: Pg stuck at 100% cpu, for multiple days

2021-08-30 Thread Laurenz Albe
On Mon, 2021-08-30 at 17:18 +0200, hubert depesz lubaczewski wrote:
> The thing is - I can't close it with pg_terminate_backend(), and I'd
> rather not kill -9, as it will, I think, close all other connections,
> and this is prod server.
Of course the cause should be fixed, but to serve your

Pg stuck at 100% cpu, for multiple days

2021-08-30 Thread hubert depesz lubaczewski
Hi, Originally I posted it on -general, but Joe Conway suggested I repost in here for greater visibility... We hit a problem with Pg 12.6 (I know, we should upgrade, but that will take a long time to prepare). Anyway - it's 12.6 on aarch64. A couple of days ago a replication slot was started, and