Re: Refactoring postmaster's code to cleanup after child exit

2024-10-08 Thread Nazir Bilal Yavuz
Hi, On Tue, 8 Oct 2024 at 00:55, Heikki Linnakangas wrote: > > On 05/10/2024 22:51, Dagfinn Ilmari Mannsåker wrote: > > Heikki Linnakangas writes: > >> Sadly Windows' IO::Socket::UNIX hasn't been implemented on Windows (or > >> at least on this perl distribution we're using in Cirrus CI): > >> >

Re: Refactoring postmaster's code to cleanup after child exit

2024-10-07 Thread Heikki Linnakangas
On 05/10/2024 22:51, Dagfinn Ilmari Mannsåker wrote: Heikki Linnakangas writes: Sadly Windows' IO::Socket::UNIX hasn't been implemented on Windows (or at least on this perl distribution we're using in Cirrus CI): Socket::pack_sockaddr_un not implemented on this architecture at C:/strawberry/5.

Re: Refactoring postmaster's code to cleanup after child exit

2024-10-07 Thread Andres Freund
Hi, On 2024-10-05 20:51:50 +0100, Dagfinn Ilmari Mannsåker wrote: > Socket version 2.028 (included in Perl 5.32) provides pack_sockaddr_un() > on Windows, so that can be fixed by bumping the Perl version in > https://github.com/anarazel/pg-vm-images/blob/main/packer/windows.pkr.hcl > to something

Re: Refactoring postmaster's code to cleanup after child exit

2024-10-05 Thread Dagfinn Ilmari Mannsåker
Heikki Linnakangas writes: > On 05/10/2024 01:03, Thomas Munro wrote: > >> It's possible that Windows copied the Linux behaviour for AF_UNIX, >> given that it probably has something to do with the WSL project for >> emulating Linux, but IDK. > > Sadly Windows' IO::Socket::UNIX hasn't been impleme

Re: Refactoring postmaster's code to cleanup after child exit

2024-10-05 Thread Heikki Linnakangas
On 05/10/2024 01:03, Thomas Munro wrote: On Sat, Oct 5, 2024 at 7:41 AM Heikki Linnakangas wrote: My test for dead-end backends opens 20 TCP (or Unix domain) connections to the server, in quick succession. That works fine on my system, and it passed Cirrus CI on other platforms, but on FreeBSD it

Re: Refactoring postmaster's code to cleanup after child exit

2024-10-04 Thread Thomas Munro
On Sat, Oct 5, 2024 at 7:41 AM Heikki Linnakangas wrote: > My test for dead-end backends opens 20 TCP (or Unix domain) connections > to the server, in quick succession. That works fine on my system, and it > passed Cirrus CI on other platforms, but on FreeBSD it failed > repeatedly. The behavior in t

Re: Refactoring postmaster's code to cleanup after child exit

2024-10-04 Thread Heikki Linnakangas
On 06/09/2024 12:52, Heikki Linnakangas wrote: Unless you have comments on these first two patches which just add tests, I'll commit them shortly. Still processing the rest of your comments... Didn't happen as "shortly" as I thought.. My test for dead-end backends opens 20 TCP (or unix domain

Re: Refactoring postmaster's code to cleanup after child exit

2024-09-10 Thread Andres Freund
Hi, On 2024-09-10 13:33:36 -0400, Robert Haas wrote: > On Tue, Sep 10, 2024 at 12:59 PM Andres Freund wrote: > > I still think that we'd be better off to just return an error to the client > > in > > postmaster, rather than deal with this dead-end children mess. That was > > perhaps justified at

Re: Refactoring postmaster's code to cleanup after child exit

2024-09-10 Thread Andres Freund
Hi, On 2024-08-12 12:55:00 +0300, Heikki Linnakangas wrote: > @@ -2864,6 +2777,8 @@ PostmasterStateMachine(void) >*/ > if (pmState == PM_STOP_BACKENDS) > { > + uint32 targetMask; > + > /* >* Forget any pending requests for back

Re: Refactoring postmaster's code to cleanup after child exit

2024-09-10 Thread Robert Haas
On Tue, Sep 10, 2024 at 12:59 PM Andres Freund wrote: > I still think that we'd be better off to just return an error to the client in > postmaster, rather than deal with this dead-end children mess. That was > perhaps justified at some point, but now it seems to add way more complexity > than it'

Re: Refactoring postmaster's code to cleanup after child exit

2024-09-10 Thread Andres Freund
Hi, On 2024-09-06 16:13:43 +0300, Heikki Linnakangas wrote: > On 04/09/2024 17:35, Andres Freund wrote: > > On 2024-08-12 12:55:00 +0300, Heikki Linnakangas wrote: > > > From dc53f89edbeec99f8633def8aa5f47cd98e7a150 Mon Sep 17 00:00:00 2001 > > > From: Heikki Linnakangas > > > Date: Mon, 12 Aug

Re: Refactoring postmaster's code to cleanup after child exit

2024-09-06 Thread Robert Haas
On Fri, Sep 6, 2024 at 9:13 AM Heikki Linnakangas wrote: > It's currently possible to have up to 2 * max_connections backends in > the authentication phase. We would have to change that behaviour, or > make the PGPROC array 2x larger. I know I already said this elsewhere, but in case it got lost

Re: Refactoring postmaster's code to cleanup after child exit

2024-09-06 Thread Heikki Linnakangas
On 04/09/2024 17:35, Andres Freund wrote: On 2024-08-12 12:55:00 +0300, Heikki Linnakangas wrote: From dc53f89edbeec99f8633def8aa5f47cd98e7a150 Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Mon, 12 Aug 2024 10:59:04 +0300 Subject: [PATCH v4 4/8] Introduce a separate BackendType for d

Re: Refactoring postmaster's code to cleanup after child exit

2024-09-06 Thread Heikki Linnakangas
On 04/09/2024 17:35, Andres Freund wrote: On 2024-08-12 12:55:00 +0300, Heikki Linnakangas wrote: +Running the tests += + +NOTE: You must have given the --enable-tap-tests argument to configure. + +Run +make check +or +make installcheck +You can use "make installcheck" if

Re: Refactoring postmaster's code to cleanup after child exit

2024-09-04 Thread Andres Freund
Hi, On 2024-08-12 12:55:00 +0300, Heikki Linnakangas wrote: > While rebasing this today, I spotted another instance of that mistake > mentioned in the XXX comment above. I called "CountChildren(B_BACKEND)" > instead of "CountChildren(1 << B_BACKEND)". Some ideas on how to make that > less error-pr
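
The mistake quoted above, passing a BackendType value where a bitmask is expected, is easy to make because both are plain integers to the compiler. One way to make it harder is to wrap the mask in its own struct type, so that a bare enum value no longer compiles as a mask argument. The following is only a minimal sketch of that idea: the enum is a simplified stand-in with assumed values, and ChildTypeMask, mask_of, mask_add, mask_contains and count_children are hypothetical names for illustration, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the server's BackendType enum (values assumed). */
typedef enum
{
    B_INVALID = 0,
    B_BACKEND,
    B_BG_WORKER,
    B_WAL_SENDER
} BackendType;

/*
 * Hypothetical wrapper type. Because it is a struct, passing a bare
 * BackendType where a ChildTypeMask is expected is a compile error.
 */
typedef struct
{
    uint32_t bits;
} ChildTypeMask;

/* Build a mask containing exactly one backend type. */
static inline ChildTypeMask
mask_of(BackendType t)
{
    ChildTypeMask m;

    assert((int) t > 0 && (int) t < 32);    /* keep the shift well-defined */
    m.bits = UINT32_C(1) << t;
    return m;
}

/* Return a copy of the mask with one more backend type added. */
static inline ChildTypeMask
mask_add(ChildTypeMask m, BackendType t)
{
    m.bits |= mask_of(t).bits;
    return m;
}

/* Test whether a backend type is a member of the mask. */
static inline int
mask_contains(ChildTypeMask m, BackendType t)
{
    return (m.bits & mask_of(t).bits) != 0;
}

/* Hypothetical counter: how many entries in children[] match the mask. */
static int
count_children(const BackendType *children, int n, ChildTypeMask mask)
{
    int count = 0;

    for (int i = 0; i < n; i++)
    {
        if (mask_contains(mask, children[i]))
            count++;
    }
    return count;
}
```

With this shape, a call such as count_children(kids, n, B_BACKEND) is rejected by the compiler, while count_children(kids, n, mask_of(B_BACKEND)) is accepted.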

Re: Refactoring postmaster's code to cleanup after child exit

2024-08-18 Thread Heikki Linnakangas
On 18/08/2024 11:00, Alexander Lakhin wrote: 10.08.2024 00:13, Heikki Linnakangas wrote: Committed the patches up to and including this one, with tiny comment changes. I've noticed that on the current HEAD server.log contains binary data (an invalid process name) after a child crash. For exam

Re: Refactoring postmaster's code to cleanup after child exit

2024-08-18 Thread Alexander Lakhin
Hello Heikki, 10.08.2024 00:13, Heikki Linnakangas wrote: Committed the patches up to and including this one, with tiny comment changes. I've noticed that on the current HEAD server.log contains binary data (an invalid process name) after a child crash. For example, while playing with -ftapv,

Re: Refactoring postmaster's code to cleanup after child exit

2024-08-09 Thread Heikki Linnakangas
On 08/08/2024 13:47, Thomas Munro wrote: On Windows, if a child process exits with ERROR_WAIT_NO_CHILDREN, it's now logged with that exit code, instead of 0. Also, if a bgworker exits with ERROR_WAIT_NO_CHILDREN, it's now treated as crashed and is restarted. Previously it was

Re: Refactoring postmaster's code to cleanup after child exit

2024-08-08 Thread Thomas Munro
On Fri, Aug 2, 2024 at 11:57 AM Heikki Linnakangas wrote: > * v3-0001-Make-BackgroundWorkerList-doubly-linked.patch LGTM. > [v3-0002-Refactor-code-to-handle-death-of-a-backend-or-bgw.patch] Currently, when a child process exits, the postmaster first scans through BackgroundWorkerList, t

Re: Refactoring postmaster's code to cleanup after child exit

2024-07-29 Thread Heikki Linnakangas
On 06/07/2024 22:01, Heikki Linnakangas wrote: Reading through postmaster code, I spotted some refactoring opportunities to make it slightly more readable. Currently, when a child process exits, the postmaster first scans through BackgroundWorkerList to see if it was a bgworker process. If not

Refactoring postmaster's code to cleanup after child exit

2024-07-06 Thread Heikki Linnakangas
Reading through postmaster code, I spotted some refactoring opportunities to make it slightly more readable. Currently, when a child process exits, the postmaster first scans through BackgroundWorkerList to see if it was a bgworker process. If not found, it scans through the BackendList to see
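
The lookup order described in this message (first the background worker list, then the backend list) can be sketched roughly as follows. This is an illustration only: Node, ChildKind, find_pid and classify_exited_child are hypothetical, heavily simplified names, and the real postmaster data structures carry far more state than a pid.

```c
#include <stddef.h>

/* Hypothetical, simplified list node: one entry per child process. */
typedef struct Node
{
    int          pid;
    struct Node *next;
} Node;

typedef enum
{
    CHILD_UNKNOWN,
    CHILD_BGWORKER,
    CHILD_BACKEND
} ChildKind;

/* Linear scan of a singly linked list for a matching pid. */
static const Node *
find_pid(const Node *head, int pid)
{
    for (const Node *n = head; n != NULL; n = n->next)
    {
        if (n->pid == pid)
            return n;
    }
    return NULL;
}

/*
 * Two-step lookup as described above: check the background worker list
 * first, and only if the pid is not found there, check the backend list.
 */
static ChildKind
classify_exited_child(const Node *bgworkers, const Node *backends, int pid)
{
    if (find_pid(bgworkers, pid) != NULL)
        return CHILD_BGWORKER;
    if (find_pid(backends, pid) != NULL)
        return CHILD_BACKEND;
    return CHILD_UNKNOWN;
}
```

Every exit that is not a background worker pays for a full scan of the first list before the second is consulted, which is one reason the two separate lists make the cleanup path harder to follow.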