Re: Flush SLRU counters in checkpointer process
Hi, > I think I've managed to reproduce the issue. The test I've added to check > slru flush was the one failing in the regression suite. A consensus was reached [1] to mark this patch as RwF for now. There are many patches to be reviewed and this one doesn't seem to be in the best shape, so we have to prioritise. Please feel free re-submitting the patch for the next commitfest. [1]: https://postgr.es/m/0737f444-59bb-ac1d-2753-873c40da0840%40eisentraut.org -- Best regards, Aleksander Alekseev
Re: Flush SLRU counters in checkpointer process
I think I've managed to reproduce the issue. The test I've added to check slru flush was the one failing in the regression suite. SELECT SUM(flushes) > :slru_flushes_before FROM pg_stat_slru; ?column? -- t The origin seems to be a race condition on have_slrustats ( https://github.com/postgres/postgres/blob/c8e1ba736b2b9e8c98d37a5b77c4ed31baf94147/src/backend/utils/activity/pgstat_slru.c#L161-L162 ). I will try to get a new patch with improved test stability. On Mon, Jul 3, 2023 at 3:18 PM Daniel Gustafsson wrote: > > On 3 Mar 2023, at 09:06, Anthonin Bonnefoy < > anthonin.bonne...@datadoghq.com> wrote: > > > > Here's the patch rebased with Andres' suggestions. > > Happy to update it if there's any additionalj change required. > > This patch crashes 031_recovery_conflict with a SIGInvalid on Windows, can > you > please investigate and see what might be going on there? The test passed > about > 4 days ago on Windows so unless it's the CI being flaky it should be due > to a > recent change. > > If you don't have access to a Windows environment you can run your own > instrumented builds in your Github account with the CI files in the > postgres > repo. > > -- > Daniel Gustafsson > >
Re: Flush SLRU counters in checkpointer process
> On 3 Mar 2023, at 09:06, Anthonin Bonnefoy > wrote: > > Here's the patch rebased with Andres' suggestions. > Happy to update it if there's any additionalj change required. This patch crashes 031_recovery_conflict with a SIGInvalid on Windows, can you please investigate and see what might be going on there? The test passed about 4 days ago on Windows so unless it's the CI being flaky it should be due to a recent change. If you don't have access to a Windows environment you can run your own instrumented builds in your Github account with the CI files in the postgres repo. -- Daniel Gustafsson
Re: Flush SLRU counters in checkpointer process
Here's the patch rebased with Andres' suggestions. Happy to update it if there's any additionalj change required. On Wed, Mar 1, 2023 at 8:46 PM Gregory Stark (as CFM) wrote: > On Thu, 12 Jan 2023 at 03:46, Anthonin Bonnefoy > wrote: > > > > > > That would make sense. I've created a new patch with everything moved in > pgstat_report_checkpointer(). > > I did split the checkpointer flush in a pgstat_flush_checkpointer() > function as it seemed more readable. Thought? > > This patch appears to need a rebase. Is there really any feedback > needed or is it ready for committer once it's rebased? > > > > -- > Gregory Stark > As Commitfest Manager > flush-slru-counters-v3.patch Description: Binary data
Re: Flush SLRU counters in checkpointer process
On Thu, 12 Jan 2023 at 03:46, Anthonin Bonnefoy wrote: > > > That would make sense. I've created a new patch with everything moved in > pgstat_report_checkpointer(). > I did split the checkpointer flush in a pgstat_flush_checkpointer() function > as it seemed more readable. Thought? This patch appears to need a rebase. Is there really any feedback needed or is it ready for committer once it's rebased? -- Gregory Stark As Commitfest Manager
Re: Flush SLRU counters in checkpointer process
On Wed, Jan 11, 2023 at 5:33 PM Andres Freund wrote: > Hi, > > On 2023-01-11 10:29:06 +0100, Anthonin Bonnefoy wrote: > > Currently, the Checkpointer process only reports SLRU statistics at > server > > shutdown, leading to delayed statistics for SLRU flushes. This patch > adds a > > flush of SLRU stats to the end of checkpoints. > > Hm. I wonder if we should do this even earlier, by the > pgstat_report_checkpointer() calls in CheckpointWriteDelay(). > > I'm inclined to move the pgstat_report_wal() and pgstat_report_slru() calls > into pgstat_report_checkpointer() to avoid needing to care about all the > individual places. > That would make sense. I've created a new patch with everything moved in pgstat_report_checkpointer(). I did split the checkpointer flush in a pgstat_flush_checkpointer() function as it seemed more readable. Thought? > > @@ -505,6 +505,7 @@ CheckpointerMain(void) > > /* Report pending statistics to the cumulative stats > system */ > > pgstat_report_checkpointer(); > > pgstat_report_wal(true); > > + pgstat_report_slru(true); > > Why do we need a force parameter if all callers use it? Good point. I've written the same signature as pgstat_report_wal but there's no need for the nowait parameter. flush-slru-counters-v2.patch Description: Binary data
Re: Flush SLRU counters in checkpointer process
Hi, On 2023-01-11 10:29:06 +0100, Anthonin Bonnefoy wrote: > Currently, the Checkpointer process only reports SLRU statistics at server > shutdown, leading to delayed statistics for SLRU flushes. This patch adds a > flush of SLRU stats to the end of checkpoints. Hm. I wonder if we should do this even earlier, by the pgstat_report_checkpointer() calls in CheckpointWriteDelay(). I'm inclined to move the pgstat_report_wal() and pgstat_report_slru() calls into pgstat_report_checkpointer() to avoid needing to care about all the individual places. > @@ -505,6 +505,7 @@ CheckpointerMain(void) > /* Report pending statistics to the cumulative stats system */ > pgstat_report_checkpointer(); > pgstat_report_wal(true); > + pgstat_report_slru(true); Why do we need a force parameter if all callers use it? Greetings, Andres Freund
Re: Flush SLRU counters in checkpointer process
Hi Anthonin, > This patch adds a flush of SLRU stats to the end of checkpoints. The patch looks good to me and passes the tests but let's see if anyone else has any feedback. Also I added a CF entry: https://commitfest.postgresql.org/42/4120/ -- Best regards, Aleksander Alekseev
Flush SLRU counters in checkpointer process
Hello hackers, Currently, the Checkpointer process only reports SLRU statistics at server shutdown, leading to delayed statistics for SLRU flushes. This patch adds a flush of SLRU stats to the end of checkpoints. Best regards, Anthonin flush-slru-counters.patch Description: Binary data