Re: Flush SLRU counters in checkpointer process

2023-09-04 Thread Aleksander Alekseev
Hi,

> I think I've managed to reproduce the issue. The test I've added to check 
> slru flush was the one failing in the regression suite.

A consensus was reached [1] to mark this patch as RwF for now. There
are many patches to be reviewed and this one doesn't seem to be in the
best shape, so we have to prioritise. Please feel free re-submitting
the patch for the next commitfest.

[1]: https://postgr.es/m/0737f444-59bb-ac1d-2753-873c40da0840%40eisentraut.org
-- 
Best regards,
Aleksander Alekseev




Re: Flush SLRU counters in checkpointer process

2023-07-19 Thread Anthonin Bonnefoy
I think I've managed to reproduce the issue. The test I've added to check
slru flush was the one failing in the regression suite.

SELECT SUM(flushes) > :slru_flushes_before FROM pg_stat_slru;
 ?column?
--
 t

The origin seems to be a race condition on have_slrustats (
https://github.com/postgres/postgres/blob/c8e1ba736b2b9e8c98d37a5b77c4ed31baf94147/src/backend/utils/activity/pgstat_slru.c#L161-L162
).
I will try to get a new patch with improved test stability.


On Mon, Jul 3, 2023 at 3:18 PM Daniel Gustafsson  wrote:

> > On 3 Mar 2023, at 09:06, Anthonin Bonnefoy <
> anthonin.bonne...@datadoghq.com> wrote:
> >
> > Here's the patch rebased with Andres' suggestions.
> > Happy to update it if there's any additionalj change required.
>
> This patch crashes 031_recovery_conflict with a SIGInvalid on Windows, can
> you
> please investigate and see what might be going on there?  The test passed
> about
> 4 days ago on Windows so unless it's the CI being flaky it should be due
> to a
> recent change.
>
> If you don't have access to a Windows environment you can run your own
> instrumented builds in your Github account with the CI files in the
> postgres
> repo.
>
> --
> Daniel Gustafsson
>
>


Re: Flush SLRU counters in checkpointer process

2023-07-03 Thread Daniel Gustafsson
> On 3 Mar 2023, at 09:06, Anthonin Bonnefoy  
> wrote:
> 
> Here's the patch rebased with Andres' suggestions. 
> Happy to update it if there's any additionalj change required.

This patch crashes 031_recovery_conflict with a SIGInvalid on Windows, can you
please investigate and see what might be going on there?  The test passed about
4 days ago on Windows so unless it's the CI being flaky it should be due to a
recent change.

If you don't have access to a Windows environment you can run your own
instrumented builds in your Github account with the CI files in the postgres
repo.

--
Daniel Gustafsson





Re: Flush SLRU counters in checkpointer process

2023-03-03 Thread Anthonin Bonnefoy
Here's the patch rebased with Andres' suggestions.
Happy to update it if there's any additionalj change required.


On Wed, Mar 1, 2023 at 8:46 PM Gregory Stark (as CFM) 
wrote:

> On Thu, 12 Jan 2023 at 03:46, Anthonin Bonnefoy
>  wrote:
> >
> >
> > That would make sense. I've created a new patch with everything moved in
> pgstat_report_checkpointer().
> > I did split the checkpointer flush in a pgstat_flush_checkpointer()
> function as it seemed more readable. Thought?
>
> This patch appears to need a rebase. Is there really any feedback
> needed or is it ready for committer once it's rebased?
>
>
>
> --
> Gregory Stark
> As Commitfest Manager
>


flush-slru-counters-v3.patch
Description: Binary data


Re: Flush SLRU counters in checkpointer process

2023-03-01 Thread Gregory Stark (as CFM)
On Thu, 12 Jan 2023 at 03:46, Anthonin Bonnefoy
 wrote:
>
>
> That would make sense. I've created a new patch with everything moved in 
> pgstat_report_checkpointer().
> I did split the checkpointer flush in a pgstat_flush_checkpointer() function 
> as it seemed more readable. Thought?

This patch appears to need a rebase. Is there really any feedback
needed or is it ready for committer once it's rebased?



-- 
Gregory Stark
As Commitfest Manager




Re: Flush SLRU counters in checkpointer process

2023-01-12 Thread Anthonin Bonnefoy
On Wed, Jan 11, 2023 at 5:33 PM Andres Freund  wrote:

> Hi,
>
> On 2023-01-11 10:29:06 +0100, Anthonin Bonnefoy wrote:
> > Currently, the Checkpointer process only reports SLRU statistics at
> server
> > shutdown, leading to delayed statistics for SLRU flushes. This patch
> adds a
> > flush of SLRU stats to the end of checkpoints.
>
> Hm. I wonder if we should do this even earlier, by the
> pgstat_report_checkpointer() calls in CheckpointWriteDelay().
>
> I'm inclined to move the pgstat_report_wal() and pgstat_report_slru() calls
> into pgstat_report_checkpointer() to avoid needing to care about all the
> individual places.
>
That would make sense. I've created a new patch with everything moved in
pgstat_report_checkpointer().
I did split the checkpointer flush in a pgstat_flush_checkpointer()
function as it seemed more readable. Thought?


> > @@ -505,6 +505,7 @@ CheckpointerMain(void)
> >   /* Report pending statistics to the cumulative stats
> system */
> >   pgstat_report_checkpointer();
> >   pgstat_report_wal(true);
> > + pgstat_report_slru(true);
>
> Why do we need a force parameter if all callers use it?

Good point. I've written the same signature as pgstat_report_wal but
there's no need for the nowait parameter.


flush-slru-counters-v2.patch
Description: Binary data


Re: Flush SLRU counters in checkpointer process

2023-01-11 Thread Andres Freund
Hi,

On 2023-01-11 10:29:06 +0100, Anthonin Bonnefoy wrote:
> Currently, the Checkpointer process only reports SLRU statistics at server
> shutdown, leading to delayed statistics for SLRU flushes. This patch adds a
> flush of SLRU stats to the end of checkpoints.

Hm. I wonder if we should do this even earlier, by the
pgstat_report_checkpointer() calls in CheckpointWriteDelay().

I'm inclined to move the pgstat_report_wal() and pgstat_report_slru() calls
into pgstat_report_checkpointer() to avoid needing to care about all the
individual places.


> @@ -505,6 +505,7 @@ CheckpointerMain(void)
>   /* Report pending statistics to the cumulative stats system */
>   pgstat_report_checkpointer();
>   pgstat_report_wal(true);
> + pgstat_report_slru(true);

Why do we need a force parameter if all callers use it?

Greetings,

Andres Freund




Re: Flush SLRU counters in checkpointer process

2023-01-11 Thread Aleksander Alekseev
Hi Anthonin,

> This patch adds a flush of SLRU stats to the end of checkpoints.

The patch looks good to me and passes the tests but let's see if
anyone else has any feedback.

Also I added a CF entry: https://commitfest.postgresql.org/42/4120/

-- 
Best regards,
Aleksander Alekseev




Flush SLRU counters in checkpointer process

2023-01-11 Thread Anthonin Bonnefoy
Hello hackers,

Currently, the Checkpointer process only reports SLRU statistics at server
shutdown, leading to delayed statistics for SLRU flushes. This patch adds a
flush of SLRU stats to the end of checkpoints.

Best regards,
Anthonin


flush-slru-counters.patch
Description: Binary data