On Sat, Mar 30, 2019 at 3:55 PM Julien Rouhaud <rjuju...@gmail.com> wrote:

> On Sat, Mar 30, 2019 at 2:33 PM Magnus Hagander <mag...@hagander.net>
> wrote:
> >
> > On Wed, Mar 13, 2019 at 4:54 PM Julien Rouhaud <rjuju...@gmail.com>
> wrote:
> >>
> >> On Wed, Mar 13, 2019 at 4:53 PM Julien Rouhaud <rjuju...@gmail.com>
> wrote:
> >> >
> >> > As a result I ended up simply adding counters for the number of total
> >> > checks and the timestamp of the last failure in PgStat_StatDBEntry,
> >> > making attached patch very lightweight.  I moved all the checksum
> >> > related counters out of pg_stat_database in a new pg_stat_checksum
> >> > view.  It avoids to make pg_stat_database too wide, and also allows to
> >> > display information about shared object in this new view (some of the
> >> > other counters don't really make sense for shared objects or could
> >> > break existing monitoring query).  While at it, I tried to add a
> >> > little bit of documentation wrt. checksum monitoring.
> >>
> >> and of course I forgot to attach the patch.
> >
> >
> > Does it really make any sense to track "number of checksum checks"? In
> any sort of interesting database that's just going to be an insanely high
> number, isn't it? (And also, to stay consistent with checksum failures, we
> should of course also count the checks done in base backups, which is not
> in the patch. But I'm more thinking we should drop it)
>
> Thanks for looking at it!
>
> It's surely going to be a huge number on databases with a large number
> of buffer eviction and/or frequent pg_basebackup.  The idea was to be
> able to know if the possible lack of failure was due to lack of check
> at all or because the server appears to be healthy, without spamming
> gettimeofday calls.  If having a last_check() is better, I'm fine with
> it.  If it's useless, let's drop it.
>

I'm not sure either of them are really useful, but would be happy to take
input from others :)


The number of checks was supposed to also be tracked in base_backups, with
>

Oh, that's a sloppy review. I see it's there. However, it doesn't appear to
count up in the *normal* backend path...

My vote is still to drop it completely, but if we're keeping it, it has to
go in both paths.


> Having thought some more about this, I wonder if the right thing to do is
> to actually add a row to pg_stat_database for the global stats, rather than
> invent a separate view. I can see the argument going both ways, but
> particularly with the name pg_stat_checksums we are setting a pattern that
> will create one view for each counter. That's not very good, I think.
> >
> > In the end I'm somewhat split on the idea of pg_stat_database with a
> NULL row or pg_stat_checkpoints. What do others think?
>
> I agree that having a separate view for each counter if a bad idea.
> But what I was thinking is that we'll probably end up with a view to
> track per-db online checksum activation progress/activity/status at
> some point (similar to pg_stat_progress_vacuum), so why not starting
> with this dedicated view right now and add new counters later, either
> in pgstat and/or some shmem, as long as we keep the view name as SQL
> interface.
>

Technically, that should be in pg_stat_progress_checksums to be consistent
:) So whichever way we turn, it's going to be inconsistent with something.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ <http://www.hagander.net/>
 Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>

Reply via email to