On Sat, Mar 30, 2019 at 3:55 PM Julien Rouhaud <rjuju...@gmail.com> wrote:
> On Sat, Mar 30, 2019 at 2:33 PM Magnus Hagander <mag...@hagander.net> > wrote: > > > > On Wed, Mar 13, 2019 at 4:54 PM Julien Rouhaud <rjuju...@gmail.com> > wrote: > >> > >> On Wed, Mar 13, 2019 at 4:53 PM Julien Rouhaud <rjuju...@gmail.com> > wrote: > >> > > >> > As a result I ended up simply adding counters for the number of total > >> > checks and the timestamp of the last failure in PgStat_StatDBEntry, > >> > making attached patch very lightweight. I moved all the checksum > >> > related counters out of pg_stat_database in a new pg_stat_checksum > >> > view. It avoids to make pg_stat_database too wide, and also allows to > >> > display information about shared object in this new view (some of the > >> > other counters don't really make sense for shared objects or could > >> > break existing monitoring query). While at it, I tried to add a > >> > little bit of documentation wrt. checksum monitoring. > >> > >> and of course I forgot to attach the patch. > > > > > > Does it really make any sense to track "number of checksum checks"? In > any sort of interesting database that's just going to be an insanely high > number, isn't it? (And also, to stay consistent with checksum failures, we > should of course also count the checks done in base backups, which is not > in the patch. But I'm more thinking we should drop it) > > Thanks for looking at it! > > It's surely going to be a huge number on databases with a large number > of buffer eviction and/or frequent pg_basebackup. The idea was to be > able to know if the possible lack of failure was due to lack of check > at all or because the server appears to be healthy, without spamming > gettimeofday calls. If having a last_check() is better, I'm fine with > it. If it's useless, let's drop it. > I'm not sure either of them are really useful, but would be happy to take input from others :) The number of checks was supposed to also be tracked in base_backups, with > Oh, that's a sloppy review. I see it's there. However, it doesn't appear to count up in the *normal* backend path... My vote is still to drop it completely, but if we're keeping it, it has to go in both paths. > Having thought some more about this, I wonder if the right thing to do is > to actually add a row to pg_stat_database for the global stats, rather than > invent a separate view. I can see the argument going both ways, but > particularly with the name pg_stat_checksums we are setting a pattern that > will create one view for each counter. That's not very good, I think. > > > > In the end I'm somewhat split on the idea of pg_stat_database with a > NULL row or pg_stat_checkpoints. What do others think? > > I agree that having a separate view for each counter if a bad idea. > But what I was thinking is that we'll probably end up with a view to > track per-db online checksum activation progress/activity/status at > some point (similar to pg_stat_progress_vacuum), so why not starting > with this dedicated view right now and add new counters later, either > in pgstat and/or some shmem, as long as we keep the view name as SQL > interface. > Technically, that should be in pg_stat_progress_checksums to be consistent :) So whichever way we turn, it's going to be inconsistent with something. -- Magnus Hagander Me: https://www.hagander.net/ <http://www.hagander.net/> Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>