On Wed, Mar 13, 2019 at 4:53 PM Julien Rouhaud <rjuju...@gmail.com> wrote:
>
> On Sun, Mar 10, 2019 at 1:13 PM Julien Rouhaud <rjuju...@gmail.com> wrote:
> >
> > On Sat, Mar 9, 2019 at 7:58 PM Julien Rouhaud <rjuju...@gmail.com> wrote:
> > >
> > > On Sat, Mar 9, 2019 at 7:50 PM Magnus Hagander <mag...@hagander.net> 
> > > wrote:
> > > >
> > > > On Sat, Mar 9, 2019 at 10:41 AM Julien Rouhaud <rjuju...@gmail.com> 
> > > > wrote:
> > > >>
> > > >> Sorry, I have again new comments after a little bit more thinking.
> > > >> I'm wondering if we can do something about shared objects while we're
> > > >> at it.  They don't belong to any database, so it's a little bit
> > > >> orthogonal to this proposal, but it seems quite important to track
> > > >> error on those too!
> > > >>
> > > >> What about adding a new field in PgStat_GlobalStats for that?  We can
> > > >> use the same lastDir to easily detect such objects and slightly adapt
> > > >> sendFile again, which seems quite straightforward.
> > >
> > > > Question is then what number that should show -- only the checksum 
> > > > counter in non-database-fields, or the total number across the cluster?
> > >
> > > I'd say only for non-database-fields errors, especially if we can
> > > reset each counters separately.  If necessary, we can add a new view
> > > to give a global overview of checksum errors for DBA convenience.
> >
> > I'm considering adding a new PgStat_ChecksumStats for that purpose
> > instead, but I don't know if that's acceptable to do so in the last
> > commitfest.  It seems worthwhile to add it eventually, since we'll
> > probably end up having more things to report to users related to
> > checksum.  Online enabling of checksum could be the most immediate
> > potential target.
>
> I wasn't aware that we were already storing informations about shared
> objects in PgStat_StatDBEntry, with an InvalidOid as databaseid
> (though we don't have any system view that are actually showing
> information for such objects).
>
> As a result I ended up simply adding counters for the number of total
> checks and the timestamp of the last failure in PgStat_StatDBEntry,
> making attached patch very lightweight.  I moved all the checksum
> related counters out of pg_stat_database in a new pg_stat_checksum
> view.  It avoids to make pg_stat_database too wide, and also allows to
> display information about shared object in this new view (some of the
> other counters don't really make sense for shared objects or could
> break existing monitoring query).  While at it, I tried to add a
> little bit of documentation wrt. checksum monitoring.

and of course I forgot to attach the patch.

Attachment: pg_stat_checksum-v1.diff
Description: Binary data

Reply via email to