On Thu, Jul 17, 2025 at 8:55 PM Craig Ringer <craig.rin...@enterprisedb.com> wrote: [snip]
>
> FS-based sizing isn't really enough
> ----------------
>
> Asking users to monitor at the filesystem level works, kind-of, but
> it'll lead to confusion due to WAL and temp files in simple installs.
> To get decent results they will need to have a separate dedicated
> volume for pg_wal. And which temp files are counted will differ; IIRC
> pg_database_size() does not count extents created by an in-progress
> REINDEX etc, but DOES count temp table sizes, for example. FS-based
> monitoring will also include things like spilled pg_replslot spilled
> reorder buffers, which can be considerable and aren't reasonably
> considered part of the "database size" or included in
> pg_database_size(). And of course it can see only the sum of all
> database sizes on a multi-database postgres instance unless the user
> has one volume per database using distinct tablespaces. So
> filesystem-based monitoring is not really a proper replacement.
>

Whether the filesystem creeps above 90%, 95%, etc. because of WAL files,
temp files, REINDEX, or VACUUM FULL / CLUSTER / pg_repack is irrelevant.
It's the filesystem at 100% that will ruin your day.

Thus, we monitor filesystems, and don't monitor database size. If the
alarm does ever go off, *then* I check the cause.

(This isn't as reactive as it sounds, because I regularly check the
replication backlog, look for orphaned slots, run REINDEX and CLUSTER one
table at a time, and don't let junk onto the cluster disk.)

-- 
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!