On Thu, Jul 17, 2025 at 8:55 PM Craig Ringer <craig.rin...@enterprisedb.com>
wrote:
[snip]

>
> FS-based sizing isn't really enough
> ----------------
>
> Asking users to monitor at the filesystem level works, kind-of, but
> it'll lead to confusion due to WAL and temp files in simple installs.
> To get decent results they will need to have a separate dedicated
> volume for pg_wal. And which temp files are counted will differ; IIRC
> pg_database_size() does not count extents created by an in-progress
> REINDEX etc, but DOES count temp table sizes, for example. FS-based
> monitoring will also include things like pg_replslot spilled
> reorder buffers, which can be considerable and aren't reasonably
> considered part of the "database size" or included in
> pg_database_size(). And of course it can see only the sum of all
> database sizes on a multi-database postgres instance unless the user
> has one volume per database using distinct tablespaces. So
> filesystem-based monitoring is not really a proper replacement.
>

Whether the filesystem creeps above 90%, 95%, etc. because of WAL files,
temp files, REINDEX, or VACUUM FULL / CLUSTER / pg_repack is irrelevant.
It's the filesystem at 100% that will ruin your day.

Thus, we monitor filesystems, and don't monitor database size.
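
A minimal sketch of that kind of filesystem check, in Python. The 90% and
95% thresholds echo the figures above; the function names, the level
labels, and the path checked are illustrative assumptions, not a
description of any particular monitoring setup. The percentage may differ
slightly from what df reports on filesystems with reserved blocks.

```python
import shutil

# Assumed thresholds (percent of the filesystem in use).
WARN_PCT = 90.0
CRIT_PCT = 95.0


def percent_used(total: int, used: int) -> float:
    """Return used space as a percentage of total (0.0 if total is 0)."""
    return 100.0 * used / total if total else 0.0


def classify(pct: float) -> str:
    """Map a usage percentage to a hypothetical alert level."""
    if pct >= CRIT_PCT:
        return "critical"
    if pct >= WARN_PCT:
        return "warning"
    return "ok"


def check_filesystem(path: str) -> tuple[float, str]:
    """Measure the filesystem holding `path`, whatever is filling it."""
    usage = shutil.disk_usage(path)
    pct = percent_used(usage.total, usage.used)
    return pct, classify(pct)


if __name__ == "__main__":
    # "/" is a placeholder; point this at the data or pg_wal volume.
    pct, level = check_filesystem("/")
    print(f"/: {pct:.1f}% used -> {level}")
```

The check deliberately ignores what is consuming the space; working out
the cause is a separate step taken only once the alarm fires.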

If the alarm does ever go off, *then* I check the cause.  (This isn't as
reactive as it sounds, because I regularly check replication backlog,
look for orphaned slots, run REINDEX and CLUSTER one table at a time, and
don't let junk onto the cluster disk.)

-- 
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
