Re: Optional skipping of unchanged relations during ANALYZE?

Robert Treat Wed, 18 Feb 2026 11:32:36 -0800

On Mon, Feb 16, 2026 at 4:38 AM VASUKI M <[email protected]> wrote:
>
> Hi Andreas,
>
> Thank you for raising this — it’s a very good design question.
>
> You’re right that in many practical cases, a user invoking something like 
> ANALYZE (MODIFIED_STATS) would also want to include relations that currently 
> have no statistics. From an operational perspective, “missing stats” and 
> “modified stats” can overlap.
>
> In my earlier prototype, I did attempt to handle both concerns together. 
> However, during the previous discussion in the thread, it became clear that 
> combining the semantics made the behavior less predictable and harder to 
> reason about. That led to splitting the functionality into two more clearly 
> defined options:
>
> MISSING_STATS_ONLY → analyze relations lacking statistics.
>
> MODIFIED_STATS (proposed) → analyze relations whose statistics may be stale 
> due to modifications.
>
> The motivation for separation was semantic clarity:
>
> MISSING_STATS_ONLY is catalog-based and persistent (derived from pg_statistic 
> / pg_statistic_ext).
>
> MODIFIED_STATS would likely depend on modification counters or thresholds 
> (similar to autoanalyze logic), which are transient and not crash-persistent.
>
> Keeping them distinct allows each option to have a well-defined and 
> predictable contract.
>
> That said, your naming suggestion is interesting. A name such as 
> SKIP_UNMODIFIED does express the behavior from the inverse perspective and 
> may indeed be clearer. Another possible direction could be:
>
> ANALYZE (MISSING_STATS_ONLY)
>
> ANALYZE (SKIP_UNMODIFIED)
>
> Or potentially allowing both options together, if that proves semantically 
> consistent.
>
> I’m very open to adjusting the naming and/or semantics if the consensus is 
> that a combined approach would be more practical.
>


Well, going back to the beginning of the thread, we have two distinct
use cases at the individual level. One (MISSING_STATS) is to quickly
go through the database and ensure that statistics exist for
anything that might be missing them, like new columns, new extended
statistics, etc. The other (MODIFIED_STATS) was having a way to
update statistics in active tables for databases with large numbers of
static tables in a way similar to how autoanalyze works, but available
on demand. While I suspect people will often run both of these
together, those are clearly separate concerns, and based on the
original discussions where this was being hashed out, it is easier to
reason about them separately. And while I think you might be able to
argue that MODIFIED_STATS should also include MISSING_STATS (I do
wonder though, does autoanalyze do that?), given the use case of
integrating MISSING_STATS into vacuumdb, it absolutely needs to be a
standalone flag for that scenario.
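
For reference, here is a rough sketch of the autoanalyze trigger check
as I understand it: the count of modifications since the last analyze
is compared against a base threshold plus a fraction of the table's
row count. The function and parameter names below are illustrative,
not PostgreSQL identifiers; the defaults mirror the stock values of
autovacuum_analyze_threshold (50) and
autovacuum_analyze_scale_factor (0.1), and per-table storage
parameter overrides are ignored.

```python
def autoanalyze_would_trigger(mod_since_analyze, reltuples,
                              base_threshold=50, scale_factor=0.1):
    """Approximate the autoanalyze decision for a single table.

    mod_since_analyze: rows modified since the last ANALYZE.
    reltuples: estimated row count; a negative value means "unknown"
               and is treated as zero here.
    """
    live_tuples = max(reltuples, 0.0)
    threshold = base_threshold + scale_factor * live_tuples
    # Only the modification count is consulted; whether statistics
    # exist for every column plays no part in this check.
    return mod_since_analyze > threshold


# A large static table that just gained a new column: no modifications.
print(autoanalyze_would_trigger(0, 1_000_000))
# A small, busy table: 200 changes against a threshold of 150.
print(autoanalyze_would_trigger(200, 1_000))
```

If that reading is right, a check modeled on this one would not pick
up a relation that is merely missing statistics, which is one more
argument for keeping the two options separate.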

One bookkeeping note for VASUKI: I didn't see any commitfest entries
for either patch; I would create one for each of these features
separately within https://commitfest.postgresql.org/58/.

Robert Treat
https://xzilla.net
