On Mon, Feb 16, 2026 at 4:38 AM VASUKI M <[email protected]> wrote:
>
> Hi Andreas,
>
> Thank you for raising this; it’s a very good design question.
>
> You’re right that in many practical cases, a user invoking something like
> ANALYZE (MODIFIED_STATS) would also want to include relations that currently
> have no statistics. From an operational perspective, “missing stats” and
> “modified stats” can overlap.
>
> In my earlier prototype, I did attempt to handle both concerns together.
> However, during the previous discussion in the thread, it became clear that
> combining the semantics made the behavior less predictable and harder to
> reason about. That led to splitting the functionality into two more clearly
> defined options:
>
> MISSING_STATS_ONLY → analyze relations lacking statistics.
>
> MODIFIED_STATS (proposed) → analyze relations whose statistics may be stale
> due to modifications.
>
> The motivation for separation was semantic clarity:
>
> MISSING_STATS_ONLY is catalog-based and persistent (derived from pg_statistic
> / pg_statistic_ext).
>
> MODIFIED_STATS would likely depend on modification counters or thresholds
> (similar to autoanalyze logic), which are transient and not crash-persistent.
>
> Keeping them distinct allows each option to have a well-defined and
> predictable contract.
>
> That said, your naming suggestion is interesting. A name such as
> SKIP_UNMODIFIED does express the behavior from the inverse perspective and
> may indeed be clearer. Another possible direction could be:
>
> ANALYZE (MISSING_STATS_ONLY)
>
> ANALYZE (SKIP_UNMODIFIED)
>
> Or potentially allowing both options together, if that proves semantically
> consistent.
>
> I’m very open to adjusting the naming and/or semantics if the consensus is
> that a combined approach would be more practical.
>
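
As a side note on the "similar to autoanalyze logic" point in the quoted
text, here is a minimal sketch of that kind of check, in Python purely for
illustration. It assumes the documented autoanalyze rule (analyze once the
modifications since the last analyze exceed autovacuum_analyze_threshold +
autovacuum_analyze_scale_factor * reltuples) with the stock defaults of 50
and 0.1. RelStats, is_modified and select_modified are hypothetical names,
not anything from the patch, and how MODIFIED_STATS would actually decide is
still open in this thread.

```python
from dataclasses import dataclass

# Stock defaults of autovacuum_analyze_threshold and
# autovacuum_analyze_scale_factor; per-table overrides are ignored here.
DEFAULT_ANALYZE_THRESHOLD = 50
DEFAULT_ANALYZE_SCALE_FACTOR = 0.1


@dataclass
class RelStats:
    """Hypothetical per-relation snapshot (not a PostgreSQL structure)."""
    name: str
    reltuples: float          # planner row estimate; may be -1 if unknown
    mods_since_analyze: int   # rows changed since the last ANALYZE


def analyze_threshold(reltuples: float,
                      base: int = DEFAULT_ANALYZE_THRESHOLD,
                      scale: float = DEFAULT_ANALYZE_SCALE_FACTOR) -> float:
    """Threshold in the autoanalyze style: base + scale * reltuples."""
    # Treat an unknown (negative) row estimate as zero rows.
    return base + scale * max(reltuples, 0.0)


def is_modified(rel: RelStats) -> bool:
    """True when the modification counter exceeds the threshold."""
    return rel.mods_since_analyze > analyze_threshold(rel.reltuples)


def select_modified(rels: list[RelStats]) -> list[str]:
    """Names of the relations a counter-based option would pick."""
    return [rel.name for rel in rels if is_modified(rel)]


if __name__ == "__main__":
    rels = [
        RelStats("static_archive", 1_000_000, 0),
        RelStats("busy_orders", 1_000, 151),
        RelStats("new_column_only", 1_000, 0),
    ]
    print(select_modified(rels))
```

Under this rule a relation whose only change is a newly added column or
extended statistics object has zero counted modifications and is not
selected, which is the case MISSING_STATS_ONLY covers.
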
Well, going back to the beginning of the thread, we have two distinct use
cases at the individual level. One (MISSING_STATS) is to quickly go through
the database and ensure that statistics exist for anything that might be
missing them, like new columns, new extended statistics, etc. The other
(MODIFIED_STATS) is a way to update statistics on active tables in databases
with large numbers of static tables, similar to how autoanalyze works, but
available on demand.

While I suspect people will often run both of these together, they are
clearly separate concerns, and based on the original discussions where this
was being hashed out, it is easier to reason about them separately. And while
I think you might be able to argue that MODIFIED_STATS should also include
MISSING_STATS (I do wonder though, does autoanalyze do that?), given the use
case of integrating MISSING_STATS into vacuumdb, it absolutely needs to be a
standalone flag for that scenario.

One bookkeeping note for VASUKI: I didn't see any commitfest entries for
either patch; I would create one for each of these features separately within
https://commitfest.postgresql.org/58/.

Robert Treat
https://xzilla.net
