clear functions.

Michael Paquier Tue, 27 Jan 2026 22:08:43 -0800

On Wed, Jan 28, 2026 at 12:44:33AM -0500, Corey Huinker wrote:
> The short answer overcaution. The longer answer is we ingest it because
> pg_mcv_list() returns it (
> https://www.postgresql.org/docs/current/functions-statistics.html) and I
> assumed that it must be necessary for some corner-case input, otherwise why
> would pg_mcv_list() export it? In retrospect, its lack of presence in
> pg_stats view should have been a clue that we could get by without it. I'm
> happy to rip it out and submit another patchset.


It seems to me that this comes down to the text[] representation when
we read this data from the catalogs, where we can just rely on NULL
being in the value, and the official marker in this case:
https://www.postgresql.org/docs/current/arrays.html#ARRAYS-INPUT

That's what your code relies on already anyway when deconstructing the
input received from the restore function.  Doubling on that with an
extra array in input is error-prone IMO, as the overread issue I've
spotted today is proving.  The other two 1-dimension arrays for freqs
and base_freqs are of course needed, we cannot and should not rebuild
them on-the-fly..

For the in-core initial computation, statext_mcv_build() relies on the
data gathered in StatsBuildData during ANALYZE before building the set
of MCV data, where my guess is that it is just more useful to store
this state as-is, and it's been built based on the Datum lookups.
Perhaps we have a couple of specific cases where checking for we want
some NULL-ness knowledge?  It would be less expensive than a full
array deconstruction, for sure, especially if the MCVs are large text
values.
--
Michael

signature.asc
Description: PGP signature

Re: Extended Statistics set/restore/clear functions.

Reply via email to