On Mon, Jun 22, 2026 at 3:08 PM Corey Huinker <[email protected]> wrote:
> The first bit of dissonance comes from the SQL-level functions having 
> schemaname+relname parameters, and the reloid is resolved via 
> RangeVarGetRelidExtended() which has a callback to check for correct 
> permissions and setting the proper lock level. However, the C-caller would 
> either have the reloid already, or an already open Relation, but no assurance 
> that the caller has the correct permissions for that table or the correct 
> lock level on the table. So either we make an equivalent to 
> RangeVarGetRelidExtended() that takes an oid, or the C-caller has to derive a 
> RangeVar, call the existing RangeVarGetRelidExtended() function, and verify 
> the result reloid against the supplied parameter. I went with deriving the 
> RangeVar and putting an Assert on the before/after reloids, but perhaps the 
> smarter play is to make a function that checks for ShareUpdateExclusiveLock 
> on the Relation, and then does the equivalent of RangeVarCallbackForStats().

I don't think this is really a problem. If the caller is specifying
the OID, they should have called RangeVarGetRelidExtended themselves.
Permissions-checking, locking, and opening the relation should all
happen simultaneously, and the logic shouldn't be duplicated later.

> A smaller bit of dissonance was with RecoveryInProgress(), which if I recall 
> we're checking before RangeVarGetRelidExtended() to avoid trying to take a 
> lock that will fail. That check might not be meaningful if the C call takes a 
> relation, thus ensuring that some level of locking worked, thus we aren't in 
> recovery.

I don't quite follow this part.

> Next is the existing validation functions stats_check_required_arg(), 
> stats_check_arg_array(), and stats_check_arg_pairs() all work on values 
> indexed by the positional functioncallinfo and the corresponding 
> relstatsinfo/attstatsinfo structure, and this makes a lot of type checking 
> and value checking compact and uniform.  If we want to keep this sort of 
> uniformity, the resulting StatsData structure will end up looking a lot like 
> the FunctionCallInfo that we already had.

Yeah, this is worth thinking about. You could consider putting an
array inside the struct and use #defines for the indexes. And instead
of having a separate Boolean for each index, you could use the same
index to reference the N'th bit of a single integer flag variable.

> Next is the fact that the end-destination for every value passed in is a 
> Datum for a pg_statistic heaptuple. Most Datum values are checked only for 
> their null-ness and if they're the correct type, so the value itself is 
> usually just passed directly from fcinfo into the heap tuple values[] array. 
> The float[] values are checked for number of elements and whether any 
> elements are NULL, but that is done via array functions that take a Datum 
> input. Only in a few cases do we actually look at the actual internal value 
> of the Datum (reltuples, attname, attnum, the anyarrays), so there's little 
> to gain there.

Right, so the question is whether it makes more sense to pass down C
strings or Datums.

> There's some additional hassle in the fact that pg_restore_attribute_stats() 
> can take an attnum parameter OR an attname parameter, but not both. I was 
> able to resolve that in a semi-elegant fashion, but the other issues have 
> convinced me that we're probably better off continuing to use the 
> FunctionCallInfo version of attribute_stats_update(), though perhaps with a 
> different name, allowing us to use that name for the new API call instead of 
> import_attribute_statistics().

On that particular point, I think I'm still unconvinced, but I also
haven't looked deeply into this just yet, so maybe I'm wrong.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


Reply via email to