> > Perhaps it's just me ... but it doesn't seem like it's all that many > parameters. >
It's more than I can memorize, so that feels like a lot to me. Clearly it's not insurmountable. > > I am a bit concerned about the number of locks on pg_statistic and the > > relation itself, doing CatalogOpenIndexes/CatalogCloseIndexes once per > > attribute rather than once per relation. But I also see that this will > > mostly get used at a time when no other traffic is on the machine, and > > whatever it costs, it's still faster than the smallest table sample > (insert > > joke about "don't have to be faster than the bear" here). > > I continue to not be too concerned about this until and unless it's > actually shown to be an issue. Keeping things simple and > straight-forward has its own value. > Ok, I'm sold on that plan. > > > +/* > > + * Set statistics for a given pg_class entry. > > + * > > + * pg_set_relation_stats(relation Oid, reltuples double, relpages int) > > + * > > + * This does an in-place (i.e. non-transactional) update of pg_class, > just as > > + * is done in ANALYZE. > > + * > > + */ > > +Datum > > +pg_set_relation_stats(PG_FUNCTION_ARGS) > > +{ > > + const char *param_names[] = { > > + "relation", > > + "reltuples", > > + "relpages", > > + }; > > + > > + Oid relid; > > + Relation rel; > > + HeapTuple ctup; > > + Form_pg_class pgcform; > > + > > + for (int i = 0; i <= 2; i++) > > + if (PG_ARGISNULL(i)) > > + ereport(ERROR, > > + > (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > > + errmsg("%s cannot be NULL", > param_names[i]))); > > Why not just mark this function as strict..? Or perhaps we should allow > NULLs to be passed in and just not update the current value in that > case? Strict could definitely apply here, and I'm inclined to make it so. > Also, in some cases we allow the function to be called with a > NULL but then make it a no-op rather than throwing an ERROR (eg, if the > OID ends up being NULL). Thoughts on it emitting a WARN or NOTICE before returning false? > Not sure if that makes sense here or not > offhand but figured I'd mention it as something to consider. > > > + pgcform = (Form_pg_class) GETSTRUCT(ctup); > > + pgcform->reltuples = PG_GETARG_FLOAT4(1); > > + pgcform->relpages = PG_GETARG_INT32(2); > > Shouldn't we include relallvisible? > Yes. No idea why I didn't have that in there from the start. > Also, perhaps we should use the approach that we have in ANALYZE, and > only actually do something if the values are different rather than just > always doing an update. > That was how it worked back in v1, more for the possibility that there was no matching JSON to set values. Looking again at analyze.c (currently lines 1751-1780), we just check if there is a row in the way, and if so we replace it. I don't see where we compare existing values to new values. > > > +/* > > + * Import statistics for a given relation attribute > > + * > > + * pg_set_attribute_stats(relation Oid, attname name, stainherit bool, > > + * stanullfrac float4, stawidth int, stadistinct > float4, > > + * stakind1 int2, stakind2 int2, stakind3 int3, > > + * stakind4 int2, stakind5 int2, stanumbers1 > float4[], > > + * stanumbers2 float4[], stanumbers3 float4[], > > + * stanumbers4 float4[], stanumbers5 float4[], > > + * stanumbers1 float4[], stanumbers2 float4[], > > + * stanumbers3 float4[], stanumbers4 float4[], > > + * stanumbers5 float4[], stavalues1 text, > > + * stavalues2 text, stavalues3 text, > > + * stavalues4 text, stavalues5 text); > > + * > > + * > > + */ > > Don't know that it makes sense to just repeat the function declaration > inside a comment like this- it'll just end up out of date. > Historical artifact - previous versions had a long explanation of the JSON format. > > > +Datum > > +pg_set_attribute_stats(PG_FUNCTION_ARGS) > > > + /* names of columns that cannot be null */ > > + const char *required_param_names[] = { > > + "relation", > > + "attname", > > + "stainherit", > > + "stanullfrac", > > + "stawidth", > > + "stadistinct", > > + "stakind1", > > + "stakind2", > > + "stakind3", > > + "stakind4", > > + "stakind5", > > + }; > > Same comment here as above wrt NULL being passed in. > In this case, the last 10 params (stanumbersN and stavaluesN) can be null, and are NULL more often than not. > > > + for (int k = 0; k < 5; k++) > > Shouldn't we use STATISTIC_NUM_SLOTS here? > Yes, I had in the past. Not sure why I didn't again.