>
> Perhaps it's just me ... but it doesn't seem like it's all that many
>
parameters.
>

It's more than I can memorize, so that feels like a lot to me. Clearly it's
not insurmountable.



> > I am a bit concerned about the number of locks on pg_statistic and the
> > relation itself, doing CatalogOpenIndexes/CatalogCloseIndexes once per
> > attribute rather than once per relation. But I also see that this will
> > mostly get used at a time when no other traffic is on the machine, and
> > whatever it costs, it's still faster than the smallest table sample
> (insert
> > joke about "don't have to be faster than the bear" here).
>
> I continue to not be too concerned about this until and unless it's
> actually shown to be an issue.  Keeping things simple and
> straight-forward has its own value.
>

Ok, I'm sold on that plan.


>
> > +/*
> > + * Set statistics for a given pg_class entry.
> > + *
> > + * pg_set_relation_stats(relation Oid, reltuples double, relpages int)
> > + *
> > + * This does an in-place (i.e. non-transactional) update of pg_class,
> just as
> > + * is done in ANALYZE.
> > + *
> > + */
> > +Datum
> > +pg_set_relation_stats(PG_FUNCTION_ARGS)
> > +{
> > +     const char *param_names[] = {
> > +             "relation",
> > +             "reltuples",
> > +             "relpages",
> > +     };
> > +
> > +     Oid                             relid;
> > +     Relation                rel;
> > +     HeapTuple               ctup;
> > +     Form_pg_class   pgcform;
> > +
> > +     for (int i = 0; i <= 2; i++)
> > +             if (PG_ARGISNULL(i))
> > +                     ereport(ERROR,
> > +
>  (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > +                                      errmsg("%s cannot be NULL",
> param_names[i])));
>
> Why not just mark this function as strict..?  Or perhaps we should allow
> NULLs to be passed in and just not update the current value in that
> case?


Strict could definitely apply here, and I'm inclined to make it so.



> Also, in some cases we allow the function to be called with a
> NULL but then make it a no-op rather than throwing an ERROR (eg, if the
> OID ends up being NULL).


Thoughts on it emitting a WARN or NOTICE before returning false?



>   Not sure if that makes sense here or not
> offhand but figured I'd mention it as something to consider.
>
> > +     pgcform = (Form_pg_class) GETSTRUCT(ctup);
> > +     pgcform->reltuples = PG_GETARG_FLOAT4(1);
> > +     pgcform->relpages = PG_GETARG_INT32(2);
>
> Shouldn't we include relallvisible?
>

Yes. No idea why I didn't have that in there from the start.


> Also, perhaps we should use the approach that we have in ANALYZE, and
> only actually do something if the values are different rather than just
> always doing an update.
>

That was how it worked back in v1, more for the possibility that there was
no matching JSON to set values.

Looking again at analyze.c (currently lines 1751-1780), we just check if
there is a row in the way, and if so we replace it. I don't see where we
compare existing values to new values.


>
> > +/*
> > + * Import statistics for a given relation attribute
> > + *
> > + * pg_set_attribute_stats(relation Oid, attname name, stainherit bool,
> > + *                        stanullfrac float4, stawidth int, stadistinct
> float4,
> > + *                        stakind1 int2, stakind2 int2, stakind3 int3,
> > + *                        stakind4 int2, stakind5 int2, stanumbers1
> float4[],
> > + *                        stanumbers2 float4[], stanumbers3 float4[],
> > + *                        stanumbers4 float4[], stanumbers5 float4[],
> > + *                        stanumbers1 float4[], stanumbers2 float4[],
> > + *                        stanumbers3 float4[], stanumbers4 float4[],
> > + *                        stanumbers5 float4[], stavalues1 text,
> > + *                        stavalues2 text, stavalues3 text,
> > + *                        stavalues4 text, stavalues5 text);
> > + *
> > + *
> > + */
>
> Don't know that it makes sense to just repeat the function declaration
> inside a comment like this- it'll just end up out of date.
>

Historical artifact - previous versions had a long explanation of the JSON
format.



>
> > +Datum
> > +pg_set_attribute_stats(PG_FUNCTION_ARGS)
>
> > +     /* names of columns that cannot be null */
> > +     const char *required_param_names[] = {
> > +             "relation",
> > +             "attname",
> > +             "stainherit",
> > +             "stanullfrac",
> > +             "stawidth",
> > +             "stadistinct",
> > +             "stakind1",
> > +             "stakind2",
> > +             "stakind3",
> > +             "stakind4",
> > +             "stakind5",
> > +     };
>
> Same comment here as above wrt NULL being passed in.
>

In this case, the last 10 params (stanumbersN and stavaluesN) can be null,
and are NULL more often than not.


>
> > +     for (int k = 0; k < 5; k++)
>
> Shouldn't we use STATISTIC_NUM_SLOTS here?
>

Yes, I had in the past. Not sure why I didn't again.

Reply via email to