I wrote:
> * It's not apparent why, if ANALYZE's sample is all nulls, we wouldn't
> conclude stadistinct = 0 and thus arrive at the desired answer that
> way.  (Since we have a complaint, I'm guessing that ANALYZE might
> disbelieve its own result and stick in some larger stadistinct.  But
> then maybe that's where to fix this, not here.)

Oh, on second thought (and with some testing): ANALYZE *does* report
stadistinct = 0.  The real issue is that get_variable_numdistinct is
assuming it can use that value as meaning "stadistinct is unknown".
So maybe we should just fix that, probably by adding an explicit
bool flag for that condition.

BTW ... I've not looked at the callers, but now I'm wondering whether
get_variable_numdistinct ought to count NULL as one of the "distinct"
values.  In applications such as estimating the number of GROUP BY
groups, it seems like that would be correct.  There might be some
callers that don't want it though.

                        regards, tom lane


Reply via email to