Would it be possible to look at a much larger number of samples during analyze,
then look at the variation in those to generate a reasonable number of
pg_statistic "samples" to represent our estimate of the actual distribution?
More datapoints for tables where the planner might benefit from it, fewer
where it wouldn't.
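
A rough sketch of what that could look like, in Python rather than anything
resembling the real ANALYZE code. The function name choose_mcv_entries, the
max_entries cap and the 1.25 factor are all made up for illustration: the idea
is only that a skewed column keeps more entries and a flat one keeps few or
none.

```python
from collections import Counter

def choose_mcv_entries(sample, max_entries=100, factor=1.25):
    """Keep only the values whose frequency in the sample clearly exceeds
    what a uniform distribution would give (hypothetical heuristic)."""
    if not sample:
        return []
    counts = Counter(sample)
    n = len(sample)
    # Frequency each distinct value would have if the column were uniform.
    uniform_freq = 1.0 / len(counts)
    return [
        (value, count / n)
        for value, count in counts.most_common(max_entries)
        if count / n > factor * uniform_freq
    ]

# A skewed sample keeps its dominant value; a flat one keeps nothing.
skewed = ["a"] * 90 + [str(i) for i in range(10)]
flat = list(range(100))
skewed_entries = choose_mcv_entries(skewed)
flat_entries = choose_mcv_entries(flat)
```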

Maybe the percentage of occurrence of the most common value could be recorded
somewhere (in the OP's case, about 3%). If the most common value is already
below the index-use threshold, the planner could decide to use the index right
away, without even looking at the value being searched for.
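
To make the shortcut concrete, here is a minimal sketch in Python. It is not
planner code: choose_plan, the mcv_freqs dictionary and the 10% threshold are
assumptions for illustration, and only the 3% figure comes from the OP's case.

```python
def choose_plan(value, mcv_freqs, index_threshold=0.10):
    """mcv_freqs maps each tracked common value to its fraction of the table.
    index_threshold is a hypothetical selectivity cutoff for using the index."""
    max_freq = max(mcv_freqs.values(), default=0.0)
    if max_freq < index_threshold:
        # Even the most common value is selective enough, so no value can
        # exceed the threshold: decide without looking at `value` at all.
        return "index scan"
    # Otherwise the value matters. A value that is not tracked is assumed to
    # be no more frequent than the least common tracked value.
    fallback = min(mcv_freqs.values(), default=0.0)
    freq = mcv_freqs.get(value, fallback)
    return "index scan" if freq < index_threshold else "seq scan"

# OP-like case: the most common value covers about 3% of the rows.
plan_op = choose_plan("anything", {"most_common": 0.03})
# Skewed case: the decision depends on which value is searched for.
plan_hot = choose_plan("a", {"a": 0.50, "b": 0.02})
plan_cold = choose_plan("b", {"a": 0.50, "b": 0.02})
```

With a single stored number per column the first branch costs almost nothing,
which is the point of the suggestion.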
