Hi,

Tom Lane wrote:
> Hmm ... pattern_sel already applies the operator directly to the
> most_common_vals, but in this situation those aren't common enough
> to help much.  With such an extensive histogram it is awfully tempting
> to assume that the histogram members are a representative sample, and
> take the selectivity as being the fraction of histogram entries that
> match the pattern.  Maybe drop the first and last histogram entries
> on the grounds they're probably outliers.  Thoughts?  What would be a
> reasonable minimum histogram size to enable using this approach instead
> of the guess-on-the-basis-of-the-pattern code?

That's what I was suggesting in these two threads, for ltree operators and for LIKE respectively:

http://archives.postgresql.org/pgsql-patches/2006-05/msg00178.php
http://archives.postgresql.org/pgsql-performance/2006-01/msg00083.php

My original ltree patch was stripped of the histogram matching code, so I will need to re-patch 8.2 when deploying it to get decent performance on a couple of queries. It would be very nice to avoid that ;)

I can't see any harm in using something like this:

if (histogram is large/representative enough)
{
  recalculate_selectivity_matching_histogram_values()

  if (new_selectivity > old_selectivity)
    return new_selectivity
  else
    return old_selectivity
}
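
To make the idea concrete, here is a rough Python sketch of the logic above. It is only an illustration, not the actual selfuncs.c code: the function names, the `matches` callback and the threshold of 100 entries are all made up for the example (the minimum histogram size is exactly the open question), and it ignores most_common_vals and NULLs entirely.

```python
from typing import Callable, Sequence

# Hypothetical threshold: the right minimum size is still an open question.
MIN_HISTOGRAM_ENTRIES = 100


def histogram_match_fraction(histogram: Sequence[str],
                             matches: Callable[[str], bool],
                             trim_ends: bool = True) -> float:
    """Return the fraction of histogram entries that satisfy the predicate."""
    entries = list(histogram)
    # Optionally drop the first and last entries as probable outliers.
    if trim_ends and len(entries) > 2:
        entries = entries[1:-1]
    if not entries:
        return 0.0
    hits = sum(1 for value in entries if matches(value))
    return hits / len(entries)


def adjusted_selectivity(histogram: Sequence[str],
                         matches: Callable[[str], bool],
                         old_selectivity: float,
                         min_entries: int = MIN_HISTOGRAM_ENTRIES) -> float:
    """Use the histogram-based estimate only when it raises the old one."""
    # Histogram too small to be treated as a representative sample:
    # keep the pattern-based guess.
    if len(histogram) < min_entries:
        return old_selectivity
    new_selectivity = histogram_match_fraction(histogram, matches)
    return max(old_selectivity, new_selectivity)
```

For example, with a 102-entry histogram where roughly half of the values start with "a", a prefix predicate such as `lambda v: v.startswith("a")` would lift a pattern-based guess of 0.01 to about one half, while a 3-entry histogram would leave the old estimate untouched.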


Best regards
--
Matteo Beccati
http://phpadsnew.com
http://phppgads.com
