Back when we put in the ability to use "x IS NULL" as a btree search
condition, we intentionally left out "x IS NOT NULL", on the grounds
that it is comparable to "x <> something" which is not btree-searchable
either.  However, it occurs to me that we missed a bet here.  The NOT
NULL condition could instead be treated like "x is less than NULL"
(in a NULLS LAST index) or "x is greater than NULL" (in a NULLS FIRST
index), which would make it work like a searchable inequality.  It's
still true that except in the case of a mostly-null column, it would
seldom be worth doing such an indexscan.  However, I can see an
application where an index search condition like this would be highly
worthwhile: namely, trying to extract a column min or max.  Right now,
if there are a fair number of nulls at the end of the index you're
interested in, you have to stupidly scan through them --- but if the
btree code knew about doing this, it could descend the tree
intelligently and land right on the first or last non-null.

We have already seen field complaints about the performance of
index-optimized MAX in cases with many nulls, eg
http://archives.postgresql.org/pgsql-performance/2006-12/msg00099.php
which fixing this would take care of.  This would also affect the
usefulness of the idea I proposed earlier today about automatically
updating the histogram bin boundaries when trying to estimate inequality
selectivity for comparison values near the ends of the range --- if we
can't rely on the index lookup for max or min to be cheap, doing that
stops looking quite so attractive.

While I haven't tried to code this yet, I'm guessing that it's just a
very small addition to the logic we already have.  Any objections to
fixing this up?

                        regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to