On Mon, 7 Apr 2008, Heikki Linnakangas wrote:
In that case, a regular index on (ipFrom, ipTo) should work just fine, and that's what he's got. Actually, an index on just ipFrom would probably work just as well. The problem is that the planner doesn't know about that special relationship between ipFrom and ipTo. Perhaps it could be hinted by explicitly specifying "AND ipTo > ipFrom" in the query?

Actually, the problem is that the database doesn't know that the entries don't overlap. For all it knows, you could have data like this:

0               10
10              20
20              30
... ten million rows later
100000030       100000040
100000040       100000050
0               100000050

So say you wanted to search for the value of 50,000,000. The index on ipFrom would select five million rows, all of which then have to be filtered by the constraint on ipTo. Likewise, an index on ipTo would return five million rows, all of which then have to be filtered by the constraint on ipFrom. If you just read the index and took the closest entry to the value, then you would miss out on the last entry which overlaps with the whole range. An R-tree on both fields will correctly find the small set of entries that are relevant.

It would be very cool to be able to create an R-tree index that would just make the original query run fast without needing alteration. I had a look at this a while back, but it is not currently possible in GiST, because only one field is handed to the index at a time. So all the current R-tree implementations require that you generate an object containing the two values, like the box, and then index that.

Something for 8.4?

Matthew

--
$ rm core
Segmentation Fault (core dumped)

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Reply via email to