Makes sense. See DB2 8.2 info on their new implementation of cross column statistics. If this is common and you're willing to change code, you can fake that by adding a operation index on some hash function of both columns, and search for both columns and the hash.
----- Original Message ----- From: "Kris Jurka" <[EMAIL PROTECTED]> To: "Gabriele Bartolini" <[EMAIL PROTECTED]> Cc: <[EMAIL PROTECTED]> Sent: Monday, October 11, 2004 5:17 PM Subject: Re: [PERFORM] Normal case or bad query plan? > > > On Mon, 11 Oct 2004, Gabriele Bartolini wrote: > > > -------------------------------------------------------------------------- ------------------------------------------- > > Seq Scan on ip2location (cost=0.00..30490.65 rows=124781 width=8) > > (actual time=5338.120..40237.283 rows=1 loops=1) > > Filter: ((1040878301::bigint >= ip_address_from) AND > > (1040878301::bigint <= ip_address_to)) > > Total runtime: 40237.424 ms > > > > I believe the problem is that pg's lack of cross-column statistics is > producing the poor number of rows estimate. The number of rows mataching > just the first 1040878301::bigint >= ip_address_from condition is 122774 > which is roughtly 10% of the table. I imagine the query planner > believes that the other condition alone will match the other 90% of the > table. The problem is that it doesn't know that these two ranges' > intersection is actually tiny. The planner assumes a complete or nearly > complete overlap so it thinks it will need to fetch 10% of the rows from > both the index and the heap and chooses a seqscan. > > Kris Jurka > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faqs/FAQ.html > ---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html