Makes sense. See DB2 8.2 info on their new implementation of cross column
statistics. If this is common and you're willing to change code, you can
fake that by adding a operation index on some hash function of both columns,
and search for both columns and the hash.

----- Original Message ----- 
From: "Kris Jurka" <[EMAIL PROTECTED]>
To: "Gabriele Bartolini" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Monday, October 11, 2004 5:17 PM
Subject: Re: [PERFORM] Normal case or bad query plan?


>
>
> On Mon, 11 Oct 2004, Gabriele Bartolini wrote:
>
>
> --------------------------------------------------------------------------
-------------------------------------------
> >   Seq Scan on ip2location  (cost=0.00..30490.65 rows=124781 width=8)
> > (actual time=5338.120..40237.283 rows=1 loops=1)
> >     Filter: ((1040878301::bigint >= ip_address_from) AND
> > (1040878301::bigint <= ip_address_to))
> >   Total runtime: 40237.424 ms
> >
>
> I believe the problem is that pg's lack of cross-column statistics is
> producing the poor number of rows estimate.  The number of rows mataching
> just the first 1040878301::bigint >= ip_address_from condition is 122774
> which is roughtly 10% of the table.  I imagine the query planner
> believes that the other condition alone will match the other 90% of the
> table.  The problem is that it doesn't know that these two ranges'
> intersection is actually tiny.  The planner assumes a complete or nearly
> complete overlap so it thinks it will need to fetch 10% of the rows from
> both the index and the heap and chooses a seqscan.
>
> Kris Jurka
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faqs/FAQ.html
>

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faqs/FAQ.html

Reply via email to