[PERFORM] Identical Queries

2007-03-01 Thread Rob Schall
Question for anyone...

I tried posting to the bugs, and they said this is a better question for here.
I have to queries. One runs in about 2 seconds. The other takes upwards
of 2 minutes. I have a temp table that is created with 2 columns. This
table is joined with the larger database of call detail records.
However, these 2 queries are handled very differently.

The queries:
First

calldetail=> EXPLAIN SELECT current.* FROM current JOIN anitmp ON
current.destnum=anitmp.ani AND istf=true;
  QUERY PLAN
--
 Nested Loop  (cost=0.00..2026113.09 rows=500908 width=108)
   ->  Seq Scan on anitmp  (cost=0.00..33.62 rows=945 width=8)
 Filter: (istf = true)
   ->  Index Scan using i_destnum on current  (cost=0.00..2137.36
rows=531 width=108)
 Index Cond: (current.destnum = "outer".ani)
(5 rows)

Second
calldetail=> EXPLAIN SELECT current.* FROM current JOIN anitmp ON
current.orignum=anitmp.ani AND istf=false;
QUERY PLAN
---
 Hash Join  (cost=35.99..3402035.53 rows=5381529 width=108)
   Hash Cond: ("outer".orignum = "inner".ani)
   ->  Seq Scan on current  (cost=0.00..907191.05 rows=10170805 width=108)
   ->  Hash  (cost=33.62..33.62 rows=945 width=8)
 ->  Seq Scan on anitmp  (cost=0.00..33.62 rows=945 width=8)
   Filter: (istf = false)
(6 rows)


The tables:
   Table "public.current"
  Column  |Type | Modifiers
--+-+---
 datetime | timestamp without time zone |
 orignum  | bigint  |
 destnum  | bigint  |
 billto   | bigint  |
 cost | numeric(6,4)|
 duration | numeric(8,1)|
 origcity | character(12)   |
 destcity | character(12)   |
 file | character varying(30)   |
 linenum  | integer |
 carrier  | character(1)|
Indexes:
"i_destnum" btree (destnum)
"i_orignum" btree (orignum)


Table "public.anitmp"
 Column |  Type   | Modifiers
+-+---
 ani| bigint  |
 istf   | boolean |


I was also asked to post the EXPLAIN ANALYZE for both:

calldetail=> EXPLAIN ANALYZE SELECT current.* FROM anitmp JOIN current ON 
istf=false AND current.orignum=anitmp.ani;
QUERY PLAN
---
 Hash Join  (cost=35.99..3427123.39 rows=5421215 width=108) (actual 
time=1994.164..157443.544 rows=157 loops=1)
   Hash Cond: ("outer".orignum = "inner".ani)
   ->  Seq Scan on current  (cost=0.00..913881.09 rows=10245809 width=108) 
(actual time=710.986..137963.320 rows=10893541 loops=1)
   ->  Hash  (cost=33.62..33.62 rows=945 width=8) (actual time=10.948..10.948 
rows=0 loops=1)
 ->  Seq Scan on anitmp  (cost=0.00..33.62 rows=945 width=8) (actual 
time=10.934..10.939 rows=2 loops=1)
   Filter: (istf = false)
 Total runtime: 157443.900 ms
(7 rows)

calldetail=> EXPLAIN ANALYZE SELECT current.* FROM current JOIN anitmp ON 
current.destnum=anitmp.ani AND istf=true;
 QUERY PLAN
-
 Nested Loop  (cost=0.00..2037526.69 rows=504602 width=108) (actual 
time=88.752..1050.295 rows=1445 loops=1)
   ->  Seq Scan on anitmp  (cost=0.00..33.62 rows=945 width=8) (actual 
time=8.189..8.202 rows=2 loops=1)
 Filter: (istf = true)
   ->  Index Scan using i_destnum on current  (cost=0.00..2149.40 rows=534 
width=108) (actual time=62.365..517.454 rows=722 loops=2)
 Index Cond: (current.destnum = "outer".ani)
 Total runtime: 1052.862 ms
(6 rows)


Anyone have any ideas for me? I have indexes on each of the necessary
columns.

Rob



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Identical Queries

2007-03-01 Thread Rob Schall
There are 4 entries (wanted to make the playing field level for this
test). There are 2 with true for istf and 2 with false.

Rob


Stephan Szabo wrote:
> On Thu, 1 Mar 2007, Rob Schall wrote:
>
>   
>> Question for anyone...
>>
>> I tried posting to the bugs, and they said this is a better question for 
>> here.
>> I have to queries. One runs in about 2 seconds. The other takes upwards
>> of 2 minutes. I have a temp table that is created with 2 columns. This
>> table is joined with the larger database of call detail records.
>> However, these 2 queries are handled very differently.
>> 
>
> How many rows are there in anitmp and how many rows in anitmp have
> istf=true and how many have istf=false? If you don't currently analyze the
> temp table after adding the rows, you might find that doing an analyze
> helps, or at least makes the row estimates better.
>   


---(end of broadcast)---
TIP 6: explain analyze is your friend