Re: [GENERAL] Query runs fast or slow

2006-04-17 Thread felix
On Sun, Apr 16, 2006 at 04:32:25PM -0400, Tom Lane wrote:
> [EMAIL PROTECTED] writes:
> > EXPLAIN ANALYZE doesn't show the slow timing
> > because it requires values, not $n placeholders,
> 
> To analyze the plan used for a parameterized query, try
> 
>   PREPARE foo(...) AS SELECT ... $n ...
> 
>   EXPLAIN ANALYZE EXECUTE foo(...)
> 
> But I already know what you're going to find: the planner's estimates
> for the range query are not going to be very good when it has no idea
> what the range bounds are.  This is a situation where it may be best
> to absorb the hit of re-planning each time instead of using a generic
> parameterized plan.

OK, here is the new explain analyze.  I eliminated cache effects by
dumping the tables and picking random values with an editor.

felix=> PREPARE foo(TEXT, INT, INT) AS SELECT s.data, g.key, g.val, g.sid FROM 
key k, val_int v, sid s, glue_int g WHERE (k.data = $1 AND k.id = g.key) AND 
(v.data >= $2 AND v.data <= $3) AND v.id = g.val AND g.sid = s.id;
PREPARE
felix=> explain analyze execute foo('mthNQFrmVs3Q4bVruCxIAGy', 1973028023, 
1973028223);
 
QUERY PLAN  
   

 Nested Loop  (cost=1380.11..404223.36 rows=499 width=60) (actual 
time=5785.012..77823.688 rows=1 loops=1)
   ->  Hash Join  (cost=1380.11..402713.38 rows=499 width=16) (actual 
time=5766.308..77804.969 rows=1 loops=1)
 Hash Cond: ("outer".val = "inner".id)
 ->  Nested Loop  (cost=0.00..400829.78 rows=99701 width=16) (actual 
time=115.154..77401.159 rows=10 loops=1)
   ->  Index Scan using key_data_key on "key" k  (cost=0.00..5.82 
rows=1 width=4) (actual time=0.125..0.132 rows=1 loops=1)
 Index Cond: (data = $1)
   ->  Index Scan using glue_int_key_idx on glue_int g  
(cost=0.00..399577.70 rows=99701 width=16) (actual time=115.011..76570.366 
rows=10 loops=1)
 Index Cond: ("outer".id = g."key")
 ->  Hash  (cost=1378.86..1378.86 rows=500 width=4) (actual 
time=11.580..11.580 rows=0 loops=1)
   ->  Index Scan using val_int_data_key on val_int v  
(cost=0.00..1378.86 rows=500 width=4) (actual time=11.556..11.561 rows=1 
loops=1)
 Index Cond: ((data >= $2) AND (data <= $3))
   ->  Index Scan using sid_pkey on sid s  (cost=0.00..3.01 rows=1 width=52) 
(actual time=18.682..18.687 rows=1 loops=1)
 Index Cond: ("outer".sid = s.id)
 Total runtime: 77823.897 ms
(14 rows)

A repeat shows it faster, from 77 seconds to 3.

felix=> explain analyze execute foo('mthNQFrmVs3Q4bVruCxIAGy', 1973028023, 
1973028223);
   
QUERY PLAN  
  
-
 Nested Loop  (cost=1380.11..404223.36 rows=499 width=60) (actual 
time=205.137..2931.899 rows=1 loops=1)
   ->  Hash Join  (cost=1380.11..402713.38 rows=499 width=16) (actual 
time=205.056..2931.803 rows=1 loops=1)
 Hash Cond: ("outer".val = "inner".id)
 ->  Nested Loop  (cost=0.00..400829.78 rows=99701 width=16) (actual 
time=0.148..2564.255 rows=10 loops=1)
   ->  Index Scan using key_data_key on "key" k  (cost=0.00..5.82 
rows=1 width=4) (actual time=0.031..0.039 rows=1 loops=1)
 Index Cond: (data = $1)
   ->  Index Scan using glue_int_key_idx on glue_int g  
(cost=0.00..399577.70 rows=99701 width=16) (actual time=0.105..1808.068 
rows=10 loops=1)
 Index Cond: ("outer".id = g."key")
 ->  Hash  (cost=1378.86..1378.86 rows=500 width=4) (actual 
time=0.090..0.090 rows=0 loops=1)
   ->  Index Scan using val_int_data_key on val_int v  
(cost=0.00..1378.86 rows=500 width=4) (actual time=0.074..0.080 rows=1 loops=1)
 Index Cond: ((data >= $2) AND (data <= $3))
   ->  Index Scan using sid_pkey on sid s  (cost=0.00..3.01 rows=1 width=52) 
(actual time=0.061..0.066 rows=1 loops=1)
 Index Cond: ("outer".sid = s.id)
 Total runtime: 2932.013 ms
(14 rows)

And running it as a simple query shows it much faster, 72 ms.

felix=> EXPLAIN ANALYZE SELECT s.data, g.key, g.val, g.sid FROM key k, val_int 
v, sid s, glue_int g WHERE (k.data = 'mthNQFrmVs3Q4bVruCxIAGy' AND k.id = 
g.key) AND (v.data >= 1973028023 AND v.data <= 1973028223) AND v.id = g.val AND 
g.sid = s.id;
QUERY 
PLAN 
--

Re: [GENERAL] Query runs fast or slow

2006-04-16 Thread felix
On Sun, Apr 16, 2006 at 04:32:25PM -0400, Tom Lane wrote:

> To analyze the plan used for a parameterized query, try
> 
>   PREPARE foo(...) AS SELECT ... $n ...
> 
>   EXPLAIN ANALYZE EXECUTE foo(...)
> 
> But I already know what you're going to find: the planner's estimates
> for the range query are not going to be very good when it has no idea
> what the range bounds are.  This is a situation where it may be best
> to absorb the hit of re-planning each time instead of using a generic
> parameterized plan.

I will try this Monday, but isn't 75 seconds an awful long time?  It
almost seems like even the worst plan could find records faster than
that, and if it were actually scanning everything sequentially, there
would be a fair amount of variation, say 25 seconds, 50 seconds, 100
seconds.  The most I have seen is a range of, say, 75-77.  That just
seems way too slow.

-- 
... _._. ._ ._. . _._. ._. ___ .__ ._. . .__. ._ .. ._.
 Felix Finch: scarecrow repairman & rocket surgeon / [EMAIL PROTECTED]
  GPG = E987 4493 C860 246C 3B1E  6477 7838 76E9 182E 8151 ITAR license #4933
I've found a solution to Fermat's Last Theorem but I see I've run out of room o

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Query runs fast or slow

2006-04-16 Thread Tom Lane
[EMAIL PROTECTED] writes:
> EXPLAIN ANALYZE doesn't show the slow timing
> because it requires values, not $n placeholders,

To analyze the plan used for a parameterized query, try

PREPARE foo(...) AS SELECT ... $n ...

EXPLAIN ANALYZE EXECUTE foo(...)

But I already know what you're going to find: the planner's estimates
for the range query are not going to be very good when it has no idea
what the range bounds are.  This is a situation where it may be best
to absorb the hit of re-planning each time instead of using a generic
parameterized plan.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings