Re: [GENERAL] massive performance hit when using "Limit 1"

Rich Doughty Tue, 06 Dec 2005 05:02:14 -0800

Richard Huxton wrote:

Rich Doughty wrote:
This one goes nuts and doesn't return. is there any way i can
force a query plan similar to the one above?

  EXPLAIN SELECT _t.* FROM
       tokens.ta_tokens       _t INNER JOIN
       tokens.ta_tokens_stock _s ON _t.token_id = _s.token_id
  WHERE
       _s.retailer_id = '96599' AND
       _t.value       = '10'
  ORDER BY
       _t.number ASC
  LIMIT '1';
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..14967.39 rows=1 width=27)
   ->  Nested Loop  (cost=0.00..22316378.56 rows=1491 width=27)
-> Index Scan using ta_tokens_number_key on ta_tokens _t(cost=0.00..15519868.33 rows=1488768 width=27)
               Filter: ((value)::numeric = 10::numeric)
-> Index Scan using ta_tokens_stock_pkey on ta_tokens_stock_s (cost=0.00..4.55 rows=1 width=4)Index Cond: (("outer".token_id)::integer =(_s.token_id)::integer)
               Filter: ((retailer_id)::integer = 96599)
I *think* what's happening here is that PG thinks it will use the indexon _t.number (since you are going to sort by that anyway) and prettysoon find a row that will:
  1. have value=10
  2. join to a row in _s with the right retailer_id
It turns out that isn't the case, and so the query takes forever.Without knowing what "value" and "number" mean it's difficult to besure, but I'd guess it's the "token_id" join part that's the problem,since at a guess a high-numbered retailer will have tokens withhigh-numbered "retailer_id".
If you'd posted EXPLAIN ANALYSE then we'd be able to see what actuallydid happen.


no chance. it takes far too long to return (days...).

Try the same query but with a low retailer_id (100 or something) and seeif it goes a lot quicker. If that is what the problem is, try changingthe ORDER BY to something like "_s.retailer_id, _t.value, _t.number" andsee if that gives the planner a nudge in the right direction.


the retailer_id would make no difference as thee are only 4000-ish rows in
ta_tokens_stock and they all (for now) have the same retailer_id.

Failing that, a change to your indexes will almost certainly help.


i'm not sure that's the case. the exact same query, but limited to >2 rows
is fine.

I found this in the 8.0.4 relnotes. i reckon its a good guess that's what the
problem is:

* Fix mis-planning of queries with small LIMIT values due to poorly thought
  out "fuzzy" cost comparison



--

  - Rich Doughty

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
      choose an index scan if your joining column's datatypes do not
      match

Re: [GENERAL] massive performance hit when using "Limit 1"

Reply via email to