The following bug has been logged on the website:

Bug reference:      7610
Logged by:          Sam Wong
Email address:      s...@hellosam.net
PostgreSQL version: 9.2.1
Operating system:   Windows 7 64-bit
Description:
Repro SQL:

CREATE TABLE a (id text, primary key (id));
INSERT INTO a SELECT to_char(generate_series, 'FM0000000') from generate_series(1,1000000);
ANALYZE a;

Q1: EXPLAIN ANALYZE SELECT * from a where id like '0005000%';
-----------------
Index Only Scan using a_pkey on a  (cost=0.00..9.37 rows=100 width=8) (actual time=0.170..0.173 rows=1 loops=1)
  Index Cond: ((id >= '0005000'::text) AND (id < '0005001'::text))
  Filter: (id ~~ '0005000%'::text)
  Heap Fetches: 1
Total runtime: 0.229 ms
(5 rows)

Q2: EXPLAIN ANALYZE SELECT * from a where id >= '0005000' and id < '0005001';
-----------------
Index Only Scan using a_pkey on a  (cost=0.00..9.37 rows=1 width=8) (actual time=0.027..0.028 rows=1 loops=1)
  Index Cond: ((id >= '0005000'::text) AND (id < '0005001'::text))
  Heap Fetches: 1
Total runtime: 0.072 ms
(4 rows)

Problems:
* For Q1, the planner incorrectly estimates that there will be 100 rows; for Q2, it gives a correct estimate.
* My actual problem in production is that, because of the much larger estimate, the planner prefers a merge/hash join in a later stage of a complex query instead of a nested loop. The outcome is that the query took 10 seconds instead of 100 ms.

Observations:
* Q1 and Q2 should be logically identical, and PostgreSQL treats them as such (see the Index Cond in the EXPLAIN ANALYZE output).
* The EXPLAIN ANALYZE output shows that Q1 not only has the same index condition as Q2 but also an additional filter, yet surprisingly it is estimated to return more rows than Q2.

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
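[Editor's note: the prefix-to-range rewrite visible in the Index Cond above (id LIKE '0005000%' becoming id >= '0005000' AND id < '0005001') can be sketched as follows. This is an illustrative Python sketch of the general idea only, not PostgreSQL's actual implementation, and `like_prefix_to_range` is a hypothetical helper name.]

```python
def like_prefix_to_range(pattern: str) -> tuple[str, str]:
    """Derive range bounds equivalent to a LIKE pattern that ends in '%'.

    Hypothetical sketch: the fixed prefix becomes the inclusive lower
    bound, and the exclusive upper bound is formed by incrementing the
    prefix's last character, so '0005000%' maps to
    ('0005000', '0005001').
    """
    # Only handle the simple case: a literal prefix followed by one '%'.
    assert pattern.endswith('%') and '%' not in pattern[:-1]
    prefix = pattern[:-1]
    # Exclusive upper bound: bump the last character of the prefix.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, upper

lo, hi = like_prefix_to_range('0005000%')
# lo == '0005000', hi == '0005001' -- matching the Index Cond in Q1/Q2
```

Note that the planner keeps the original LIKE clause as a residual Filter on top of this range (as seen in Q1's plan), which is why the two plans differ only in their row estimates, not in the index condition.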