The following bug has been logged on the website:

Bug reference:      7610
Logged by:          Sam Wong
Email address:      s...@hellosam.net
PostgreSQL version: 9.2.1
Operating system:   Windows 7 64-bit
Description:
Repro SQL:

CREATE TABLE a (id text, primary key (id));
INSERT INTO a SELECT to_char(generate_series, 'FM0000000') from generate_series(1,1000000);
ANALYZE a;

Q1: EXPLAIN ANALYZE SELECT * from a where id like '0005000%';
-----------------
Index Only Scan using a_pkey on a  (cost=0.00..9.37 rows=100 width=8) (actual time=0.170..0.173 rows=1 loops=1)
  Index Cond: ((id >= '0005000'::text) AND (id < '0005001'::text))
  Filter: (id ~~ '0005000%'::text)
  Heap Fetches: 1
Total runtime: 0.229 ms
(5 rows)

Q2: EXPLAIN ANALYZE SELECT * from a where id >= '0005000' and id < '0005001';
-----------------
Index Only Scan using a_pkey on a  (cost=0.00..9.37 rows=1 width=8) (actual time=0.027..0.028 rows=1 loops=1)
  Index Cond: ((id >= '0005000'::text) AND (id < '0005001'::text))
  Heap Fetches: 1
Total runtime: 0.072 ms
(4 rows)

Problems:
* For Q1, the planner incorrectly estimates that there will be 100 rows; for Q2, it gives a correct estimate.
* My actual problem in production is that, because of the much larger estimate, the planner prefers a merge/hash join in a later stage of a complex query instead of a nested loop. The outcome is that the query took 10 seconds instead of 100 ms.

Observations:
* Q1 and Q2 should be logically identical, and PostgreSQL treats them as such (see the Index Cond in the EXPLAIN ANALYZE output).
* The EXPLAIN ANALYZE output shows that Q1 not only has the same index condition as Q2 but also an additional filter, yet surprisingly it is estimated to return more rows than Q2.

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
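[Editor's note: the prefix-to-range rewrite visible in the Index Cond above (id LIKE '0005000%' becoming id >= '0005000' AND id < '0005001') can be sketched as follows. This is an illustrative Python sketch of the general idea only, not PostgreSQL's actual implementation, and `like_prefix_to_range` is a hypothetical helper name.]

```python
def like_prefix_to_range(pattern: str) -> tuple[str, str]:
    """Derive range bounds equivalent to a LIKE pattern that ends in '%'.

    Hypothetical sketch: the fixed prefix becomes the inclusive lower
    bound, and the exclusive upper bound is formed by incrementing the
    prefix's last character, so '0005000%' maps to
    ('0005000', '0005001').
    """
    # Only handle the simple case: a literal prefix followed by one '%'.
    assert pattern.endswith('%') and '%' not in pattern[:-1]
    prefix = pattern[:-1]
    # Exclusive upper bound: bump the last character of the prefix.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, upper

lo, hi = like_prefix_to_range('0005000%')
# lo == '0005000', hi == '0005001' -- matching the Index Cond in Q1/Q2
```

Note that the planner keeps the original LIKE clause as a residual Filter on top of this range (as seen in Q1's plan), which is why the two plans differ only in their row estimates, not in the index condition.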