We just run into a very simple query that the planner does much worse on than we thought it would (in production the table in question is ~ 100 GB). It surprised us given the planner is generally quite good, so I thought I share our surprise
Setup: postgres_prod@proddb_testing=# select version();[1] version ──────────────────────────────────────────────────────────────────────────────────────────────────────────────── PostgreSQL 9.2.16 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit (1 row) Time: 69.246 ms postgres_prod@proddb_testing=# create table toy_data3 (the_date date, i int); CREATE TABLE Time: 67.096 ms postgres_prod@proddb_testing=# insert into toy_data3 (select current_date-(s.idx/1000), s.idx from generate_series(1,1000000) as s(idx)); INSERT 0 1000000 Time: 1617.483 ms postgres_prod@proddb_testing=# create index toy_data_date3 on toy_data3(the_date); CREATE INDEX Time: 660.166 ms postgres_prod@proddb_testing=# analyze toy_data3; ANALYZE Time: 294.984 ms The bad behavior: postgres_prod@proddb_testing=# explain analyze select * from ( select td1.the_date, td1.i from toy_data3 td1, toy_data3 td2 where td1.the_date = td2.the_date and td1.i = td2.i ) foo where the_date between current_date and current_date; QUERY PLAN ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── Hash Join (cost=55.49..21980.50 rows=1 width=8) (actual time=0.336..179.374 rows=999 loops=1) Hash Cond: ((td2.the_date = td1.the_date) AND (td2.i = td1.i)) -> Seq Scan on toy_data3 td2 (cost=0.00..14425.00 rows=1000000 width=8) (actual time=0.007..72.510 rows=1000000 lo -> Hash (cost=40.44..40.44 rows=1003 width=8) (actual time=0.321..0.321 rows=999 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 40kB -> Index Scan using toy_data_date3 on toy_data3 td1 (cost=0.01..40.44 rows=1003 width=8) (actual time=0.018. Index Cond: ((the_date >= ('now'::cstring)::date) AND (the_date <= ('now'::cstring)::date)) Total runtime: 179.440 ms (8 rows) Time: 246.094 ms Notice the red. Which is sad because one would like it to realize that it could propagate the index constraint onto td2. That is on both sides of the join do the green. As it does correctly when one explicitly uses equality (bold below) (but of course we sometimes have multiple day ranges in production and we only used a single date range above to make it extra interesting for the planner to NOT do a seqscan): postgres_prod@proddb_testing=# explain analyze select * from ( select td1.the_date, td1.i from toy_data3 td1, toy_data3 td2 where td1.the_date = td2.the_date and td1.i = td2.i ) foo where *the_date = current_date*; QUERY PLAN ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── Hash Join (cost=50.47..92.17 rows=1 width=8) (actual time=0.300..0.652 rows=999 loops=1) Hash Cond: (td1.i = td2.i) -> Index Scan using toy_data_date3 on toy_data3 td1 (cost=0.00..37.93 rows=1003 width=8) (actual time=0.023..0.169 Index Cond: (the_date = ('now'::cstring)::date) -> Hash (cost=37.93..37.93 rows=1003 width=8) (actual time=0.270..0.270 rows=999 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 40kB -> Index Scan using toy_data_date3 on toy_data3 td2 (cost=0.00..37.93 rows=1003 width=8) (actual time=0.007. Index Cond: (the_date = ('now'::cstring)::date) Total runtime: 0.713 ms (9 rows) Time: 66.904 ms Cheers, Bene