On Mon, May 16, 2011 at 3:30 PM, Clemens Eisserer <linuxhi...@gmail.com>wrote:
> Hi, > > I have a quite complex, performance sensitive query in a system with a > few (7) joins: > select .... from t1 left join t2 .... WHERE id IN (select ....) > > For this query the planner evaluates the IN with a hash semi join last, > and all the joining is done by hash joins for all rows contained in t1. > > However when I specify the ids manually (IN (1, 2, 3, 4, 5) the > planner first does an index lookup on the primary key column id, > and subsequently does nested loop joins using an index on t2 - which > gives way better results. > > Is there any way to guide the planner to evaluate the IN condition > first, instead of last? > Why is the planner behaving this way? (postgresql 8.4.??) > > Thank you in advance, Clemens > > > Query plan with IN(select): > > Sort (cost=165.77..165.77 rows=2 width=16974) (actual > time=13.459..13.460 rows=2 loops=1) > Sort Key: this_.id > Sort Method: quicksort Memory: 26kB > -> Hash Semi Join (cost=123.09..165.76 rows=2 width=16974) > (actual time=12.741..13.432 rows=2 loops=1) > Hash Cond: (this_.id = kladdenent0_.id) > -> Hash Left Join (cost=119.17..160.90 rows=348 > width=16974) (actual time=8.765..13.104 rows=342 loops=1) > Hash Cond: (flugzeug2_.flugzeugtyp_id = flugzeugty3_.id) > -> Hash Left Join (cost=118.10..155.08 rows=348 > width=16454) (actual time=8.724..12.412 rows=342 loops=1) > Hash Cond: (flugzeug2_.zaehlertyp_id = bmintype4_.id) > -> Hash Left Join (cost=117.06..152.71 rows=348 > width=15934) (actual time=8.660..11.786 rows=342 loops=1) > Hash Cond: (this_.lehrerid = pilot5_.id) > -> Hash Left Join (cost=96.66..130.46 > rows=348 width=8912) (actual time=6.395..8.899 rows=342 loops=1) > Hash Cond: (this_.nachid = flugplatz6_.id) > -> Hash Left Join > (cost=93.89..122.90 rows=348 width=8370) (actual time=6.354..8.429 > rows=342 loops=1) > Hash Cond: (this_.flugzeugid = > flugzeug2_.id) > -> Hash Left Join > (cost=23.17..47.04 rows=348 width=7681) (actual time=1.992..3.374 > rows=342 loops=1) > Hash Cond: (this_.pilotid > = pilot7_.id) > -> Hash Left Join > (cost=2.78..22.04 rows=348 width=659) (actual time=0.044..0.548 > rows=342 loops=1) > Hash Cond: > (this_.vonid = flugplatz8_.id) > -> Seq Scan on > startkladde this_ (cost=0.00..14.48 rows=348 width=117) (actual > time=0.004..0.074 rows=342 loops=1) > -> Hash > (cost=1.79..1.79 rows=79 width=542) (actual time=0.032..0.032 rows=79 > loops=1) > -> Seq Scan > on flugplatz flugplatz8_ (cost=0.00..1.79 rows=79 width=542) (actual > time=0.003..0.010 rows=79 loops=1) > -> Hash > (cost=15.73..15.73 rows=373 width=7022) (actual time=1.938..1.938 > rows=375 loops=1) > -> Seq Scan on > pilot pilot7_ (cost=0.00..15.73 rows=373 width=7022) (actual > time=0.006..0.769 rows=375 loops=1) > -> Hash (cost=51.43..51.43 > rows=1543 width=689) (actual time=4.351..4.351 rows=1543 loops=1) > -> Seq Scan on flugzeug > flugzeug2_ (cost=0.00..51.43 rows=1543 width=689) (actual > time=0.006..1.615 rows=1543 loops=1) > -> Hash (cost=1.79..1.79 rows=79 > width=542) (actual time=0.031..0.031 rows=79 loops=1) > -> Seq Scan on flugplatz > flugplatz6_ (cost=0.00..1.79 rows=79 width=542) (actual > time=0.003..0.011 rows=79 loops=1) > -> Hash (cost=15.73..15.73 rows=373 > width=7022) (actual time=2.236..2.236 rows=375 loops=1) > -> Seq Scan on pilot pilot5_ > (cost=0.00..15.73 rows=373 width=7022) (actual time=0.005..0.781 > rows=375 loops=1) > -> Hash (cost=1.02..1.02 rows=2 width=520) > (actual time=0.005..0.005 rows=2 loops=1) > -> Seq Scan on bmintype bmintype4_ > (cost=0.00..1.02 rows=2 width=520) (actual time=0.003..0.004 rows=2 > loops=1) > -> Hash (cost=1.03..1.03 rows=3 width=520) (actual > time=0.004..0.004 rows=3 loops=1) > -> Seq Scan on flugzeugtype flugzeugty3_ > (cost=0.00..1.03 rows=3 width=520) (actual time=0.002..0.002 rows=3 > loops=1) > -> Hash (cost=3.90..3.90 rows=2 width=4) (actual > time=0.239..0.239 rows=2 loops=1) > -> Limit (cost=0.00..3.88 rows=2 width=4) (actual > time=0.202..0.236 rows=2 loops=1) > -> Index Scan using startkladde_pkey on > startkladde kladdenent0_ (cost=0.00..56.24 rows=29 width=4) (actual > time=0.200..0.233 rows=2 loops=1) > Filter: ((status > 0) OR (id = (-1))) > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance > In some cases, I've seen improved results when replacing the IN (...) with = ANY(ARRAY(...)). Dave