Hi Tom, v9.2.1 looks good! Aggregate (cost=420808.99..420809.00 rows=1 width=0) (actual time=147.345..147.345 rows=1 loops=1) -> Nested Loop Semi Join (cost=0.00..420786.71 rows=8914 width=0) (actual time=13.847..147.219 rows=894 loops=1) -> Index Scan using notes_retirement_date_project on notes a (cost=0.00..67959.22 rows=12535 width=4) (actual time=13.811..71.741 rows=12469 loops=1) Index Cond: (project_id = 114) -> Nested Loop Semi Join (cost=0.00..28.14 rows=1 width=4) (actual time=0.006..0.006 rows=0 loops=12469) -> Index Scan using note_links_note on note_links b (cost=0.00..12.37 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=12469) Index Cond: (note_id = a.id) Filter: ((retirement_date IS NULL) AND ((entity_type)::text = 'Version'::text)) Rows Removed by Filter: 1 -> Index Scan using versions_pkey on versions c (cost=0.00..15.76 rows=1 width=4) (actual time=0.003..0.003 rows=0 loops=11794) Index Cond: (id = b.entity_id) Filter: ((retirement_date IS NULL) AND ((code)::text ~~* '%comp%'::text)) Rows Removed by Filter: 1 Total runtime: 147.411 ms (14 rows)
On Fri, Sep 28, 2012 at 2:47 PM, Matt Daw <m...@shotgunsoftware.com> wrote: > Hi Tom, thank you very much. I'll load these tables onto a 9.2 instance > and report back. > > Matt > > > On Fri, Sep 28, 2012 at 2:44 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > >> Matt Daw <m...@shotgunsoftware.com> writes: >> > Howdy, I've been debugging a client's slow query today and I'm curious >> > about the query plan. It's picking a plan that hashes lots of rows from >> the >> > versions table (on v9.0.10)... >> >> > EXPLAIN ANALYZE >> > SELECT COUNT(*) FROM notes a WHERE >> > a.project_id = 114 AND >> > EXISTS ( >> > SELECT 1 FROM note_links b >> > WHERE >> > b.note_id = a.id AND >> > b.entity_type = 'Version' AND >> > EXISTS ( >> > SELECT 1 FROM versions c >> > WHERE >> > c.id = b.entity_id AND >> > c.code ILIKE '%comp%' AND >> > c.retirement_date IS NULL >> > ) AND >> > b.retirement_date IS NULL >> > ) >> >> I think the real problem here is that 9.0 is incapable of avoiding a >> full table scan on "note_links", which means it doesn't really have any >> better option than to do the inner EXISTS as a full-table semijoin. >> This is because it can't push a.id down through two levels of join, and >> because the semijoins don't commute, there's no way to get a.id into the >> scan of note_links to pull out only the useful rows. The hack with >> LIMIT avoids this problem by preventing the inner EXISTS from being >> treated as a full-fledged semijoin; but of course that hack leaves you >> vulnerable to very bad plans if the statistics are such that a nestloop >> join isn't the best bet for the inner EXISTS. >> >> The work I did for parameterized paths in 9.2 was intended to address >> exactly this type of scenario. I would be interested to know if 9.2 >> does this any better for you. >> >> regards, tom lane >> > >