On Sat, May 18, 2013 at 1:47 AM, Jeff Amiel <becauseimj...@yahoo.com> wrote: > > > > > ----- Original Message ----- > From: Amit Langote <amitlangot...@gmail.com> > To: Jeff Amiel <becauseimj...@yahoo.com> > Cc: "pgsql-general@postgresql.org" <pgsql-general@postgresql.org> > Sent: Friday, May 17, 2013 11:37 AM > Subject: Re: [GENERAL] Why does row estimation on nested loop make no sense > to me > > On Sat, May 18, 2013 at 1:25 AM, Jeff Amiel <becauseimj...@yahoo.com> wrote: >> On most nested loops that I do explain/explain analyze on, the row >> estimation for the nested-loop itself is a product of the inner nodes of the >> nested loop. >> However in this case, I am stumped! >> >> explain >> select era.child_entity from entity_rel era join user_entity ue on >> ue.entity_id = era.parent_entity and ue.user_id=12345 >> >> Nested Loop (cost=0.00..2903.37 rows=29107 width=4) >> -> Index Only Scan using entity_pk on user_entity ue (cost=0.00..62.68 >> rows=2 width=4) >> Index Cond: (user_id = 10954) >> -> Index Scan using rel_parent on entity_rel era (cost=0.00..1261.85 >> rows=317 width=8) >> Index Cond: (parent_entity = ue.entity_id) >> >> >> How can the estimated number of rows for the nested loop node EXCEED the >> product of the 2 row estimates of the tables being joined? >> Not only does it exceed it - but it is orders of magnitude greater. >> >> Am I missing something obvious here? I an see the nested loop row estimate >> being LESS but certainly not more. >> > >> Can you also post the output of explain analyze <your-query>? > > I'm not worried about performance (per se) but the row estimation issue which > propagates up as part of a bigger query. But here ya go: > > explain analyze > select era.child_entity from entity_rel era join user_entity ue on > ue.entity_id = era.parent_entity and ue.user_id=12345 > > Nested Loop (cost=0.00..2903.37 rows=29107 width=4) (actual > time=0.028..0.274 rows=201 loops=1) > -> Index Only Scan using entity_pk on user_entity ue (cost=0.00..62.68 > rows=2 width=4) (actual time=0.011..0.012 rows=1 loops=1) > Index Cond: (user_id = 12345) > Heap Fetches: 1 > -> Index Scan using rel_parent on entity_rel era (cost=0.00..1261.85 > rows=317 width=8) (actual time=0.013..0.164 rows=201 loops=1) > Index Cond: (parent_entity = ue.entity_id) > Total runtime: 0.361 ms
Have you tried analyze (it's probably a case of insufficient/outdated statistics to planner's disposal) or probably consider changing default_statistics_target? -- Amit Langote -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general