On 12 March 2016 at 11:43, Tom Lane <t...@sss.pgh.pa.us> wrote: > I wrote: >> I wondered why, instead of inventing an extra semantics-modifying flag, >> we couldn't just change the jointype to *be* JOIN_SEMI when we've >> discovered that the inner side is unique. > > BTW, to clarify: I'm not imagining that we'd make this change in the > query jointree, as for example prepjointree.c might do. That would appear > to add join order constraints, which we don't want. But I'm thinking that > at the instant where we form a join Path, we could change the Path's > jointype to be JOIN_SEMI or JOIN_SEMI_OUTER if we're able to prove the > inner side unique, rather than annotating the Path with a separate flag. > Then that representation is what propagates forward.
I've attached a patch which implements this, although I did call the new join type JOIN_LEFT_UNIQUE rather than JOIN_SEMI_OUTER. I'm not all that confident that I've not added handling for JOIN_LEFT_UNIQUE in places where it's not possible to get that join type, I did leave out a good number of places where I know it's not possible. I'm also not sure with too much certainty that I've got all cases correct, but the regression tests pass. The patch is more intended for assisting discussion than as a ready to commit patch. > It seems like the major intellectual complexity here is to figure out > how to detect inner-side-unique at reasonable cost. I see that for > LEFT joins you're caching that in the SpecialJoinInfos, which is probably > fine. But for INNER joins it looks like you're just doing it over again > for every candidate join, and that seems mighty expensive. I have a rough idea for this, but I need to think of it a bit more to make sure it's bulletproof. I also just noticed (since it's been a while since I wrote the patch) that in add_paths_to_joinrel() that innerrel can naturally be a joinrel too, and we can fail to find uniqueness in that joinrel. I think it should be possible to analyse the join rel too and search for a base rel which supports the distinctness, and then ensure none of the other rels which make up the join rel can cause tuple duplication of that rel. But this just causes missed optimisation opportunities. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
unique_joins_82d2a07_2016-03-14.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers