[ forgot to respond to this earlier, sorry ]

Robert Haas <robertmh...@gmail.com> writes:
> On a related note, I have some vague unease about planning A SEMI JOIN
> B as A INNER JOIN (UNIQUE B), as make_one_rel currently attempts to
> do.  For a merge join or nested loop, I don't see how this can ever be
> a win over teaching the executor to just not rescan B.  For a hash
> join, it can be a win if B turns out to have duplicates, but then
> again you could also just teach the executor to skip the insertion of
> the duplicate into the table in the first place (it has to hash 'em
> anyway...).  I think maybe I'm not understanding something about the
> logic here.

The case where this is a win is where B is small (say a few rows) and
not unique, and A is large, and there's a relevant index on A.  Then
considering this join approach lets us produce a plan that looks like

        NestLoop
                HashAggregate (or GroupAggregate)
                        Scan B
                IndexScan A
                        Index Condition : A.x = B.y

Every other possible plan for this join involves reading all of A.

If B produces too many rows for the nestloop indexscan to be a win,
then one of the other join approaches will beat out this one in the
cost comparisons.

> One thing I notice is that src/backend/optimizer/README should
> probably be updated with the rules for commuting SEMI and ANTI joins;
> it currently only mentions INNER, LEFT, RIGHT, and FULL.

Yeah, I noticed that too.  How embarrassing.  Will fix it as part of
the patch, which I hope to start on tomorrow.

                        regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to