Hi, At a customer we came across a curious plan (see attached testcase).
Given the testcase we see that the outer semi join tries to join the outer with the inner table id columns, even though the middle table id column is also there. Is this expected behavior?
The reason i'm asking is two-fold:- the inner hash table now is bigger than i'd expect and has columns that you would normally not select on. - the middle join now projects the inner as result, which is quite suprising and seems invalid from a SQL standpoint.
Plan: Finalize Aggregate Output: count(*) -> Gather Output: (PARTIAL count(*)) Workers Planned: 4 -> Partial Aggregate Output: PARTIAL count(*) -> Parallel Hash Semi Join Hash Cond: (_outer.id3 = _inner.id2) -> Parallel Seq Scan on public._outer Output: _outer.id3, _outer.extra1 -> Parallel Hash Output: middle.id1, _inner.id2 -> Parallel Hash Semi Join Output: middle.id1, _inner.id2 Hash Cond: (middle.id1 = _inner.id2) -> Parallel Seq Scan on public.middle Output: middle.id1 -> Parallel Hash Output: _inner.id2-> Parallel Seq Scan on public._inner
Output: _inner.id2 Kind regards, Luc Swarm64
testcase.sql
Description: application/sql