Hi,
At a customer we came across a curious plan (see attached testcase).
Given the testcase we see that the outer semi join tries to join the
outer with the inner table id columns, even though the middle table id
column is also there. Is this expected behavior?
The reason i'm asking is two-fold:
- the inner hash table now is bigger than i'd expect and has columns
that you would normally not select on.
- the middle join now projects the inner as result, which is quite
suprising and seems invalid from a SQL standpoint.
Plan:
Finalize Aggregate
Output: count(*)
-> Gather
Output: (PARTIAL count(*))
Workers Planned: 4
-> Partial Aggregate
Output: PARTIAL count(*)
-> Parallel Hash Semi Join
Hash Cond: (_outer.id3 = _inner.id2)
-> Parallel Seq Scan on public._outer
Output: _outer.id3, _outer.extra1
-> Parallel Hash
Output: middle.id1, _inner.id2
-> Parallel Hash Semi Join
Output: middle.id1, _inner.id2
Hash Cond: (middle.id1 = _inner.id2)
-> Parallel Seq Scan on public.middle
Output: middle.id1
-> Parallel Hash
Output: _inner.id2
-> Parallel Seq Scan on
public._inner
Output: _inner.id2
Kind regards,
Luc
Swarm64
testcase.sql
Description: application/sql