Hi,

At a customer we came across a curious plan (see attached testcase).

Given the testcase we see that the outer semi join tries to join the outer with the inner table id columns, even though the middle table id column is also there. Is this expected behavior?

The reason i'm asking is two-fold:
- the inner hash table now is bigger than i'd expect and has columns that you would normally not select on. - the middle join now projects the inner as result, which is quite suprising and seems invalid from a SQL standpoint.

Plan:
 Finalize Aggregate
   Output: count(*)
   ->  Gather
         Output: (PARTIAL count(*))
         Workers Planned: 4
         ->  Partial Aggregate
               Output: PARTIAL count(*)
               ->  Parallel Hash Semi Join
                     Hash Cond: (_outer.id3 = _inner.id2)
                     ->  Parallel Seq Scan on public._outer
                           Output: _outer.id3, _outer.extra1
                     ->  Parallel Hash
                           Output: middle.id1, _inner.id2
                           ->  Parallel Hash Semi Join
                                 Output: middle.id1, _inner.id2
                                 Hash Cond: (middle.id1 = _inner.id2)
                                 ->  Parallel Seq Scan on public.middle
                                       Output: middle.id1
                                 ->  Parallel Hash
                                       Output: _inner.id2
-> Parallel Seq Scan on public._inner
                                             Output: _inner.id2

Kind regards,
Luc
Swarm64

Attachment: testcase.sql
Description: application/sql

Reply via email to