On 7/1/24 16:17, Andrei Lepikhov wrote:
On 10/12/23 14:52, Andy Fan wrote:
Here the sublink can't be pulled up because of its reference to
the  LHS of left join, the original logic is that no matter the 'b.t in ..'
returns the true or false,  the rows in LHS will be returned.  If we
pull it up to LHS, some rows in LHS will be filtered out, which
breaks its original semantics.
Hi,
I spent some time trying to understand your sentence.
I mean the following case:

SELECT * FROM t1 LEFT JOIN t2
   ON t2.x IN (SELECT y FROM t3 WHERE t1.x=t3.x);

I read [1,2,3], but I am still unsure why it is impossible in the case of OUTER JOIN. By setting the LATERAL clause, we forbid any clauses from the RTE subquery to bubble up as a top-level clause and filter tuples from LHS, am I wrong? Does it need more research or you can show some case to support your opinion - why this type of transformation must be disallowed?

[1] https://www.postgresql.org/message-id/6531.1218473967%40sss.pgh.pa.us
[2] https://www.postgresql.org/message-id/BANLkTikGFtGnAaXVh5%3DntRdN%2B4w%2Br%3DNPuw%40mail.gmail.com
[3] https://www.vldb.org/conf/1992/P091.PDF


I delved into it a bit more. After reading [4,5] I invented query that is analogue of the query above, but with manually pulled-up sublink:

EXPLAIN (COSTS OFF)
SELECT * FROM t1 LEFT JOIN t2 JOIN LATERAL
(SELECT t1.x AS x1, y,x FROM t3) q1 ON (t2.x=q1.y AND q1.x1=q1.x) ON true;

And you can see the plan:

 Nested Loop Left Join
   ->  Seq Scan on t1
   ->  Hash Join
         Hash Cond: (t2.x = t3.y)
         ->  Seq Scan on t2
         ->  Hash
               ->  Seq Scan on t3
                     Filter: (t1.x = x)

Just for fun, I played with MSSQL Server and if I read its explain correctly, it also allows pulls-up sublink which mentions LHS:

-------------------------------------
Nested Loops(Left Outer Join, OUTER REFERENCES:(t1.x))
  Table Scan(OBJECT:(t1))
    Hash Match(Right Semi Join, HASH:(t3.y)=(t2.x),
                                RESIDUAL:(t2.x=t3.y))
      Table Scan(OBJECT:(t3), WHERE:(t1.x=t3.x))
    Table Scan(OBJECT:(t2))
-------------------------------------

(I cleaned MSSQL explain a little bit for clarity).
So, may we allow references to LHS in such sublink?

[4] https://www.postgresql.org/message-id/flat/15523.1372190410%40sss.pgh.pa.us [5] https://www.postgresql.org/message-id/20130617235236.GA1636@jeremyevans.local

--
regards, Andrei Lepikhov



Reply via email to