Re: Can JoinFilter condition be pushed down into IndexScan?

Tomas Vondra Wed, 21 Jun 2023 12:58:35 -0700

On 6/21/23 20:37, Bəxtiyar Neyman wrote:
> Thanks Tomas for the lengthy write-up!
> 
> Pardon the noise in the queries (LATERAL, AND true etc): they were
> autogenerated by the library we wrote.
>


I know, but it makes them harder to read for people. If you want people
to respond it's generally a good idea to make it easy to understand the
question. Don't make them waste their time - they'll just skip the
message entirely.

>> Because those queries are not doing the same thing. In the first query
>> you sort by t3_0 columns, while the "id = 4732455" condition is on the
>> other table. And so it can't use the index scan for sorting.
>>
>> While in the second query it can do that, and it doesn't need to do the
>> explicit sort (which needs to fetch all the rows etc.).
> 
> Let me try to explain what both of my queries do:
> 1) Get the rank of the user using its id (id = 4732455 in this example,
> but it could have been one that exists, e.g. id = 500). This is LATERAL
> t3_1 in the first query and subquery in the WHERE clause of the second
> query.
> 2) Using that rank, get the next 10 users by rank. This is t3_0.
> 
> Thus I can't just change the first query to "ORDER BY t3_1."rank" DESC,
> t3_1."id" DESC" as you suggest, because then the order of returned rows
> will not be guaranteed. In fact, such a clause will have no effect
> because there is going to be at most one row supplied by t3_1 anyway.
> 

Ah, OK. I got this wrong.

> My question thus still stands. The planner knows that t3_1 has at most
> one row, and it knows that t3_0 can produce up to 5000 rows. Yet, it
> doesn't figure out that it could have lowered the Join Filter condition
> from the first plan as an Index Cond of the Index Scan of t3_1. Is there
> a fundamental reason for this, or is this something worth improving in
> the planner?
> 

As I tried to explain before, I don't think the problem is in the
planner not being able to do this transformation, but more likely in not
being able to cost it correctly.

Consider this (with 1M rows in the user_ranks table):

1) subquery case
=================

     Limit  (cost=8.87..9.15 rows=10 width=8) (actual time=0.032..0.037
rows=10 loops=1)
   Output: t3_0.id, t3_0.rank
   InitPlan 1 (returns $0,$1)
     ->  Index Scan using user_ranks_pkey on public.user_ranks t4_0
(cost=0.42..8.44 rows=1 width=8) (actual time=0.017..0.019 rows=1 loops=1)
           Output: t4_0.rank, t4_0.id
           Index Cond: (t4_0.id = 333333)
   ->  Index Only Scan Backward using "by (rank, id)" on
public.user_ranks t3_0  (cost=0.42..9493.75 rows=333333 width=8) (actual
time=0.031..0.033 rows=10 loops=1)
         Output: t3_0.id, t3_0.rank
         Index Cond: (ROW(t3_0.rank, t3_0.id) <= ROW($0, $1))
         Heap Fetches: 0
 Planning Time: 0.072 ms
 Execution Time: 0.055 ms
(12 rows)


2) join
=======

 Limit  (cost=0.85..2.15 rows=10 width=8) (actual time=464.662..464.672
rows=10 loops=1)
   Output: t3_0.id, t3_0.rank
   ->  Nested Loop  (cost=0.85..43488.87 rows=333333 width=8) (actual
time=464.660..464.667 rows=10 loops=1)
         Output: t3_0.id, t3_0.rank
         Inner Unique: true
         Join Filter: (ROW(t3_0.rank, t3_0.id) <= ROW(t4_0.rank, t4_0.id))
         Rows Removed by Join Filter: 666667
         ->  Index Only Scan Backward using "by (rank, id)" on
public.user_ranks t3_0  (cost=0.42..25980.42 rows=1000000 width=8)
(actual time=0.015..93.703 rows=666677 loops=1)
               Output: t3_0.rank, t3_0.id
               Heap Fetches: 0
         ->  Materialize  (cost=0.42..8.45 rows=1 width=8) (actual
time=0.000..0.000 rows=1 loops=666677)
               Output: t4_0.rank, t4_0.id
               ->  Index Scan using user_ranks_pkey on public.user_ranks
t4_0  (cost=0.42..8.44 rows=1 width=8) (actual time=0.010..0.011 rows=1
loops=1)
                     Output: t4_0.rank, t4_0.id
                     Index Cond: (t4_0.id = 333333)
 Planning Time: 0.092 ms
 Execution Time: 464.696 ms
(17 rows)


3) join (with LEFT JOIN)
========================

 Limit  (cost=20038.73..20038.76 rows=10 width=8) (actual
time=180.714..180.720 rows=10 loops=1)
   Output: t3_0.id, t3_0.rank
   ->  Sort  (cost=20038.73..20872.06 rows=333333 width=8) (actual
time=180.712..180.715 rows=10 loops=1)
         Output: t3_0.id, t3_0.rank
         Sort Key: t3_0.rank DESC, t3_0.id DESC
         Sort Method: top-N heapsort  Memory: 26kB
         ->  Nested Loop Left Join  (cost=0.85..12835.52 rows=333333
width=8) (actual time=0.033..122.000 rows=333333 loops=1)
               Output: t3_0.id, t3_0.rank
               ->  Index Scan using user_ranks_pkey on public.user_ranks
t4_0  (cost=0.42..8.44 rows=1 width=8) (actual time=0.018..0.020 rows=1
loops=1)
                     Output: t4_0.id, t4_0.rank
                     Index Cond: (t4_0.id = 333333)
               ->  Index Only Scan using "by (rank, id)" on
public.user_ranks t3_0  (cost=0.42..9493.75 rows=333333 width=8) (actual
time=0.013..49.759 rows=333333 loops=1)
                     Output: t3_0.rank, t3_0.id
                     Index Cond: (ROW(t3_0.rank, t3_0.id) <=
ROW(t4_0.rank, t4_0.id))
                     Heap Fetches: 0
 Planning Time: 0.087 ms
 Execution Time: 180.744 ms
(17 rows)


So, the optimizer clearly believes the subquery case has cost 9.15,
while the inner join case costs 2.15. So it believes the plan is
"cheaper" than the subquery. So even if it knew how to do the
transformation / build the other plan (which I'm not sure it can), it
probably wouldn't do it.

OTOH if you rewrite it to a left join, it costs 20038.76 - way more than
the inner join, but it's actually 2x faster.


AFAICS there's no chance to make this bit smarter until the estimates
get much better to reality.


-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Can JoinFilter condition be pushed down into IndexScan?

Reply via email to