Thanks, Tomas!

> I know, but it makes them harder to read for people. If you want people
> to respond it's generally a good idea to make it easy to understand the
> question. Don't make them waste their time - they'll just skip the
> message entirely.

Fair point.


> So, the optimizer clearly believes the subquery case has cost 9.15,
> while the inner join case costs 2.15. So it believes the plan is
> "cheaper" than the subquery. So even if it knew how to do the
> transformation / build the other plan (which I'm not sure it can), it
> probably wouldn't do it.

> AFAICS there's no chance to make this bit smarter until the estimates
> get much better to reality.

Got it. Thanks. I guess we'll have to emit correlated subqueries/CTEs.

Sincerely,
Bakhtiyar

On Wed, Jun 21, 2023 at 12:58 PM Tomas Vondra <tomas.von...@enterprisedb.com>
wrote:

> On 6/21/23 20:37, Bəxtiyar Neyman wrote:
> > Thanks Tomas for the lengthy write-up!
> >
> > Pardon the noise in the queries (LATERAL, AND true etc): they were
> > autogenerated by the library we wrote.
> >
>
> I know, but it makes them harder to read for people. If you want people
> to respond it's generally a good idea to make it easy to understand the
> question. Don't make them waste their time - they'll just skip the
> message entirely.
>
> >> Because those queries are not doing the same thing. In the first query
> >> you sort by t3_0 columns, while the "id = 4732455" condition is on the
> >> other table. And so it can't use the index scan for sorting.
> >>
> >> While in the second query it can do that, and it doesn't need to do the
> >> explicit sort (which needs to fetch all the rows etc.).
> >
> > Let me try to explain what both of my queries do:
> > 1) Get the rank of the user using its id (id = 4732455 in this example,
> > but it could have been one that exists, e.g. id = 500). This is LATERAL
> > t3_1 in the first query and subquery in the WHERE clause of the second
> > query.
> > 2) Using that rank, get the next 10 users by rank. This is t3_0.
> >
> > Thus I can't just change the first query to "ORDER BY t3_1."rank" DESC,
> > t3_1."id" DESC" as you suggest, because then the order of returned rows
> > will not be guaranteed. In fact, such a clause will have no effect
> > because there is going to be at most one row supplied by t3_1 anyway.
> >
>
> Ah, OK. I got this wrong.
>
> > My question thus still stands. The planner knows that t3_1 has at most
> > one row, and it knows that t3_0 can produce up to 5000 rows. Yet, it
> > doesn't figure out that it could have lowered the Join Filter condition
> > from the first plan as an Index Cond of the Index Scan of t3_1. Is there
> > a fundamental reason for this, or is this something worth improving in
> > the planner?
> >
>
> As I tried to explain before, I don't think the problem is in the
> planner not being able to do this transformation, but more likely in not
> being able to cost it correctly.
>
> Consider this (with 1M rows in the user_ranks table):
>
> 1) subquery case
> =================
>
>      Limit  (cost=8.87..9.15 rows=10 width=8) (actual time=0.032..0.037
> rows=10 loops=1)
>    Output: t3_0.id, t3_0.rank
>    InitPlan 1 (returns $0,$1)
>      ->  Index Scan using user_ranks_pkey on public.user_ranks t4_0
> (cost=0.42..8.44 rows=1 width=8) (actual time=0.017..0.019 rows=1 loops=1)
>            Output: t4_0.rank, t4_0.id
>            Index Cond: (t4_0.id = 333333)
>    ->  Index Only Scan Backward using "by (rank, id)" on
> public.user_ranks t3_0  (cost=0.42..9493.75 rows=333333 width=8) (actual
> time=0.031..0.033 rows=10 loops=1)
>          Output: t3_0.id, t3_0.rank
>          Index Cond: (ROW(t3_0.rank, t3_0.id) <= ROW($0, $1))
>          Heap Fetches: 0
>  Planning Time: 0.072 ms
>  Execution Time: 0.055 ms
> (12 rows)
>
>
> 2) join
> =======
>
>  Limit  (cost=0.85..2.15 rows=10 width=8) (actual time=464.662..464.672
> rows=10 loops=1)
>    Output: t3_0.id, t3_0.rank
>    ->  Nested Loop  (cost=0.85..43488.87 rows=333333 width=8) (actual
> time=464.660..464.667 rows=10 loops=1)
>          Output: t3_0.id, t3_0.rank
>          Inner Unique: true
>          Join Filter: (ROW(t3_0.rank, t3_0.id) <= ROW(t4_0.rank, t4_0.id))
>          Rows Removed by Join Filter: 666667
>          ->  Index Only Scan Backward using "by (rank, id)" on
> public.user_ranks t3_0  (cost=0.42..25980.42 rows=1000000 width=8)
> (actual time=0.015..93.703 rows=666677 loops=1)
>                Output: t3_0.rank, t3_0.id
>                Heap Fetches: 0
>          ->  Materialize  (cost=0.42..8.45 rows=1 width=8) (actual
> time=0.000..0.000 rows=1 loops=666677)
>                Output: t4_0.rank, t4_0.id
>                ->  Index Scan using user_ranks_pkey on public.user_ranks
> t4_0  (cost=0.42..8.44 rows=1 width=8) (actual time=0.010..0.011 rows=1
> loops=1)
>                      Output: t4_0.rank, t4_0.id
>                      Index Cond: (t4_0.id = 333333)
>  Planning Time: 0.092 ms
>  Execution Time: 464.696 ms
> (17 rows)
>
>
> 3) join (with LEFT JOIN)
> ========================
>
>  Limit  (cost=20038.73..20038.76 rows=10 width=8) (actual
> time=180.714..180.720 rows=10 loops=1)
>    Output: t3_0.id, t3_0.rank
>    ->  Sort  (cost=20038.73..20872.06 rows=333333 width=8) (actual
> time=180.712..180.715 rows=10 loops=1)
>          Output: t3_0.id, t3_0.rank
>          Sort Key: t3_0.rank DESC, t3_0.id DESC
>          Sort Method: top-N heapsort  Memory: 26kB
>          ->  Nested Loop Left Join  (cost=0.85..12835.52 rows=333333
> width=8) (actual time=0.033..122.000 rows=333333 loops=1)
>                Output: t3_0.id, t3_0.rank
>                ->  Index Scan using user_ranks_pkey on public.user_ranks
> t4_0  (cost=0.42..8.44 rows=1 width=8) (actual time=0.018..0.020 rows=1
> loops=1)
>                      Output: t4_0.id, t4_0.rank
>                      Index Cond: (t4_0.id = 333333)
>                ->  Index Only Scan using "by (rank, id)" on
> public.user_ranks t3_0  (cost=0.42..9493.75 rows=333333 width=8) (actual
> time=0.013..49.759 rows=333333 loops=1)
>                      Output: t3_0.rank, t3_0.id
>                      Index Cond: (ROW(t3_0.rank, t3_0.id) <=
> ROW(t4_0.rank, t4_0.id))
>                      Heap Fetches: 0
>  Planning Time: 0.087 ms
>  Execution Time: 180.744 ms
> (17 rows)
>
>
> So, the optimizer clearly believes the subquery case has cost 9.15,
> while the inner join case costs 2.15. So it believes the plan is
> "cheaper" than the subquery. So even if it knew how to do the
> transformation / build the other plan (which I'm not sure it can), it
> probably wouldn't do it.
>
> OTOH if you rewrite it to a left join, it costs 20038.76 - way more than
> the inner join, but it's actually 2x faster.
>
>
> AFAICS there's no chance to make this bit smarter until the estimates
> get much better to reality.
>
>
> --
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

Reply via email to