I define a table user_ranks as such: CREATE TABLE user_ranks ( id INTEGER GENERATED ALWAYS AS IDENTITY PRIMARY KEY, rank INTEGER NOT NULL, CONSTRAINT "by (rank, id)" UNIQUE (rank, id) );
INSERT INTO user_ranks (user_id, rank) SELECT generate_series(1, 10000), generate_series(1, 10000); Here's a query I'd like to optimize: explain (analyze,verbose) SELECT t3_0."id" AS "id", t3_0."rank" AS "rank" FROM LATERAL ( SELECT t4_0."rank" AS "rank" FROM user_ranks AS t4_0 WHERE (t4_0."id" = 4732455) ) AS t3_1 INNER JOIN user_ranks AS t3_0 ON true WHERE ( ((t3_0."rank", t3_0."id") <= (t3_1."rank", 4732455)) AND true ) ORDER BY t3_0."rank" DESC, t3_0."id" DESC LIMIT 10 It compiles to the following plan: Limit (cost=0.56..250.94 rows=10 width=12) (actual time=8.078..8.078 rows=1 loops=1) Output: t3_0.id, t3_0.rank -> Nested Loop (cost=0.56..41763.27 rows=1668 width=12) (actual time=8.075..8.076 rows=1 loops=1) Output: t3_0.id, t3_0.rank Inner Unique: true Join Filter: (ROW(t3_0.rank, t3_0.id) <= ROW(t4_0.rank, 4732455)) Rows Removed by Join Filter: 5002 -> Index Only Scan Backward using "by (rank,id)" on public.user_ranks t3_0 (cost=0.28..163.33 rows=5003 width=12) (actual time=0.023..0.638 rows=5003 loops=1) Output: t3_0.rank, t3_0.id Heap Fetches: 0 -> Index Scan using "by id" on public.user_ranks t4_0 (cost=0.28..8.30 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=5003) Output: t4_0.id, t4_0.rating, t4_0.rank Index Cond: (t4_0.id = 4732455) As you can see, there are a lot of rows returned by t3_0, which are then filtered by Join Filter. But it would have been better if instead of the filter, the t3_0 table would have an Index Cond. Similar to how it happens when a correlated subquery is used (or a CTE) explain (analyze,verbose) SELECT t3_0."id" AS "id", t3_0."rank" AS "rank" FROM user_ranks AS t3_0 WHERE ( ((t3_0."rank", t3_0."id") <= ( SELECT t4_0."rank" AS "rank", t4_0."id" AS "id" FROM user_ranks AS t4_0 WHERE (t4_0."id" = 4732455) )) AND true ) ORDER BY t3_0."rank" DESC, t3_0."id" DESC LIMIT 10 Limit (cost=8.58..8.95 rows=10 width=12) (actual time=0.062..0.064 rows=1 loops=1) Output: t3_0.id, t3_0.rank InitPlan 1 (returns $0,$1) -> Index Scan using "by id" on public.user_ranks t4_0 (cost=0.28..8.30 rows=1 width=12) (actual time=0.024..0.025 rows=1 loops=1) Output: t4_0.rank, t4_0.id Index Cond: (t4_0.id = 4732455) -> Index Only Scan Backward using "by (rank,id)" on public.user_ranks t3_0 (cost=0.28..61.47 rows=1668 width=12) (actual time=0.061..0.062 rows=1 loops=1) Output: t3_0.id, t3_0.rank Index Cond: (ROW(t3_0.rank, t3_0.id) <= ROW($0, $1)) Heap Fetches: 0 I'm an opposite of a PostgreSQL expert, but it was surprising to me to see that a correlated subquery behaves better than a join. Is this normal? Is it something worth fixing/easy to fix? Sincerely, Bakhtiyar