On Tue, 9 Apr 2024 at 09:52, David Zhang <david.zh...@highgo.ca> wrote: > However, when executing SELECT min(ctid) and max(ctid), it performs a > Seq Scan, which can be slow for a large table. Is there a way to > retrieve the minimum and maximum ctid other than using the system > functions min() and max()?
Finding the exact ctid seems overkill for what you need. Why you could just find the maximum block with: N = pg_relation_size('name_of_your_table'::regclass) / current_Setting('block_size')::int; and do WHERE ctid < '(N,1)'; If we wanted to optimise this in PostgreSQL, the way to do it would be, around set_plain_rel_pathlist(), check if the relation's ctid is a required PathKey by the same means as create_index_paths() does, then if found, create another seqscan path without synchronize_seqscans * and tag that with the ctid PathKey sending the scan direction according to the PathKey direction. nulls_first does not matter since ctid cannot be NULL. Min(ctid) query should be able to make use of this as the planner should rewrite those to subqueries with a ORDER BY ctid LIMIT 1. * We'd need to invent an actual Path type for SeqScanPath as I see create_seqscan_path() just uses the base struct Path. synchronize_seqscans would have to become a property of that new Path type and it would need to be carried forward into the plan and looked at in the executor so that we always start a scan at the first or last block. Unsure if such a feature is worthwhile. I think maybe not for just min(ctid)/max(ctid). However, there could be other reasons, such as the transform OR to UNION stuff that Tom worked on a few years ago. That needed to eliminate duplicate rows that matched both OR branches and that was done using ctid. David