On 19/5/2025 15:21, Robert Haas wrote:
On Thu, May 15, 2025 at 9:03 AM Andrei Lepikhov <lepi...@gmail.com> wrote:
2. IncrementalSort is not always more effective - it depends on the
column's number of groups. In my experience, a non-cost-based decision
one day meets the problematic case, and the people who stick with it are
much more confused than in the case when planner decision connected to
the costings - they trust the cost model or the cost model tuned by GUCs.

I wonder if this could be fixed in nodeIncrementalSort.c. I think it's
a problem to rely on planner estimates because planner estimates of
the # of groups are quite unreliable. But at runtime, we can notice
whether the groups are small or large and adjust the algorithm
accordingly. In fact, it looks like we already have some logic for
that:
Yes, the cost estimation issue is the source of my concern. Postgres estimation of the number of groups is still not ideal [1]. The cost sort model doesn't take into account the number of columns [2] - perhaps something else. Therefore, if IncrementalSort performs worse in some cases, this change may result in an increase in user reports.

However, comparing Sort and IncrementalSort (see the attachment), I recall that IncrementalSort surpasses plain Sort in both corner cases: tiny groups (beating Sort in terms of memory consumption and smaller sorting sets) and massive groups (abbreviated keys reduce the number of compared columns). And it is not easy to invent the worst case where the Sort node wins.

The only aspect I can't stop thinking about is the cost balance: sometimes the planner prefers Sort+SeqScan to IncrementalSort+IndexScan (See explains in the attachment). A non-cost-based choice of IncrementalSort may result in an Append path with a higher cost, which can displace it from the path list and lead to another, less effective choice, such as a less effective IndexScan.
However, I can't support this opinion with any real examples.

In toto, IMO it makes sense to commit this feature now and see how it behaves.

[1] https://www.postgresql.org/message-id/ba0edc53-4b1f-4c67-92d1-29aeddb36...@gmail.com [2] https://www.postgresql.org/message-id/8742aaa8-9519-4a1f-91bd-364aec65f5cf%40gmail.com


--
regards, Andrei Lepikhov

Attachment: incremental-sort-bench.sql
Description: application/sql

Reply via email to