Hi, On 2026-04-07 16:16:54 -0400, Tom Lane wrote: > That extra plan node has nonzero cost that I don't think you're > accounting for. It'll still be a win if enough data volume is removed > from the Sort step, but I don't see any consideration of how much > we're actually saving before deciding to add the projection step.
A different way (compared to the heuristics you were subsequently talking about) to deal with that could be to add projection support to sort's output, I guess. That would have a considerably smaller cost than a separate node. It'd add a tiny bit of cost (an if checking for projection) for the rather common of a sort without a projection. Although that'd not be too hard to get rid of by generating specialized ExecSort() variants for the different cases, like now done for ExecSeqScan, if it turned out to matter. However it'd probably would still not be guaranteed faster than evaluating below the sort, due to a) startup cost of the projection machinery b) potentially needing to deform during the expression's inputs during the projection in the sort (or above) I don't know if there are realistic cases where b) matters? You'd have to have nodes above the sort that would require the projection to happen at the sort (or an intermediary level) but then filter out most rows to avoid needing to deform anyway? In all the realistic cases I can think of the expression evaluation should then be pulled up above that filtering out of rows? I don't know if a) is really ever significant enough to matter compared to the cost of sorting. It sure shows up in a sequential scan, but that has an order of magnitude or three lower per tuple cost. Are there cases where something like Chengpeng's logic would still trigger a result node being injected, if sort had projection capability? > So I think we need some sort of gating rule, whereby we only postpone > these expressions if (a) there was already a reason to add a > projection or (b) we can make some cost-based or at least heuristic > estimate that says we'll cut the sort data volume significantly. This reminds me: The heuristics around the cost of expression evaluation seem like they could be improved a fair bit by taking into account the cost of having to deform the input columns. There's a huge difference between func1(col1), func2(col1) and func1(col1), func2(col2) and func1(col30) Unless func* are particularly expensive, the cost will be increasingly dominated by tuple deforming. I think this is one thing that sometimes contributes to us wrongly choosing sequential scans with a qual over an index scans that needs to evaluate far fewer rows, because we just take the operator costs into account, not the tuple deforming it requires. Greetings, Andres Freund
