On 19/01/23 08:58, David Rowley wrote:

> The problem is that the next item looked at is 1 and the value 2 is
> skipped.

Okay, I see the issue.

> I think you are misinterpreting the results, but the main point
> remains - it's slower.  The explain analyze timing shows the time
> between outputting the first row and the last row.  For sort,
> there's a lot of work that needs to be done before you output the
> first row.

Okay, got it.
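
To make sure I am reading it right, here is a minimal check (the
table name tab(a, b, c, d) is my assumption, not from the thread):

-- For a Sort node, the first number in "actual time=X..Y" already
-- includes sorting the entire input, because Sort cannot emit any
-- row before the full sort finishes; Y - X is only the cost of
-- handing rows up to the parent node.
EXPLAIN (ANALYZE, TIMING ON)
SELECT * FROM tab ORDER BY a, b, c, d;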

> I looked into this a bit further, using the same table as you and
> the attached set of hacks that adjust the ORDER BY path generation
> to split a Sort into a Sort and an Incremental Sort when the number
> of pathkeys to sort by is > 2.


Okay, so this is an issue with the strict (full) sort itself.

I will play around with the current patch, but it should be fine for
the current work and would account for the issue in the strict sort.
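
If I understand the hacks correctly, the plan shape would change
roughly like this (a sketch, not real EXPLAIN output; names assumed):

Before (one full sort over all pathkeys):

  Sort (sort key: a, b, c, d)
    -> Seq Scan on tab

After (a cheaper sort on a prefix, then an incremental sort that only
sorts each group of rows sharing the (a, b) prefix):

  Incremental Sort (sort key: a, b, c, d; presorted key: a, b)
    -> Sort (sort key: a, b)
         -> Seq Scan on tab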


> I suspect this might be to do with having to perform more swaps in
> the array elements that we're sorting by with the full sort.  When
> work_mem is large, this array no longer fits in CPU cache and I
> suspect those swaps become more costly when there are more cache
> lines having to be fetched from RAM.

That looks like a possible reason.
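
One way to check that (a sketch; the table name and sizes are my
assumptions): run the same sort with a small and a large work_mem and
compare the per-row sort time, rather than expecting the larger
setting to be proportionally faster.

SET work_mem = '4MB';
EXPLAIN (ANALYZE) SELECT * FROM tab ORDER BY a, b, c, d;

SET work_mem = '2GB';  -- fits in RAM, but far beyond L3 cache
EXPLAIN (ANALYZE) SELECT * FROM tab ORDER BY a, b, c, d;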

> I think we really should fix tuplesort.c so that it batches sorts
> into about L3 CPU cache-sized chunks in RAM rather than trying to
> sort much larger arrays.


I assume we are already doing something like that for incremental
sort (each prefix group is sorted as its own small batch), and that
is why it performs better here?

Or is it due to the shorter tuple width, and thus being able to fit
into memory more easily?

> On the other hand, we already sort WindowClauses with the most
> strict sort order first which results in a full Sort and no
> additional sort for subsequent WindowClauses that can make use of
> that sort order.  It would be bizarre to reverse the final few lines
> of common_prefix_cmp so that we sort the least strict order first so
> we end up injecting Incremental Sorts into the plan to make it
> faster!

I would see this as beneficial when the set of tuples is too large
for the strict sort to handle efficiently.
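
For example (names assumed), with:

SELECT sum(b) OVER (ORDER BY a),
       sum(c) OVER (ORDER BY a, b)
FROM tab;

the planner sorts once by the most strict order (a, b), and the
OVER (ORDER BY a) window simply reuses that order with no additional
sort node.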


Basically:

cost(sort(a,b,c,d,e)) > cost(sort(a) + sort(b) + sort(c) + sort(d,e))

I don't know if a cost-based decision is realistic in the current
state of things.

> I think more benchmarking is required so we can figure out if this
> is a corner case or a common case.

I will do a few more benchmarks. I expect all the scaling issues with
the strict sort to remain as they are, but I want to make sure no
further issues come up due to this change itself.
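
Roughly along these lines (a sketch; the table definition, row count
and window definitions are my assumptions and not settled yet):

CREATE TABLE bench AS
  SELECT (random() * 1000)::int AS a,
         (random() * 1000)::int AS b,
         (random() * 1000)::int AS c,
         (random() * 1000)::int AS d
  FROM generate_series(1, 1000000);
VACUUM ANALYZE bench;

-- compare timings with and without the patch
EXPLAIN (ANALYZE)
SELECT count(*) OVER (ORDER BY a),
       count(*) OVER (ORDER BY a, b, c, d)
FROM bench;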


> I'm just unsure if we should write this off as the expected
> behaviour of Sort and continue with the patch or delay the whole
> thing until we make some improvements to sort.

Unless we find more issues, we might be good with the former, but you
can make the better call on this. As much as I would love my first
patch to get merged, I would want it to be right.


Thanks,

Ankit



