Re: First draft of the PG 15 release notes (sorting)
On Thu, May 12, 2022 at 01:59:36PM +1200, Thomas Munro wrote: > On Thu, May 12, 2022 at 12:53 PM Bruce Momjian wrote: > > > > > > Improve performance and reduce memory consumption of in-memory > > sorts (Ronan Dunklau, David Rowley, Thomas Munro) > > > > > > I'd also add John Naylor here, as he did a lot of work to validate and > polish the specialisation stuff. Sure, done. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: First draft of the PG 15 release notes (sorting)
On Thu, May 12, 2022 at 12:53 PM Bruce Momjian wrote: > > > Improve performance and reduce memory consumption of in-memory > sorts (Ronan Dunklau, David Rowley, Thomas Munro) > > I'd also add John Naylor here, as he did a lot of work to validate and polish the specialisation stuff.
Re: First draft of the PG 15 release notes (sorting)
On Thu, May 12, 2022 at 10:38:42AM +1200, David Rowley wrote: > On Wed, 11 May 2022 at 14:38, Justin Pryzby wrote: > > I wonder if this is also relevant. > > > > 65014000b35 Replace polyphase merge algorithm with a simple balanced k-way > > merge. > > Thanks for highlighting that. It very much is relevant. In fact, it > seems to account for most of the 25% I mentioned. That particular > test was sorting 10 million tuples with 4MB of work_mem. > > I think that "Improve sorting performance (Heikki Linnakangas)" should > be moved out from "E.1.3.1.2. Indexes" and put below "E.1.3.1.4. > General Performance" Yes, good point, moved. > The text likely should include the words "disk-based" so that it's > clear that it's not the same as the other line about "in-memory > sorts". I'd also be open to just having a single line too. I'd vote > to put Heikki's name first if we did that. > > Maybe: > > * Improve performance of sorting tuples (Heikki Linnakangas, Ronan > Dunklau, David Rowley, Thomas Munro) > > This improves the merging performance of individual on-disk sort > batches, reduces memory consumption for in-memory sorts and reduces > CPU overheads for certain in-memory sorts. I kept separate entries: Improve performance for sorts that exceed work_mem (Heikki Linnakangas) Specifically, switch to a batch sorting algorithm that uses more output streams internally. Improve performance and reduce memory consumption of in-memory sorts (Ronan Dunklau, David Rowley, Thomas Munro) -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: First draft of the PG 15 release notes (sorting)
On Wed, 11 May 2022 at 14:38, Justin Pryzby wrote: > I wonder if this is also relevant. > > 65014000b35 Replace polyphase merge algorithm with a simple balanced k-way > merge. Thanks for highlighting that. It very much is relevant. In fact, it seems to account for most of the 25% I mentioned. That particular test was sorting 10 million tuples with 4MB of work_mem. I think that "Improve sorting performance (Heikki Linnakangas)" should be moved out from "E.1.3.1.2. Indexes" and put below "E.1.3.1.4. General Performance" The text likely should include the words "disk-based" so that it's clear that it's not the same as the other line about "in-memory sorts". I'd also be open to just having a single line too. I'd vote to put Heikki's name first if we did that. Maybe: * Improve performance of sorting tuples (Heikki Linnakangas, Ronan Dunklau, David Rowley, Thomas Munro) This improves the merging performance of individual on-disk sort batches, reduces memory consumption for in-memory sorts and reduces CPU overheads for certain in-memory sorts. David
Re: First draft of the PG 15 release notes (sorting)
On Wed, 11 May 2022 at 14:38, Justin Pryzby wrote: > > On Wed, May 11, 2022 at 12:39:41PM +1200, David Rowley wrote: > > I think the sort improvements done in v15 are worth a mention under > > General Performance. The commits for this were 91e9e89dc, 40af10b57 > > and 697492434. I've been running a few benchmarks between v14 and v15 > > over the past few days and a fairly average case speedup is about 25%. > > but there are cases where I've seen up to 400%. I think the increase > > is to an extent that we maybe should have considered making tweaks in > > cost_tuplesort(). I saw some plans that ran in about 60% of the time > > by disabling Hash Agg and allowing Sort / Group Agg to do the work. > > Is there any reason not to consider it now ? Either for v15 or v15+1. If the changes done had resulted in a change to the number of expected operations as far as big-O notation goes, then I think we might be able to do something. However, nothing changed in the number of operations. We only sped up the constant factors. If it were possible to adjust those constant factors based on some performance benchmarks results that were spat out by some single machine somewhere, then maybe we could do some tweaks. The problem is that to know that we're actually making some meaningful improvements to the costs, we'd want to get the opinion of >1 machine and likely >1 CPU architecture. That feels like something that would be much better to do during a release cycle rather than at this very late hour. The majority of my benchmarks were on AMD zen2 hardware. That's likely not going to reflect well on what the average hardware is that runs PostgreSQL. Also, I've no idea at this stage what we'd even do to cost_tuplesort(). The nruns calculation is a bit fuzzy and never really took the power-of-2 wastage that 40af10b57 reduces. Maybe there's some argument for adjusting the 2.0 constant in compute_cpu_sort_cost() based on what's done in 697492434. But there's plenty of datatypes that don't use the new sort specialization functions. Would we really want to add extra code to the planner to get it to try and figure that out? David
Re: First draft of the PG 15 release notes (sorting)
On Wed, May 11, 2022 at 12:39:41PM +1200, David Rowley wrote: > I think the sort improvements done in v15 are worth a mention under > General Performance. The commits for this were 91e9e89dc, 40af10b57 > and 697492434. I've been running a few benchmarks between v14 and v15 > over the past few days and a fairly average case speedup is about 25%. > but there are cases where I've seen up to 400%. I think the increase > is to an extent that we maybe should have considered making tweaks in > cost_tuplesort(). I saw some plans that ran in about 60% of the time > by disabling Hash Agg and allowing Sort / Group Agg to do the work. Is there any reason not to consider it now ? Either for v15 or v15+1. I wonder if this is also relevant. 65014000b35 Replace polyphase merge algorithm with a simple balanced k-way merge. -- Justin