On Friday, 23 June 2023, 22:55:51 CEST, Peter Eisentraut wrote:
> On 22.06.23 15:35, Ronan Dunklau wrote:
> > The thing is, by default, those parameters are adjusted dynamically by the
> > glibc itself. It starts with quite small thresholds, and raises them when
> > the program frees some memory, up to a certain limit. This patch proposes
> > a new GUC allowing the user to adjust those settings according to their
> > workload.
> > 
> > This can cause problems. Let's take for example a table with 10k rows, and
> > 32 columns (as defined by a bench script David Rowley shared last year
> > when discussing the GenerationContext for tuplesort), and execute the
> > following
> > query, with 32MB of work_mem:

> I don't follow what you are trying to achieve with this.  The examples
> you show appear to work sensibly in my mind.  Using this setting, you
> can save some of the adjustments that glibc does after the first query.
> But that seems only useful if your session only does one query.  Is that
> what you are doing?

No, not at all: glibc does not do the right thing, and we are not merely
"saving" it some adjustments. I will try to rephrase that.

In the first test case I showed, glibc does adjust its thresholds, but to a
suboptimal value: each repeated execution of a query needing the same amount
of memory releases that memory back to the kernel and then moves the brk
pointer again, and glibc never raises the thresholds any further. On the other
hand, by adjusting the thresholds manually, we can set them to a higher value,
which means the memory is kept in malloc's free list and reused by the next
queries. As shown in the benchmark results I posted, this can have quite a
dramatic effect, going from 396 tps to 894. For ease of benchmarking, it is a
single query being executed over and over again, but the same would be true if
different queries allocating memory were executed by a single backend.
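
To make "manually adjusting the thresholds" concrete, here is a minimal sketch
using glibc's mallopt(). It is not taken from the patch: the helper name
pin_malloc_thresholds is hypothetical, and the values in the usage note are
purely illustrative, not the ones used in the benchmark. It assumes a 64-bit
glibc, where M_MMAP_THRESHOLD accepts values up to 32MB.

```c
#include <malloc.h>   /* glibc-specific: mallopt(), M_TRIM_THRESHOLD, M_MMAP_THRESHOLD */

/*
 * Hypothetical helper, for illustration only.
 *
 * Pins glibc's malloc thresholds to fixed values (in bytes). Explicitly
 * setting either parameter also disables glibc's dynamic adjustment of
 * these thresholds.
 *
 * Returns 1 if both settings were accepted, 0 otherwise
 * (mallopt() itself returns 1 on success and 0 on error).
 */
int
pin_malloc_thresholds(int trim_threshold_bytes, int mmap_threshold_bytes)
{
    /* Free memory at the top of the heap is only returned to the kernel
     * once it exceeds this size. */
    if (mallopt(M_TRIM_THRESHOLD, trim_threshold_bytes) != 1)
        return 0;

    /* Requests at or above this size are served by mmap() rather than
     * from the brk heap. */
    if (mallopt(M_MMAP_THRESHOLD, mmap_threshold_bytes) != 1)
        return 0;

    return 1;
}
```

For example, calling pin_malloc_thresholds(64 * 1024 * 1024, 16 * 1024 * 1024)
once, before the workload starts allocating, should keep up to 64MB of freed
heap memory available for reuse instead of handing it back to the kernel after
each query.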

The worst part is that this is unpredictable: depending on past memory
allocation patterns, glibc will end up in different states and exhibit
completely different performance for all subsequent queries. In fact, this is
what Tomas noticed last year (see [0]), which led to this investigation.

I also tried to show that in certain cases glibc's behaviour can, on the
contrary, be too greedy and hold on to too much memory when we need that
memory only once and never allocate it again.

I hope what I'm trying to achieve is clearer now. Maybe this patch is not the
best way to go about it, but since the memory allocator's behaviour can have
such an impact, it's a bit sad that we have to leave half the performance on
the table when there are easily accessible knobs to avoid that.

[0] 
https://www.postgresql.org/message-id/bcdd4e3e-c12d-cd2b-7ead-a91ad416100a%40enterprisedb.com



