Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hello Anton,

14.03.2024 23:36, Anton A. Melnikov wrote:
> On 13.03.2024 10:41, Anton A. Melnikov wrote:
>> Here is a version updated for the current master.
> During patch updating I mistakenly added double counting of deallocated
> blocks. That's why the tests in the patch tester failed.
> Fixed it and squashed fix 0002 with 0001.
> Here is fixed version.

Please try the following with the patches applied:
echo "shared_buffers = '1MB'
max_total_backend_memory = '10MB'
" > /tmp/extra.config
CPPFLAGS="-Og" ./configure --enable-tap-tests --enable-debug --enable-cassert ...
TEMP_CONFIG=/tmp/extra.config make check

It fails for me as follows:
...
# postmaster did not respond within 60 seconds, examine ".../src/test/regress/log/postmaster.log" for the reason
...
src/test/regress/log/postmaster.log contains:
...
TRAP: failed Assert("ret != NULL"), File: "mcxt.c", Line: 1327, PID: 4109270
TRAP: failed Assert("ret != NULL"), File: "mcxt.c", Line: 1327, PID: 4109271
postgres: autovacuum launcher (ExceptionalCondition+0x69)[0x55ce441fcc6e]
postgres: autovacuum launcher (palloc0+0x0)[0x55ce4422eb67]
postgres: logical replication launcher (ExceptionalCondition+0x69)[0x55ce441fcc6e]
postgres: autovacuum launcher (InitDeadLockChecking+0xa6)[0x55ce4408a6f0]
postgres: logical replication launcher (palloc0+0x0)[0x55ce4422eb67]
postgres: logical replication launcher (InitDeadLockChecking+0x45)[0x55ce4408a68f]
postgres: autovacuum launcher (InitProcess+0x600)[0x55ce4409c6f2]
postgres: logical replication launcher (InitProcess+0x600)[0x55ce4409c6f2]
postgres: autovacuum launcher (+0x44b4e2)[0x55ce43ff24e2]
...
grep TRAP src/test/regress/log/postmaster.log | wc -l
445

Best regards,
Alexander
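A hypothetical sketch (not the patch's actual code) of why startup can trip `Assert("ret != NULL")` here: with max_total_backend_memory configured below what the server needs during initialization, a limit-aware allocator returns NULL in the malloc style, and assert-enabled builds trap on the NULL instead of reporting an out-of-memory error. Names and the budget value are illustrative assumptions.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the remaining global memory budget after shared memory
 * and other startup allocations are accounted for. */
static size_t budget = 1024;

/* Limit-aware zeroing allocator: returns NULL once the budget is
 * exhausted, mirroring malloc-style out-of-memory behavior. */
static void *limited_alloc0(size_t size)
{
    if (size > budget)
        return NULL;            /* limit exceeded: no special handling */
    budget -= size;
    return calloc(1, size);
}
```

Code that follows such an allocation with Assert(ret != NULL), as mcxt.c does in assert-enabled builds, traps exactly as shown in the log above.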
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On 13.03.2024 10:41, Anton A. Melnikov wrote:
> Here is a version updated for the current master.

During patch updating I mistakenly added double counting of deallocated
blocks. That's why the tests in the patch tester failed.
Fixed it and squashed fix 0002 with 0001.
Here is fixed version.

With the best wishes!

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From 7a0925a38acfb9a945087318a5d91fae4680db0e Mon Sep 17 00:00:00 2001
From: Tomas Vondra
Date: Tue, 26 Dec 2023 17:54:40 +0100
Subject: [PATCH 1/2] Add tracking of backend memory allocated

Add tracking of backend memory allocated in total and by allocation type
(aset, dsm, generation, slab) by process.

allocated_bytes tracks the current bytes of memory allocated to the
backend process. aset_allocated_bytes, dsm_allocated_bytes,
generation_allocated_bytes and slab_allocated_bytes track the
allocation by type for the backend process. They are updated for the
process as memory is malloc'd/freed. Memory allocated to items on the
freelist is included.

Dynamic shared memory allocations are included only in the value
displayed for the backend that created them; they are not included in
the value for backends that are attached to them, to avoid double
counting. DSM allocations that are not destroyed by the creating
process prior to its exit are considered long lived and are tracked in
a global counter global_dsm_allocated_bytes. We limit the floor of
allocation counters to zero.

Created views pg_stat_global_memory_allocation and
pg_stat_memory_allocation for access to these trackers.
---
 doc/src/sgml/monitoring.sgml                | 246
 src/backend/catalog/system_views.sql        |  34 +++
 src/backend/storage/ipc/dsm.c               |  11 +-
 src/backend/storage/ipc/dsm_impl.c          |  78 +++
 src/backend/storage/lmgr/proc.c             |   1 +
 src/backend/utils/activity/backend_status.c | 114 +
 src/backend/utils/adt/pgstatfuncs.c         |  84 +++
 src/backend/utils/init/miscinit.c           |   3 +
 src/backend/utils/mmgr/aset.c               |  17 ++
 src/backend/utils/mmgr/generation.c         |  13 ++
 src/backend/utils/mmgr/slab.c               |  22 ++
 src/include/catalog/pg_proc.dat             |  17 ++
 src/include/storage/proc.h                  |   2 +
 src/include/utils/backend_status.h          | 144 +++-
 src/test/regress/expected/rules.out         |  27 +++
 src/test/regress/expected/stats.out         |  36 +++
 src/test/regress/sql/stats.sql              |  20 ++
 17 files changed, 867 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 8736eac284..41d788be45 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4599,6 +4599,252 @@ description | Waiting for a newly initialized WAL file to reach durable storage

[Excerpt from the documentation added for the pg_stat_memory_allocation view:]

The pg_stat_memory_allocation view will have one row per server process,
showing information related to the current memory allocation of that
process in total and by allocator type. Due to the dynamic nature of
memory allocations the allocated bytes values may not be exact but
should be sufficient for the intended purposes. Dynamic shared memory
allocations are included only in the value displayed for the backend
that created them; they are not included in the value for backends that
are attached to them, to avoid double counting. Use pg_size_pretty to
make these values more easily readable.
pg_stat_memory_allocation View (Column / Type / Description):

 datid oid
   OID of the database this backend is connected to

 pid integer
   Process ID of this backend

 allocated_bytes bigint
   Memory currently allocated to this backend in bytes. This is the
   balance of bytes allocated and freed by this backend. Dynamic shared
   memory allocations are included only in the value displayed for the
   backend that created them; they are not included in the value for
   backends that are attached to them, to avoid double counting.

 aset_allocated_bytes bigint
   Memory currently allocated to this backend in bytes via the
   allocation set allocator.

 dsm_allocated_bytes bigint
   Memory currently allocated to this backend in bytes via the dynamic
   shared memory allocator. Upon process exit, dsm allocations that
   have not been freed are considered long lived and added to
   global_dsm_allocated_bytes.
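The commit message notes that the allocation counters are floored at zero. A minimal sketch of that update rule, with illustrative struct and function names (not the patch's actual identifiers):

```c
#include <stdint.h>
#include <assert.h>

/* Per-backend, per-allocator counters in the spirit of the view's
 * columns; only two are shown here for brevity. */
typedef struct BackendMemCounters
{
    int64_t allocated_bytes;        /* total for the backend */
    int64_t aset_allocated_bytes;   /* allocation set allocator */
} BackendMemCounters;

static void counter_add(int64_t *counter, int64_t bytes)
{
    *counter += bytes;
}

/* Decrease a counter but never let it go below zero, matching the
 * commit message's "limit the floor of allocation counters to zero". */
static void counter_sub(int64_t *counter, int64_t bytes)
{
    *counter = (*counter > bytes) ? *counter - bytes : 0;
}
```

The floor keeps a stray over-subtraction (e.g. freeing memory whose allocation predates counter initialization) from driving a counter negative.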
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On 12.03.2024 16:30, Aleksander Alekseev wrote:
> Just wanted to let you know that v20231226 doesn't apply. The patch
> needs love from somebody interested in it.

Thanks for pointing to this!
Here is a version updated for the current master.

With the best regards,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From e31012217fb63f57fe16d7232fec27bb6cd45597 Mon Sep 17 00:00:00 2001
From: Tomas Vondra
Date: Tue, 26 Dec 2023 17:54:40 +0100
Subject: [PATCH 1/3] Add tracking of backend memory allocated

Add tracking of backend memory allocated in total and by allocation type
(aset, dsm, generation, slab) by process.

allocated_bytes tracks the current bytes of memory allocated to the
backend process. aset_allocated_bytes, dsm_allocated_bytes,
generation_allocated_bytes and slab_allocated_bytes track the
allocation by type for the backend process. They are updated for the
process as memory is malloc'd/freed. Memory allocated to items on the
freelist is included.

Dynamic shared memory allocations are included only in the value
displayed for the backend that created them; they are not included in
the value for backends that are attached to them, to avoid double
counting. DSM allocations that are not destroyed by the creating
process prior to its exit are considered long lived and are tracked in
a global counter global_dsm_allocated_bytes. We limit the floor of
allocation counters to zero.

Created views pg_stat_global_memory_allocation and
pg_stat_memory_allocation for access to these trackers.
---
 doc/src/sgml/monitoring.sgml                | 246
 src/backend/catalog/system_views.sql        |  34 +++
 src/backend/storage/ipc/dsm.c               |  11 +-
 src/backend/storage/ipc/dsm_impl.c          |  78 +++
 src/backend/storage/lmgr/proc.c             |   1 +
 src/backend/utils/activity/backend_status.c | 114 +
 src/backend/utils/adt/pgstatfuncs.c         |  84 +++
 src/backend/utils/init/miscinit.c           |   3 +
 src/backend/utils/mmgr/aset.c               |  17 ++
 src/backend/utils/mmgr/generation.c         |  14 ++
 src/backend/utils/mmgr/slab.c               |  22 ++
 src/include/catalog/pg_proc.dat             |  17 ++
 src/include/storage/proc.h                  |   2 +
 src/include/utils/backend_status.h          | 144 +++-
 src/test/regress/expected/rules.out         |  27 +++
 src/test/regress/expected/stats.out         |  36 +++
 src/test/regress/sql/stats.sql              |  20 ++
 17 files changed, 868 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 8aca08140e..5f827fe0b0 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4598,6 +4598,252 @@ description | Waiting for a newly initialized WAL file to reach durable storage

[Excerpt from the documentation added for the pg_stat_memory_allocation view:]

The pg_stat_memory_allocation view will have one row per server process,
showing information related to the current memory allocation of that
process in total and by allocator type. Due to the dynamic nature of
memory allocations the allocated bytes values may not be exact but
should be sufficient for the intended purposes. Dynamic shared memory
allocations are included only in the value displayed for the backend
that created them; they are not included in the value for backends that
are attached to them, to avoid double counting. Use pg_size_pretty to
make these values more easily readable.
pg_stat_memory_allocation View (Column / Type / Description):

 datid oid
   OID of the database this backend is connected to

 pid integer
   Process ID of this backend

 allocated_bytes bigint
   Memory currently allocated to this backend in bytes. This is the
   balance of bytes allocated and freed by this backend. Dynamic shared
   memory allocations are included only in the value displayed for the
   backend that created them; they are not included in the value for
   backends that are attached to them, to avoid double counting.

 aset_allocated_bytes bigint
   Memory currently allocated to this backend in bytes via the
   allocation set allocator.

 dsm_allocated_bytes bigint
   Memory currently allocated to this backend in bytes via the dynamic
   shared memory allocator. Upon process exit, dsm allocations that
   have not been freed are considered long lived and added to
   global_dsm_allocated_bytes found in the
   pg_stat_global_memory_allocation view.
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi,

> I took a closer look at this patch over the last couple days, and I did
> a bunch of benchmarks to see how expensive the accounting is. The good
> news is that I haven't observed any overhead - I did two simple tests,
> that I think would be particularly sensitive to this:
>
> [...]

Just wanted to let you know that v20231226 doesn't apply. The patch
needs love from somebody interested in it.

Best regards,
Aleksander Alekseev (wearing a co-CFM hat)
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi,

I wanted to take a look at the patch, and I noticed it's broken since
3d51cb5197 renamed a couple pgstat functions in August. I plan to maybe
do some benchmarks etc., preferably on current master, so here's a
version fixing that minor bitrot.

As for the patch, I only skimmed through the thread so far, to get some
idea of what the approach and goals are, etc. I didn't look at the code
yet, so I can't comment on that. However, at pgconf.eu a couple of weeks
ago I had quite a few discussions about how such a "backend memory
limit" could/should work in principle, and I've been thinking about ways
to implement this. So let me share some thoughts about how this patch
aligns with that ...

(FWIW it's not my intent to hijack or derail this patch in any way, but
there are a couple of things I think we should do differently.)

I'm 100% on board with having a memory limit "above" work_mem. It's
really annoying that we have no way to restrict the amount of memory a
backend can allocate for complex queries, etc.

But I find it a bit strange that we aim to introduce a "global" memory
limit for all backends combined first. I'm not against having that too,
but it's not the feature I usually wish to have. I need some protection
against runaway backends that happen to allocate a lot of memory.
Similarly, I'd like to be able to have different limits depending on
what the backend does - a backend doing OLAP may naturally need more
memory, while a backend doing OLTP may have a much tighter limit.

But with a single global limit none of this is possible. It may help
reduce the risk of unexpected OOM issues (not 100%, but useful), but it
can't limit the impact to the one backend - if memory starts running
out, it will affect all other backends a bit randomly (depending on the
order in which the backends happen to allocate memory). And it does not
consider what workloads the backends execute.

Let me propose a slightly different architecture that I imagined while
thinking about this.
It's not radically different from what the patch does, but it focuses on
the local accounting first. I believe it's possible to extend this to
enforce the global limit too. FWIW I haven't tried implementing this - I
don't want to "hijack" this thread and do my own thing. I can take a
stab at a PoC if needed.

Firstly, I'm not quite happy with how all the memory contexts have to do
their own version of the accounting and memory checking. I think we
should move that into a new abstraction which I call a "memory pool".
It's very close to a "memory context" but it only deals with allocating
blocks, not the chunks requested by palloc() etc. So when someone does
palloc(), that may call AllocSetAlloc(). And instead of doing malloc()
that would do MemoryPoolAlloc(blksize), and then that would do all the
accounting and checks, and then do malloc().

This may sound like an unnecessary indirection, but the idea is that a
single memory pool would back many memory contexts (perhaps all of them
for a given backend). In principle we might even establish separate
memory pools for different parts of the memory context hierarchy, but
I'm not sure we need that. I can imagine the pool could also cache
blocks for cases when we create and destroy contexts very often, but
glibc already does that for us, I believe.

For me, the accounting and memory checking is the primary goal. I wonder
if we considered this context/pool split while working on the accounting
for hash aggregate, but I think we were too attached to doing all of it
in the memory context hierarchy.

Of course, this memory pool is per backend, and so would be the memory
accounting and limit enforced by it. But I can imagine extending it to
do a global limit similar to what the current patch does - using a
counter in shared memory, or something. I haven't reviewed what the
overhead is or how it handles cases when a backend terminates in some
unexpected way. But whatever the current patch does, a memory pool could
do too.
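The memory pool indirection described above could be sketched like this. The API, struct, and limit semantics are hypothetical - the email only sketches the idea - but the shape is: every block-level allocation for a backend funnels through one pool that does the accounting and the limit check before calling malloc().

```c
#include <stdlib.h>
#include <stdint.h>
#include <assert.h>

/* One pool backing many memory contexts for a backend. */
typedef struct MemoryPool
{
    int64_t allocated;  /* bytes of blocks currently allocated */
    int64_t limit;      /* backend-local limit; 0 means unlimited */
} MemoryPool;

/* What AllocSetAlloc() would call instead of malloc() directly:
 * account and enforce the limit at block granularity. */
static void *MemoryPoolAlloc(MemoryPool *pool, size_t blksize)
{
    if (pool->limit > 0 &&
        pool->allocated + (int64_t) blksize > pool->limit)
        return NULL;                /* per-backend limit exceeded */

    void *block = malloc(blksize);
    if (block != NULL)
        pool->allocated += blksize; /* all accounting lives here */
    return block;
}

static void MemoryPoolFree(MemoryPool *pool, void *block, size_t blksize)
{
    free(block);
    pool->allocated -= blksize;
}
```

Because every context shares the pool, the per-backend total is always in one place, instead of each context type reimplementing the bookkeeping.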
Secondly, I think there's an annoying issue with the accounting at the
block level - it makes it problematic to use low limit values. We double
the block size, so we may quickly end up with a block size of a couple
MBs, which means the accounting granularity gets very coarse.

I think it'd be useful to introduce a "backpressure" between the memory
pool and the memory context, depending on how close we are to the limit.
For example, if the limit is 128MB and the backend has allocated 16MB so
far, we're pretty far away from the limit. So if the backend requests an
8MB block, that's fine and the memory pool should malloc() that. But if
we've already allocated 100MB, maybe we should be careful and not allow
8MB blocks - the memory pool should be allowed to override this and
return just a 1MB block.

Sure, this would have to be optional, and not all places can accept a
smaller block than requested (when the chunk would not fit into the
smaller block). It would require a suitable memory pool API and more
work in the memory contexts,
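The block-size backpressure described above could look roughly like this sketch. The thresholds (grant the request when at least 4x the requested size remains, then halve toward a minimum) are illustrative assumptions; the email doesn't specify a policy.

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Decide how large a block the pool should actually hand out, given
 * how close the backend is to its limit. Far from the limit, grant
 * the request; near it, fall back toward the minimum block size so
 * accounting granularity stays fine. */
static size_t clamp_block_size(int64_t allocated, int64_t limit,
                               size_t requested, size_t minimum)
{
    int64_t remaining = limit - allocated;

    /* Plenty of headroom: grant the request as-is. */
    if (remaining >= (int64_t)(4 * requested))
        return requested;

    /* Close to the limit: halve until the block fits comfortably. */
    size_t granted = requested;
    while (granted > minimum && (int64_t) granted * 4 > remaining)
        granted /= 2;
    return granted < minimum ? minimum : granted;
}
```

With the email's numbers: at 16MB of 128MB an 8MB request is granted whole; at 100MB the same request is clamped down, so each doubling step near the limit stays cheap to account for.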
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On 12/26/23 11:49, Anton A. Melnikov wrote:
> Hello!
>
> Earlier in this thread, the pgbench results were published, where with
> a strong memory limit of 100MB a significant, about 10%, decrease in
> TPS was observed [1].
>
> Using a dedicated server with 12GB RAM and the methodology described
> in [3], I performed five series of measurements for the patches from
> [2].

Can you share some info about the hardware? For example the CPU model,
number of cores, and so on. 12GB RAM is not quite huge, so presumably it
was a small machine.

> The series were like this:
> 1) unpatched 16th version at REL_16_BETA1 (e0b82fc8e83), as close to
> [2] in time.
> 2) patched REL_16_BETA1 at e0b82fc8e83 with undefined
> max_total_backend_memory GUC (with default value = 0).
> 3) patched REL_16_BETA1 with max_total_backend_memory = 16GB
> 4) the same with max_total_backend_memory = 8GB
> 5) and again with max_total_backend_memory = 200MB

OK

> Measurements with max_total_backend_memory = 100MB could not be
> carried out; with limit 100MB the server gave an error on startup:
> FATAL: configured max_total_backend_memory 100MB is <=
> shared_memory_size 143MB
> So I used 200MB to retain all other GUCs the same.

I'm not very familiar with the patch yet, but this seems a bit strange.
Why should shared_buffers be included in this limit?

> Pgbench gave the following results:
> 1) and 2) almost the same: ~6350 TPS. See orange and green
> distributions on the attached graph.png respectively.
> 3) and 4) identical to each other (~6315 TPS) and a bit slower than 1)
> and 2) by ~0.6%. See blue and yellow distributions respectively.
> 5) is slightly slower (~6285 TPS) than 3) and 4) by another 0.5%
> (grey distribution).
> The standard error in all series was ~0.2%. There is raw data in
> raw_data.txt.

I think 6350 is a pretty terrible number, especially for scale 8, which
is maybe 150MB of data.
I think that's a pretty clear sign the system was hitting some other
bottleneck, which can easily mask regressions in the memory allocation
code. AFAICS the pgbench runs were regular r/w benchmarks, so I'd bet it
was hitting I/O, and possibly even subject to some random effects at
that level.

I think what would be interesting are runs with

  pgbench -M prepared -S -c $N -j $N

i.e. read-only tests (to not hit I/O), with $N being sufficiently large
to maybe also show some concurrency/locking bottlenecks, etc.

I may do some benchmarks if I happen to find a bit of time, but maybe
you could collect such numbers too? The other benchmark that might be
interesting is more OLAP, with low concurrency but backends allocating a
lot of memory.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Add the ability to limit the amount of memory that can be allocated to backends.
hi.

+static void checkAllocations();
should be "static void checkAllocations(void);"?

PgStatShared_Memtrack has a lock, but it seems not to be initialized,
and not used. Can you expand on it?

So in the view pg_stat_global_memory_tracking, the column
"total_memory_reserved" is a point in time - the total memory the whole
server has reserved/malloced? Will it change every time you call it?

The function pg_stat_get_global_memory_tracking has provolatile => 's'.
Should it be a VOLATILE function?

pg_stat_get_memory_reservation and pg_stat_get_global_memory_tracking
should be proretset => 'f'.

+{ oid => '9891',
+  descr => 'statistics: memory utilized by current backend',
+  proname => 'pg_get_backend_memory_allocation', prorows => '1', proisstrict => 'f',
+  proretset => 't', provolatile => 's', proparallel => 'r',

You declared
+void pgstat_backend_memory_reservation_cb(void);
but there seems to be no definition.

This part is unnecessary since you already declared these in
src/include/catalog/pg_proc.dat?
+/* SQL Callable functions */
+extern Datum pg_stat_get_memory_reservation(PG_FUNCTION_ARGS);
+extern Datum pg_get_backend_memory_allocation(PG_FUNCTION_ARGS);
+extern Datum pg_stat_get_global_memory_tracking(PG_FUNCTION_ARGS);

The last sentence is just a plain link, no explanation. Is something
missing?
+ Reports how much memory remains available to the server. If a
+ backend process attempts to allocate more memory than remains,
+ the process will fail with an out of memory error, resulting in
+ cancellation of the process's active query/transaction.
+ If memory is not being limited (ie. max_total_memory is zero or not set),
+ this column returns NULL.

+static_shared_memory bigint
+ Reports how much static shared memory (non-DSM shared memory) is being
+ used by the server. Static shared memory is configured by the postmaster
+ at server startup.
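On the first review point above: in C (before C23), an empty parameter list in a declaration means "unspecified arguments", so the compiler will accept a call that passes arguments without a diagnostic; `(void)` means "exactly zero arguments" and makes such a call a compile-time error. A small illustration, with made-up function names:

```c
#include <assert.h>

static int calls = 0;

/* Empty parens: the declaration says nothing about the arguments,
 * so loose_check(123) would compile on many compilers. */
static void loose_check()
{
    calls++;
}

/* (void): the prototype promises zero arguments, so passing any
 * argument is rejected at compile time. */
static void strict_check(void)
{
    calls++;
}
```

This is why the prototype style `checkAllocations(void)` is preferred in the PostgreSQL tree.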
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi,

On 2023-11-07 15:55:48 -0500, Stephen Frost wrote:
> * Andres Freund (and...@anarazel.de) wrote:
> > On 2023-11-06 13:02:50 -0500, Stephen Frost wrote:
> > > > > The max_total_memory limit is checked whenever the global
> > > > > counters are updated. There is no special error handling if a
> > > > > memory allocation exceeds the global limit. That allocation
> > > > > returns a NULL for malloc style allocations or an ENOMEM for
> > > > > shared memory allocations. Postgres has existing mechanisms for
> > > > > dealing with out of memory conditions.
> > > >
> > > > I still think it's extremely unwise to do tracking of memory and
> > > > limiting of memory in one patch. You should work towards an
> > > > acceptable patch that just tracks memory usage in as simple and
> > > > low overhead a way as possible. Then we later can build on that.
> > >
> > > Frankly, while tracking is interesting, the limiting is the feature
> > > that's needed more urgently imv.
> >
> > I agree that we need limiting, but that the tracking needs to be very
> > robust for that to be usable.
>
> Is there an issue with the tracking in the patch that you saw? That's
> certainly an area that we've tried hard to get right and to match up to
> numbers from the rest of the system, such as the memory context system.

There's some details I am pretty sure aren't right - the DSM tracking
piece seems bogus to me. But beyond that: I don't know. There's enough
other stuff in the patch that it's hard to focus on that aspect. That's
why I'd like to merge a patch doing just that, so we actually can
collect numbers. If any of the developers of the patch had focused on
polishing that part instead of focusing on the limiting, it'd have been
ready to be merged a while ago, maybe even in 16. I think the limiting
piece is unlikely to be ready for 17.

Greetings,

Andres Freund
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> On 2023-11-06 13:02:50 -0500, Stephen Frost wrote:
> > > > The max_total_memory limit is checked whenever the global
> > > > counters are updated. There is no special error handling if a
> > > > memory allocation exceeds the global limit. That allocation
> > > > returns a NULL for malloc style allocations or an ENOMEM for
> > > > shared memory allocations. Postgres has existing mechanisms for
> > > > dealing with out of memory conditions.
> > >
> > > I still think it's extremely unwise to do tracking of memory and
> > > limiting of memory in one patch. You should work towards an
> > > acceptable patch that just tracks memory usage in as simple and low
> > > overhead a way as possible. Then we later can build on that.
> >
> > Frankly, while tracking is interesting, the limiting is the feature
> > that's needed more urgently imv.
>
> I agree that we need limiting, but that the tracking needs to be very
> robust for that to be usable.

Is there an issue with the tracking in the patch that you saw? That's
certainly an area that we've tried hard to get right and to match up to
numbers from the rest of the system, such as the memory context system.

> > We could possibly split it up but there's an awful lot of the same
> > code that would need to be changed and that seems less than ideal.
> > Still, we'll look into this.
>
> Shrug. IMO keeping them together just makes it very likely that
> neither goes in.

I'm happy to hear your support for the limiting part of this - that's
encouraging.

> > > > For sanity checking, pgstat now includes the
> > > > pg_backend_memory_allocation view showing memory allocations made
> > > > by the backend process. This view includes a scan of the top
> > > > memory context, so it compares memory allocations reported
> > > > through pgstat with actual allocations. The two should match.
> > >
> > > Can't you just do this using the existing
> > > pg_backend_memory_contexts view?
> >
> > Not and get a number that you can compare to the local backend number
> > due to the query itself happening and performing allocations and
> > creating new contexts. We wanted to be able to show that we are
> > accounting correctly and exactly matching to what the memory context
> > system is tracking.
>
> I think creating a separate view for this will be confusing for users,
> without really much to show for it. Excluding the current query would
> be useful for other cases as well, why don't we provide a way to do
> that with pg_backend_memory_contexts?

Both of these feel very much like power-user views, so I'm not terribly
concerned about users getting confused. That said, we could possibly
drop this as a view and just have the functions which are then used in
the regression tests to catch things should the numbers start to
diverge.

Having a way to get the memory contexts which don't include the
currently running query might be interesting too but is rather
independent of what this patch is trying to do. The only reason we
collected up the memory-context info is as a cross-check to the
tracking that we're doing, and while the existing memory-context view
is just fine for a lot of other things, it doesn't work for that
specific need.

Thanks,

Stephen
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi,

On 2023-11-06 13:02:50 -0500, Stephen Frost wrote:
> > > The max_total_memory limit is checked whenever the global counters
> > > are updated. There is no special error handling if a memory
> > > allocation exceeds the global limit. That allocation returns a NULL
> > > for malloc style allocations or an ENOMEM for shared memory
> > > allocations. Postgres has existing mechanisms for dealing with out
> > > of memory conditions.
> >
> > I still think it's extremely unwise to do tracking of memory and
> > limiting of memory in one patch. You should work towards an
> > acceptable patch that just tracks memory usage in as simple and low
> > overhead a way as possible. Then we later can build on that.
>
> Frankly, while tracking is interesting, the limiting is the feature
> that's needed more urgently imv.

I agree that we need limiting, but that the tracking needs to be very
robust for that to be usable.

> We could possibly split it up but there's an awful lot of the same
> code that would need to be changed and that seems less than ideal.
> Still, we'll look into this.

Shrug. IMO keeping them together just makes it very likely that neither
goes in.

> > > For sanity checking, pgstat now includes the
> > > pg_backend_memory_allocation view showing memory allocations made
> > > by the backend process. This view includes a scan of the top memory
> > > context, so it compares memory allocations reported through pgstat
> > > with actual allocations. The two should match.
> >
> > Can't you just do this using the existing pg_backend_memory_contexts
> > view?
>
> Not and get a number that you can compare to the local backend number
> due to the query itself happening and performing allocations and
> creating new contexts. We wanted to be able to show that we are
> accounting correctly and exactly matching to what the memory context
> system is tracking.

I think creating a separate view for this will be confusing for users,
without really much to show for it. Excluding the current query would be
useful for other cases as well, why don't we provide a way to do that
with pg_backend_memory_contexts?

Greetings,

Andres Freund
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Greetings, * Andres Freund (and...@anarazel.de) wrote: > On 2023-10-31 17:11:26 +, John Morris wrote: > > Postgres memory reservations come from multiple sources. > > > > * Malloc calls made by the Postgres memory allocators. > > * Static shared memory created by the postmaster at server startup, > > * Dynamic shared memory created by the backends. > > * A fixed amount (1Mb) of “initial” memory reserved whenever a process > > starts up. > > > > Each process also maintains an accurate count of its actual memory > > allocations. The process-private variable “my_memory” holds the total > > allocations for that process. Since there can be no contention, each process > > updates its own counters very efficiently. > > I think this will introduce measurable overhead in low concurrency cases and > very substantial overhead / contention when there is a decent amount of > concurrency. This makes all memory allocations > 1MB contend on a single > atomic. Massive amount of energy have been spent writing multi-threaded > allocators that have far less contention than this - the current state is to > never contend on shared resources on any reasonably common path. This gives > away one of the few major advantages our process model has away. We could certainly adjust the size of each reservation to reduce the frequency of having to hit the atomic. Specific suggestions about how to benchmark and see the regression that's being worried about here would be great. Certainly my hope has generally been that when we do a larger allocation, it's because we're about to go do a bunch of work, meaning that hopefully the time spent updating the atomic is minor overall. > The patch doesn't just introduce contention when limiting is enabled - it > introduces it even when memory usage is just tracked. It makes absolutely no > sense to have a single contended atomic in that case - just have a per-backend > variable in shared memory that's updated. 
It's *WAY* cheaper to compute the > overall memory usage during querying than to keep a running total in shared > memory. Agreed that we should avoid the contention when limiting isn't being used, certainly easy to do so, and had actually intended to but that seems to have gotten lost along the way. Will fix. Other than that change inside update_global_reservation though, the code for reporting per-backend memory usage and querying it does work as you're outlining above inside the stats system. That said- I just want to confirm that you would agree that querying the amount of memory used by every backend, to add it all up to enforce an overall limit, surely isn't something we're going to want to do during an allocation and that having a global atomic for that is better, right? > > Pgstat now includes global memory counters. These shared memory counters > > represent the sum of all reservations made by all Postgres processes. For > > efficiency, these global counters are only updated when new reservations > > exceed a threshold, currently 1 Mb for each process. Consequently, the > > global reservation counters are approximate totals which may differ from the > > actual allocation totals by up to 1 Mb per process. > > I see that you added them to the "cumulative" stats system - that doesn't > immediately makes sense to me - what you're tracking here isn't an > accumulating counter, it's something showing the current state, right? Yes, this is current state, not an accumulation. > > The max_total_memory limit is checked whenever the global counters are > > updated. There is no special error handling if a memory allocation exceeds > > the global limit. That allocation returns a NULL for malloc style > > allocations or an ENOMEM for shared memory allocations. Postgres has > > existing mechanisms for dealing with out of memory conditions. > > I still think it's extremely unwise to do tracking of memory and limiting of > memory in one patch. 
You should work towards an acceptable patch that just > tracks memory usage in an as simple and low overhead way as possible. Then we > later can build on that. Frankly, while tracking is interesting, the limiting is the feature that's needed more urgently imv. We could possibly split it up but there's an awful lot of the same code that would need to be changed and that seems less than ideal. Still, we'll look into this. > > For sanity checking, pgstat now includes the pg_backend_memory_allocation > > view showing memory allocations made by the backend process. This view > > includes a scan of the top memory context, so it compares memory allocations > > reported through pgstat with actual allocations. The two should match. > > Can't you just do this using the existing pg_backend_memory_contexts view? Not and get a number that you can compare to the local backend number due to the query itself happening and performing allocations and creating new contexts. We wanted to be able to show that we are accounting
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi, On 2023-10-31 17:11:26 +, John Morris wrote: > Postgres memory reservations come from multiple sources. > > * Malloc calls made by the Postgres memory allocators. > * Static shared memory created by the postmaster at server startup, > * Dynamic shared memory created by the backends. > * A fixed amount (1Mb) of “initial” memory reserved whenever a process > starts up. > > Each process also maintains an accurate count of its actual memory > allocations. The process-private variable “my_memory” holds the total > allocations for that process. Since there can be no contention, each process > updates its own counters very efficiently. I think this will introduce measurable overhead in low concurrency cases and very substantial overhead / contention when there is a decent amount of concurrency. This makes all memory allocations > 1MB contend on a single atomic. Massive amounts of energy have been spent writing multi-threaded allocators that have far less contention than this - the current state is to never contend on shared resources on any reasonably common path. This gives away one of the few major advantages our process model has. The patch doesn't just introduce contention when limiting is enabled - it introduces it even when memory usage is just tracked. It makes absolutely no sense to have a single contended atomic in that case - just have a per-backend variable in shared memory that's updated. It's *WAY* cheaper to compute the overall memory usage during querying than to keep a running total in shared memory. > Pgstat now includes global memory counters. These shared memory counters > represent the sum of all reservations made by all Postgres processes. For > efficiency, these global counters are only updated when new reservations > exceed a threshold, currently 1 Mb for each process. Consequently, the > global reservation counters are approximate totals which may differ from the > actual allocation totals by up to 1 Mb per process. 
I see that you added them to the "cumulative" stats system - that doesn't immediately make sense to me - what you're tracking here isn't an accumulating counter, it's something showing the current state, right? > The max_total_memory limit is checked whenever the global counters are > updated. There is no special error handling if a memory allocation exceeds > the global limit. That allocation returns a NULL for malloc style > allocations or an ENOMEM for shared memory allocations. Postgres has > existing mechanisms for dealing with out of memory conditions. I still think it's extremely unwise to do tracking of memory and limiting of memory in one patch. You should work towards an acceptable patch that just tracks memory usage in an as simple and low overhead way as possible. Then we later can build on that. > For sanity checking, pgstat now includes the pg_backend_memory_allocation > view showing memory allocations made by the backend process. This view > includes a scan of the top memory context, so it compares memory allocations > reported through pgstat with actual allocations. The two should match. Can't you just do this using the existing pg_backend_memory_contexts view? > Performance-wise, there was no measurable impact with either pgbench or a > simple “SELECT * from series” query. That seems unsurprising - allocations aren't a major part of the work there, you'd have to regress by a lot for memory allocator changes to show a significant performance decrease. 
> diff --git a/src/test/regress/expected/opr_sanity.out > b/src/test/regress/expected/opr_sanity.out > index 7a6f36a6a9..6c813ec465 100644 > --- a/src/test/regress/expected/opr_sanity.out > +++ b/src/test/regress/expected/opr_sanity.out > @@ -468,9 +468,11 @@ WHERE proallargtypes IS NOT NULL AND >ARRAY(SELECT proallargtypes[i] > FROM generate_series(1, array_length(proallargtypes, 1)) g(i) > WHERE proargmodes IS NULL OR proargmodes[i] IN ('i', 'b', 'v')); > - oid | proname | proargtypes | proallargtypes | proargmodes > --+-+-++- > -(0 rows) > + oid | proname | proargtypes | proallargtypes > |proargmodes > +--+--+-+---+--- > + 9890 | pg_stat_get_memory_reservation | | > {23,23,20,20,20,20,20,20} | {i,o,o,o,o,o,o,o} > + 9891 | pg_get_backend_memory_allocation | | > {23,23,20,20,20,20,20}| {i,o,o,o,o,o,o} > +(2 rows) This indicates that your pg_proc entries are broken, they need to fixed rather than allowed here. Greetings, Andres Freund
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi, On 2023-10-24 09:39:42 +0700, Andrei Lepikhov wrote: > On 20/10/2023 19:39, Stephen Frost wrote: > Greetings, > > * Andrei Lepikhov (a.lepik...@postgrespro.ru) wrote: > > > The only issue I worry about is the uncertainty and clutter that can be > > > created by this feature. In the worst case, when we have a complex error > > > stack (including the extension's CATCH sections, exceptions in stored > > > procedures, etc.), the backend will throw the memory limit error > > > repeatedly. > > > > I'm not seeing what additional uncertainty or clutter there is- this is, > > again, exactly the same as what happens today on a system with > > overcommit disabled and I don't feel like we get a lot of complaints > > about this today. > > Maybe I missed something or see this feature from an alternate point of view > (as an extension developer), but overcommit is more useful so far: it kills > a process. In the case of postgres it doesn't just kill one process, it leads to *all* connections being terminated. > It means that after restart, the backend/background worker will have an > initial internal state. With this limit enabled, we need to remember that > each function call can cause an error, and we have to remember it using > static PG_CATCH sections where we must rearrange local variables to the > initial (?) state. So, it complicates development. You need to be aware of errors being thrown regardless of this feature, as out-of-memory errors can be encountered today already. There also are many other kinds of errors that can be thrown. Greetings, Andres Freund
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On 20/10/2023 19:39, Stephen Frost wrote: Greetings, * Andrei Lepikhov (a.lepik...@postgrespro.ru) wrote: The only issue I worry about is the uncertainty and clutter that can be created by this feature. In the worst case, when we have a complex error stack (including the extension's CATCH sections, exceptions in stored procedures, etc.), the backend will throw the memory limit error repeatedly. I'm not seeing what additional uncertainty or clutter there is- this is, again, exactly the same as what happens today on a system with overcommit disabled and I don't feel like we get a lot of complaints about this today. Maybe I missed something or see this feature from an alternate point of view (as an extension developer), but overcommit is more useful so far: it kills a process. It means that after restart, the backend/background worker will have an initial internal state. With this limit enabled, we need to remember that each function call can cause an error, and we have to remember it using static PG_CATCH sections where we must rearrange local variables to the initial (?) state. So, it complicates development. Of course, this limit is a good feature, but from my point of view, it would be better to kill a memory-consuming backend instead of throwing an error. At least for now, we don't have a technique to repeat query planning with chances to build a more effective plan. -- regards, Andrei Lepikhov Postgres Professional
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Greetings, * Andrei Lepikhov (a.lepik...@postgrespro.ru) wrote: > The only issue I worry about is the uncertainty and clutter that can be > created by this feature. In the worst case, when we have a complex error > stack (including the extension's CATCH sections, exceptions in stored > procedures, etc.), the backend will throw the memory limit error repeatedly. I'm not seeing what additional uncertainty or clutter there is- this is, again, exactly the same as what happens today on a system with overcommit disabled and I don't feel like we get a lot of complaints about this today. > Of course, one failed backend looks better than a surprisingly killed > postmaster, but the mix of different error reports and details looks > terrible and challenging to debug in the case of trouble. So, may we throw a > FATAL error if we reach this limit while handling an exception? I don't see why we'd do that when we can do better- we just fail whatever the ongoing query or transaction is and allow further requests on the same connection. We already support exactly that and it works really rather well and I don't see why we'd throw that away because there's a different way to get an OOM error. If you want to make the argument that we should throw FATAL on OOM when handling an exception, that's something you could argue independently of this effort already today, but I don't think you'll get agreement that it's an improvement. Thanks, Stephen signature.asc Description: PGP signature
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On 20/10/2023 05:06, Stephen Frost wrote: Greetings, * Andrei Lepikhov (a.lepik...@postgrespro.ru) wrote: On 19/10/2023 02:00, Stephen Frost wrote: * Andrei Lepikhov (a.lepik...@postgrespro.ru) wrote: On 29/9/2023 09:52, Andrei Lepikhov wrote: On 22/5/2023 22:59, reid.thomp...@crunchydata.com wrote: Attach patches updated to master. Pulled from patch 2 back to patch 1 a change that was also pertinent to patch 1. +1 to the idea, have doubts on the implementation. I have a question. I see the feature triggers ERROR on the exceeding of the memory limit. The superior PG_CATCH() section will handle the error. As I see, many such sections use memory allocations. What if some routine, like the CopyErrorData(), exceeds the limit, too? In this case, we could repeat the error until the top PG_CATCH(). Is this correct behaviour? Maybe to check in the exceeds_max_total_bkend_mem() for recursion and allow error handlers to slightly exceed this hard limit? By the patch in attachment I try to show which sort of problems I'm worrying about. In some PG_CATCH() sections we do CopyErrorData (allocate some memory) before aborting the transaction. So, the allocation error can move us out of this section before aborting. We await for soft ERROR message but will face more hard consequences. While it's an interesting idea to consider making exceptions to the limit, and perhaps we'll do that (or have some kind of 'reserve' for such cases), this isn't really any different than today, is it? We might have a malloc() failure in the main path, end up in PG_CATCH() and then try to do a CopyErrorData() and have another malloc() failure. If we can rearrange the code to make this less likely to happen, by doing a bit more work to free() resources used in the main path before trying to do new allocations, then, sure, let's go ahead and do that, but that's independent from this effort. I agree that rearranging efforts can be made independently. 
The code in the letter above was shown just as a demo of the case I'm worried about. IMO, the thing that should be implemented here is a recursion level for the memory limit. If processing the error, we fall into recursion with this limit - we should ignore it. I imagine custom extensions that use PG_CATCH() and allocate some data there. At least we can raise the level of error to FATAL. Ignoring such would defeat much of the point of this effort- which is to get to a position where we can say with some confidence that we're not going to go over some limit that the user has set and therefore not allow ourselves to end up getting OOM killed. These are all the same issues that already exist today on systems which don't allow overcommit too, there isn't anything new here in regards to these risks, so I'm not really keen to complicate this to deal with issues that are already there. Perhaps once we've got the basics in place then we could consider reserving some space for handling such cases.. but I don't think it'll actually be very clean and what if we have an allocation that goes beyond what that reserved space is anyway? Then we're in the same spot again where we have the choice of either failing the allocation in a less elegant way than we might like to handle that error, or risk getting outright kill'd by the kernel. Of those choices, sure seems like failing the allocation is the better way to go. I've got your point. The only issue I worry about is the uncertainty and clutter that can be created by this feature. In the worst case, when we have a complex error stack (including the extension's CATCH sections, exceptions in stored procedures, etc.), the backend will throw the memory limit error repeatedly. Of course, one failed backend looks better than a surprisingly killed postmaster, but the mix of different error reports and details looks terrible and challenging to debug in the case of trouble. 
So, may we throw a FATAL error if we reach this limit while handling an exception? -- regards, Andrey Lepikhov Postgres Professional
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Greetings, * Andres Freund (and...@anarazel.de) wrote: > On 2023-10-19 18:06:10 -0400, Stephen Frost wrote: > > Ignoring such would defeat much of the point of this effort- which is to > > get to a position where we can say with some confidence that we're not > > going to go over some limit that the user has set and therefore not > > allow ourselves to end up getting OOM killed. > > I think that is a good medium to long term goal. I do however think that we'd > be better off merging the visibility of memory allocations soon-ish and > implement the limiting later. There's a lot of hairy details to get right for > the latter, and even just having visibility will be a huge improvement. I agree that having the visibility will be a great improvement and perhaps could go in separately, but I don't know that I agree that the limits are going to be that much of an issue. In any case, there's been work ongoing on this and that'll be posted soon. I was just trying to address the general comment raised in this sub-thread here. > I think even patch 1 is doing too much at once. I doubt the DSM stuff is > quite right. Getting DSM right has certainly been tricky, along with other things, but we've been working towards, and continue to work towards, getting everything to line up nicely between memory context allocations of various types and the amounts which are being seen as malloc'd/free'd. Parts of this have also been reworked to allow us to see per-backend reservations as well as total reserved and to get those numbers able to be matched up inside of a given transaction using the statistics system. > I'm unconvinced it's a good idea to split the different types of memory > contexts out. That just exposes too much implementation detail stuff without a > good reason. DSM needs to be independent anyway ... 
as for the others, perhaps we could combine them, though that's pretty easily done later and for now it's been useful to see them split out as we've been working on the patch. > I think the overhead even just the tracking implies right now is likely too > high and needs to be optimized. It should be a single math operation, not > tracking things in multiple fields. I don't think pg_sub_u64_overflow() should > be in the path either, that suddenly adds conditional branches. You really > ought to look at the difference in assembly for the hot functions. This has been improved in the most recent work and we'll have that posted soon, probably best to hold off from larger review of this right now- as mentioned, I was just trying to address the specific question in this sub-thread since a new patch is coming soon. Thanks, Stephen signature.asc Description: PGP signature
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi, On 2023-10-19 18:06:10 -0400, Stephen Frost wrote: > Ignoring such would defeat much of the point of this effort- which is to > get to a position where we can say with some confidence that we're not > going to go over some limit that the user has set and therefore not > allow ourselves to end up getting OOM killed. I think that is a good medium to long term goal. I do however think that we'd be better off merging the visibility of memory allocations soon-ish and implement the limiting later. There's a lot of hairy details to get right for the latter, and even just having visibility will be a huge improvement. I think even patch 1 is doing too much at once. I doubt the DSM stuff is quite right. I'm unconvinced it's a good idea to split the different types of memory contexts out. That just exposes too much implementation detail stuff without a good reason. I think the overhead even just the tracking implies right now is likely too high and needs to be optimized. It should be a single math operation, not tracking things in multiple fields. I don't think pg_sub_u64_overflow() should be in the path either, that suddenly adds conditional branches. You really ought to look at the difference in assembly for the hot functions. Greetings, Andres Freund
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Greetings, * Andrei Lepikhov (a.lepik...@postgrespro.ru) wrote: > On 19/10/2023 02:00, Stephen Frost wrote: > > * Andrei Lepikhov (a.lepik...@postgrespro.ru) wrote: > > > On 29/9/2023 09:52, Andrei Lepikhov wrote: > > > > On 22/5/2023 22:59, reid.thomp...@crunchydata.com wrote: > > > > > Attach patches updated to master. > > > > > Pulled from patch 2 back to patch 1 a change that was also pertinent > > > > > to patch 1. > > > > +1 to the idea, have doubts on the implementation. > > > > > > > > I have a question. I see the feature triggers ERROR on the exceeding of > > > > the memory limit. The superior PG_CATCH() section will handle the error. > > > > As I see, many such sections use memory allocations. What if some > > > > routine, like the CopyErrorData(), exceeds the limit, too? In this case, > > > > we could repeat the error until the top PG_CATCH(). Is this correct > > > > behaviour? Maybe to check in the exceeds_max_total_bkend_mem() for > > > > recursion and allow error handlers to slightly exceed this hard limit? > > > > > By the patch in attachment I try to show which sort of problems I'm > > > worrying > > > about. In some PG_CATCH() sections we do CopyErrorData (allocate some > > > memory) before aborting the transaction. So, the allocation error can move > > > us out of this section before aborting. We await for soft ERROR message > > > but > > > will face more hard consequences. > > > > While it's an interesting idea to consider making exceptions to the > > limit, and perhaps we'll do that (or have some kind of 'reserve' for > > such cases), this isn't really any different than today, is it? We > > might have a malloc() failure in the main path, end up in PG_CATCH() and > > then try to do a CopyErrorData() and have another malloc() failure. 
> > > > If we can rearrange the code to make this less likely to happen, by > > doing a bit more work to free() resources used in the main path before > > trying to do new allocations, then, sure, let's go ahead and do that, > > but that's independent from this effort. > > I agree that rearranging efforts can be made independently. The code in the > letter above was shown just as a demo of the case I'm worried about. > IMO, the thing that should be implemented here is a recursion level for the > memory limit. If processing the error, we fall into recursion with this > limit - we should ignore it. > I imagine custom extensions that use PG_CATCH() and allocate some data > there. At least we can raise the level of error to FATAL. Ignoring such would defeat much of the point of this effort- which is to get to a position where we can say with some confidence that we're not going to go over some limit that the user has set and therefore not allow ourselves to end up getting OOM killed. These are all the same issues that already exist today on systems which don't allow overcommit too, there isn't anything new here in regards to these risks, so I'm not really keen to complicate this to deal with issues that are already there. Perhaps once we've got the basics in place then we could consider reserving some space for handling such cases.. but I don't think it'll actually be very clean and what if we have an allocation that goes beyond what that reserved space is anyway? Then we're in the same spot again where we have the choice of either failing the allocation in a less elegant way than we might like to handle that error, or risk getting outright kill'd by the kernel. Of those choices, sure seems like failing the allocation is the better way to go. Thanks, Stephen signature.asc Description: PGP signature
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On 19/10/2023 02:00, Stephen Frost wrote: Greetings, * Andrei Lepikhov (a.lepik...@postgrespro.ru) wrote: On 29/9/2023 09:52, Andrei Lepikhov wrote: On 22/5/2023 22:59, reid.thomp...@crunchydata.com wrote: Attach patches updated to master. Pulled from patch 2 back to patch 1 a change that was also pertinent to patch 1. +1 to the idea, have doubts on the implementation. I have a question. I see the feature triggers ERROR on the exceeding of the memory limit. The superior PG_CATCH() section will handle the error. As I see, many such sections use memory allocations. What if some routine, like the CopyErrorData(), exceeds the limit, too? In this case, we could repeat the error until the top PG_CATCH(). Is this correct behaviour? Maybe to check in the exceeds_max_total_bkend_mem() for recursion and allow error handlers to slightly exceed this hard limit? By the patch in attachment I try to show which sort of problems I'm worrying about. In some PG_CATCH() sections we do CopyErrorData (allocate some memory) before aborting the transaction. So, the allocation error can move us out of this section before aborting. We await for soft ERROR message but will face more hard consequences. While it's an interesting idea to consider making exceptions to the limit, and perhaps we'll do that (or have some kind of 'reserve' for such cases), this isn't really any different than today, is it? We might have a malloc() failure in the main path, end up in PG_CATCH() and then try to do a CopyErrorData() and have another malloc() failure. If we can rearrange the code to make this less likely to happen, by doing a bit more work to free() resources used in the main path before trying to do new allocations, then, sure, let's go ahead and do that, but that's independent from this effort. I agree that rearranging efforts can be made independently. The code in the letter above was shown just as a demo of the case I'm worried about. 
IMO, the thing that should be implemented here is a recursion level for the memory limit. If processing the error, we fall into recursion with this limit - we should ignore it. I imagine custom extensions that use PG_CATCH() and allocate some data there. At least we can raise the level of error to FATAL. -- regards, Andrey Lepikhov Postgres Professional
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Greetings, * Andrei Lepikhov (a.lepik...@postgrespro.ru) wrote: > On 29/9/2023 09:52, Andrei Lepikhov wrote: > > On 22/5/2023 22:59, reid.thomp...@crunchydata.com wrote: > > > Attach patches updated to master. > > > Pulled from patch 2 back to patch 1 a change that was also pertinent > > > to patch 1. > > +1 to the idea, have doubts on the implementation. > > > > I have a question. I see the feature triggers ERROR on the exceeding of > > the memory limit. The superior PG_CATCH() section will handle the error. > > As I see, many such sections use memory allocations. What if some > > routine, like the CopyErrorData(), exceeds the limit, too? In this case, > > we could repeat the error until the top PG_CATCH(). Is this correct > > behaviour? Maybe to check in the exceeds_max_total_bkend_mem() for > > recursion and allow error handlers to slightly exceed this hard limit? > By the patch in attachment I try to show which sort of problems I'm worrying > about. In some PG_CATCH() sections we do CopyErrorData (allocate some > memory) before aborting the transaction. So, the allocation error can move > us out of this section before aborting. We await for soft ERROR message but > will face more hard consequences. While it's an interesting idea to consider making exceptions to the limit, and perhaps we'll do that (or have some kind of 'reserve' for such cases), this isn't really any different than today, is it? We might have a malloc() failure in the main path, end up in PG_CATCH() and then try to do a CopyErrorData() and have another malloc() failure. If we can rearrange the code to make this less likely to happen, by doing a bit more work to free() resources used in the main path before trying to do new allocations, then, sure, let's go ahead and do that, but that's independent from this effort. Thanks! Stephen signature.asc Description: PGP signature
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On 29/9/2023 09:52, Andrei Lepikhov wrote: On 22/5/2023 22:59, reid.thomp...@crunchydata.com wrote: Attach patches updated to master. Pulled from patch 2 back to patch 1 a change that was also pertinent to patch 1. +1 to the idea, have doubts on the implementation. I have a question. I see the feature triggers ERROR on the exceeding of the memory limit. The superior PG_CATCH() section will handle the error. As I see, many such sections use memory allocations. What if some routine, like the CopyErrorData(), exceeds the limit, too? In this case, we could repeat the error until the top PG_CATCH(). Is this correct behaviour? Maybe to check in the exceeds_max_total_bkend_mem() for recursion and allow error handlers to slightly exceed this hard limit? By the patch in attachment I try to show which sort of problems I'm worrying about. In some PG_CATCH() sections we do CopyErrorData (allocate some memory) before aborting the transaction. So, the allocation error can move us out of this section before aborting. We await for soft ERROR message but will face more hard consequences. -- regards, Andrey Lepikhov Postgres Professional diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c index 33975687b3..3f992b8d92 100644 --- a/src/backend/executor/spi.c +++ b/src/backend/executor/spi.c @@ -291,10 +291,7 @@ _SPI_commit(bool chain) { ErrorData *edata; - /* Save error info in caller's context */ MemoryContextSwitchTo(oldcontext); - edata = CopyErrorData(); - FlushErrorState(); /* * Abort the failed transaction. If this fails too, we'll just @@ -302,6 +299,10 @@ _SPI_commit(bool chain) */ AbortCurrentTransaction(); + /* Save error info in caller's context */ + edata = CopyErrorData(); + FlushErrorState(); + /* ... 
and start a new one */ StartTransactionCommand(); if (chain) @@ -383,10 +384,7 @@ _SPI_rollback(bool chain) { ErrorData *edata; - /* Save error info in caller's context */ MemoryContextSwitchTo(oldcontext); - edata = CopyErrorData(); - FlushErrorState(); /* * Try again to abort the failed transaction. If this fails too, @@ -395,6 +393,10 @@ _SPI_rollback(bool chain) */ AbortCurrentTransaction(); + /* Save error info in caller's context */ + edata = CopyErrorData(); + FlushErrorState(); + /* ... and start a new one */ StartTransactionCommand(); if (chain) diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c index 12edc5772a..f9cf599026 100644 --- a/src/backend/replication/logical/reorderbuffer.c +++ b/src/backend/replication/logical/reorderbuffer.c @@ -2565,7 +2565,7 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, PG_CATCH(); { MemoryContext ecxt = MemoryContextSwitchTo(ccxt); - ErrorData *errdata = CopyErrorData(); + ErrorData *errdata; /* TODO: Encapsulate cleanup from the PG_TRY and PG_CATCH blocks */ if (iterstate) @@ -2579,6 +2579,8 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, */ AbortCurrentTransaction(); + errdata = CopyErrorData(); + /* make sure there's no cache pollution */ ReorderBufferExecuteInvalidations(txn->ninvalidations, txn->invalidations);
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On 22/5/2023 22:59, reid.thomp...@crunchydata.com wrote: Attach patches updated to master. Pulled from patch 2 back to patch 1 a change that was also pertinent to patch 1. +1 to the idea, have doubts on the implementation. I have a question. I see the feature triggers ERROR on the exceeding of the memory limit. The superior PG_CATCH() section will handle the error. As I see, many such sections use memory allocations. What if some routine, like the CopyErrorData(), exceeds the limit, too? In this case, we could repeat the error until the top PG_CATCH(). Is this correct behaviour? Maybe to check in the exceeds_max_total_bkend_mem() for recursion and allow error handlers to slightly exceed this hard limit? Also, the patch needs to be rebased. -- regards, Andrey Lepikhov Postgres Professional
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Mon, 2023-05-22 at 08:42 -0400, reid.thomp...@crunchydata.com wrote: More followup to the above. > > I experimented on my system regarding > "The simple query select * from generate_series(0, 1000) shows roughly > 18.9 % degradation on my test server." > > My laptop: > 32GB ram > 11th Gen Intel(R) Core(TM) i7-11850H 8 cores/16 threads @ 2.50GHz (Max Turbo > Frequency. 4.80 GHz ; Cache. 24 MB) > SSD -> Model: KXG60ZNV1T02 NVMe KIOXIA 1024GB (nvme) Hi. Ran through a few more tests on my system varying the initial_allocation_allowance and allocation_allowance_refill_qty from the current 1MB to 2, 4, 6, 8, 10 MB. Also realized that in my last tests/email I had posted percent difference rather than percent change. Turns out for the numbers that were being compared they're essentially the same, but I'm providing both for this set of tests. Ten runs for each comparison. Compared dev-max-memory set, dev-max-memory unset, master, and pg-stat-activity-backend-memory-allocated against master at each allocation value. Again, the test invokes psql -At -d postgres $connstr -P pager=off -c 'select * from generate_series(0, 1000)' 100 times on each of the 2 instances and calculates the AVG time and SD for the 100 runs. It then uses the AVG from each instance to calculate the percentage difference/change. These tests contain one code change not yet pushed to pgsql-hackers. In AllocSetReset() do not enter pgstat_report_allocated_bytes_decrease if no memory has been freed. Will format and post some pgbench test results in a separate email. 
Percent difference:

Results: difference-dev-max-memory-set VS master

 run | 1MB alloc  | 2MB alloc   | 4MB alloc    | 6MB alloc   | 8MB alloc   | 10MB alloc
-----+------------+-------------+--------------+-------------+-------------+------------
   1 | 4.2263%    | 3.03961%    | 0.0585808%   | 2.92451%    | 3.34694%    | 2.67771%
   2 | 3.55709%   | 3.92339%    | 2.29144%     | 3.2156%     | 2.06153%    | 2.86217%
   3 | 2.04389%   | 2.91866%    | 3.73463%     | 2.86161%    | 3.60992%    | 3.07293%
   4 | 3.1306%    | 3.64773%    | 2.38063%     | 1.84845%    | 4.87375%    | 4.16953%
   5 | 3.12556%   | 3.34537%    | 2.99052%     | 2.60538%    | 2.14825%    | 1.95454%
   6 | 2.20615%   | 2.12861%    | 2.85282%     | 2.43336%    | 2.31389%    | 3.21563%
   7 | 1.9954%    | 3.61371%    | 3.35543%     | 3.49821%    | 3.41526%    | 8.25753%
   8 | 2.46845%   | 2.57784%    | 3.13067%     | 3.67681%    | 2.89139%    | 3.6067%
   9 | 3.60092%   | 2.16164%    | 3.9976%      | 2.6144%     | 4.27892%    | 2.68998%
  10 | 2.55454%   | 2.39073%    | 3.09631%     | 3.24292%    | 1.9107%     | 1.76182%
 sum | 28.9089/10 | 29.74729/10 | 27.888631/10 | 28.92125/10 | 30.85055/10 | 34.26854/10
 avg | 2.89089    | 2.974729    | 2.7888631    | 2.892125    | 3.085055    | 3.426854

Results: difference-dev-max-memory-unset VS master

 run | 1MB alloc  | 2MB alloc   | 4MB alloc    | 6MB alloc   | 8MB alloc   | 10MB alloc
-----+------------+-------------+--------------+-------------+-------------+------------
   1 | 3.96616%   | 3.05528%    | 0.563267%    | 1.12075%    | 3.52398%    | 3.25641%
   2 | 3.11387%   | 3.12499%    | 1.1133%      | 4.86997%    | 2.11481%    | 1.11668%
   3 | 3.14506%   | 2.06193%    | 3.36034%     | 2.80644%    | 2.37822%    | 3.07669%
   4 | 2.81052%   | 3.18499%    | 2.70705%     | 2.27847%    | 2.78506%
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Mon, 2023-05-22 at 08:42 -0400, reid.thomp...@crunchydata.com wrote:
> On Wed, 2023-05-17 at 23:07 -0400, reid.thomp...@crunchydata.com wrote:
> > Thanks for the feedback.
> >
> > I'm planning to look at this.
> >
> > Is your benchmark something that I could utilize? I.e. is it a set of
> > scripts or a standard test from somewhere that I can duplicate?
> >
> > Thanks,
> > Reid

Attach patches updated to master. Pulled from patch 2 back to patch 1 a change that was also pertinent to patch 1.

From e6f8499e0270f2291494260bc341e8ad1411c2ae Mon Sep 17 00:00:00 2001
From: Reid Thompson
Date: Thu, 11 Aug 2022 12:01:25 -0400
Subject: [PATCH 1/2] Add tracking of backend memory allocated

Add tracking of backend memory allocated in total and by allocation type (aset, dsm, generation, slab) by process.

allocated_bytes tracks the current bytes of memory allocated to the backend process. aset_allocated_bytes, dsm_allocated_bytes, generation_allocated_bytes and slab_allocated_bytes track the allocation by type for the backend process. They are updated for the process as memory is malloc'd/freed. Memory allocated to items on the freelist is included.

Dynamic shared memory allocations are included only in the value displayed for the backend that created them; they are not included in the value for backends that are attached to them, to avoid double counting. DSM allocations that are not destroyed by the creating process prior to its exit are considered long lived and are tracked in a global counter global_dsm_allocated_bytes. We limit the floor of allocation counters to zero.

Created views pg_stat_global_memory_allocation and pg_stat_memory_allocation for access to these trackers.
---
 doc/src/sgml/monitoring.sgml                | 246 +
 src/backend/catalog/system_views.sql        |  34 +++
 src/backend/storage/ipc/dsm.c               |  11 +-
 src/backend/storage/ipc/dsm_impl.c          |  78 +++
 src/backend/storage/lmgr/proc.c             |   1 +
 src/backend/utils/activity/backend_status.c | 114 +
 src/backend/utils/adt/pgstatfuncs.c         |  84 +++
 src/backend/utils/init/miscinit.c           |   3 +
 src/backend/utils/mmgr/aset.c               |  17 ++
 src/backend/utils/mmgr/generation.c         |  15 ++
 src/backend/utils/mmgr/slab.c               |  22 ++
 src/include/catalog/pg_proc.dat             |  17 ++
 src/include/storage/proc.h                  |   2 +
 src/include/utils/backend_status.h          | 144 +++-
 src/test/regress/expected/rules.out         |  27 +++
 src/test/regress/expected/stats.out         |  36 +++
 src/test/regress/sql/stats.sql              |  20 ++
 17 files changed, 869 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index df5242fa80..cfc221fb2e 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -5757,6 +5757,252 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
+
+ pg_stat_memory_allocation
+
+
+ pg_stat_memory_allocation
+
+
+
+ The pg_stat_memory_allocation view will have one
+ row per server process, showing information related to the current memory
+ allocation of that process in total and by allocator type. Due to the
+ dynamic nature of memory allocations the allocated bytes values may not be
+ exact but should be sufficient for the intended purposes. Dynamic shared
+ memory allocations are included only in the value displayed for the backend
+ that created them, they are not included in the value for backends that are
+ attached to them to avoid double counting. Use
+ pg_size_pretty described in
+ to make these values more easily
+ readable.
+
+
+
+ pg_stat_memory_allocation View
+
+
+
+
+ Column Type
+
+
+ Description
+
+
+
+
+
+
+
+ datid oid
+
+
+ OID of the database this backend is connected to
+
+
+
+
+
+ pid integer
+
+
+ Process ID of this backend
+
+
+
+
+
+ allocated_bytes bigint
+
+
+ Memory currently allocated to this backend in bytes. This is the balance
+ of bytes allocated and freed by this backend. Dynamic shared memory
+ allocations are included only in the value displayed for the backend that
+ created them, they are not included in the value for backends that are
+ attached to them to avoid double counting.
+
+
+
+
+
+ aset_allocated_bytes bigint
+
+
+ Memory currently allocated to this backend in bytes via the allocation
+ set allocator.
+
+
+
+
+
+ dsm_allocated_bytes bigint
+
+
+ Memory currently allocated to this backend in bytes via the dynamic
+ shared memory allocator. Upon process exit, dsm allocations that have
+
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Wed, 2023-05-17 at 23:07 -0400, reid.thomp...@crunchydata.com wrote:
> Thanks for the feedback.
>
> I'm planning to look at this.
>
> Is your benchmark something that I could utilize? I.e. is it a set of
> scripts or a standard test from somewhere that I can duplicate?
>
> Thanks,
> Reid

Hi Arne,

Followup to the above. I experimented on my system regarding "The simple query select * from generate_series(0, 1000) shows roughly 18.9 % degradation on my test server."

My laptop:
32GB RAM
11th Gen Intel(R) Core(TM) i7-11850H, 8 cores/16 threads @ 2.50GHz (Max Turbo Frequency 4.80 GHz; Cache 24 MB)
SSD -> Model: KXG60ZNV1T02 NVMe KIOXIA 1024GB (nvme)

I updated to latest master and rebased my patch branches. I wrote a script to check out, build, install, init, and start up master, patch 1, patch 1+2, patch 1+2 as master, pg-stats-memory, dev-max-memory, and dev-max-memory-unset, configured with

    ../../configure --silent --prefix=/home/rthompso/src/git/postgres/install/${dir} --with-openssl --with-tcl --with-tclconfig=/usr/lib/tcl8.6 --with-perl --with-libxml --with-libxslt --with-python --with-gssapi --with-systemd --with-ldap --enable-nls

where $dir in master, pg-stats-memory, dev-max-memory, dev-max-memory-unset.

The only change made to the default postgresql.conf was to have the script add to the dev-max-memory instance the line "max_total_backend_memory = 2048" before startup. I did find one change in patch 2 that I pushed back into patch 1; this should only impact the pg-stats-memory instance. My .psqlrc turns timing on.

I created a script where I can pass two instances to be compared. It invokes

    psql -At -d postgres $connstr -P pager=off -c 'select * from generate_series(0, 1000)'

100 times on each of the 2 instances and calculates the AVG time and SD for the 100 runs. It then uses the AVG from each instance to calculate the percentage difference.

Depending on the instance, my results differ from master from negligible to ~5.5%.
Comparing master to itself had up to a ~2% variation. See below.

12 runs comparing dev-max-memory 2048 VS master. Shows ~3% to 5.5% variation.

 run | dev-max-memory 2048 | master  | difference
-----+---------------------+---------+-----------
   1 | 1307.14             | 1240.74 | 5.21218%
   2 | 1315.99             | 1245.64 | 5.4926%
   3 | 1317.39             | 1265.33 | 4.03141%
   4 | 1313.52             | 1256.69 | 4.42221%
   5 | 1329.98             | 1253.75 | 5.90077%
   6 | 1314.47             | 1245.6  | 5.38032%
   7 | 1309.7              | 1258.55 | 3.98326%
   8 | 1322.16             | 1248.94 | 5.69562%
   9 | 1320.15             | 1261.41 | 4.55074%
  10 | 1345.22             | 1280.96 | 4.8938%
  11 | 1296.03             | 1257.06 | 3.05277%
  12 | 1319.5              | 1252.34 | 5.22272%

12 runs showing dev-max-memory-unset VS master. Shows ~2.5% to 5% variation.

 run | dev-max-memory unset | master  | difference
-----+----------------------+---------+-----------
   1 | 1300.93              | 1235.12 | 5.18996%
   2 | 1293.57              | 1263.93 | 2.31789%
   3 | 1303.05              | 1258.11 | 3.50935%

Calculate average runtime percentage difference between VER dev-max-memory unset
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Wed, 2023-04-19 at 23:28 +, Arne Roland wrote:
> Thank you! I just tried our benchmark and got a performance degradation
> of around 28 %, which is way better than the last patch.
>
> The simple query select * from generate_series(0, 1000) shows
> roughly 18.9 % degradation on my test server.
>
> By raising initial_allocation_allowance and
> allocation_allowance_refill_qty I can get it to 16 % degradation.
> So most of the degradation seems to be independent from raising
> the allowance.
>
> I think we probably should investigate this further.
>
> Regards
> Arne

Hi Arne,

Thanks for the feedback.

I'm planning to look at this.

Is your benchmark something that I could utilize? I.e. is it a set of scripts or a standard test from somewhere that I can duplicate?

Thanks,
Reid
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Thank you! I just tried our benchmark and got a performance degradation of around 28 %, which is way better than the last patch.

The simple query select * from generate_series(0, 1000) shows roughly 18.9 % degradation on my test server.

By raising initial_allocation_allowance and allocation_allowance_refill_qty I can get it to 16 % degradation. So most of the degradation seems to be independent from raising the allowance.

I think we probably should investigate this further.

Regards
Arne
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Updated patches attached. Rebased to current master. Added additional columns to pg_stat_global_memory_allocation to summarize backend allocations by type. Updated documentation. Corrected some issues noted in review by John Morris. Added code re EXEC_BACKEND for dev-max-memory branch.

From 34514ae2bebe5e3ab2a0b5b680d3932b5e7706ee Mon Sep 17 00:00:00 2001
From: Reid Thompson
Date: Sat, 4 Jun 2022 22:23:59 -0400
Subject: [PATCH 2/2] Add the ability to limit the amount of memory that can be allocated to backends.

This builds on the work that adds backend memory allocated tracking.

Add GUC variable max_total_backend_memory. Specifies a limit to the amount of memory (in MB) that may be allocated to backends in total (i.e. this is not a per user or per backend limit). If unset, or set to 0, it is disabled. It is intended as a resource to help avoid the OOM killer on Linux and manage resources in general. A backend request that would exhaust max_total_backend_memory will be denied with an out of memory error, causing that backend's current query/transaction to fail. Further requests will not be allocated until dropping below the limit. Keep this in mind when setting this value. Due to the dynamic nature of memory allocations, this limit is not exact. This limit does not affect auxiliary backend processes.

Backend memory allocations are displayed in the pg_stat_memory_allocation and pg_stat_global_memory_allocation views.
---
 doc/src/sgml/config.sgml                      |  30 +++
 doc/src/sgml/monitoring.sgml                  |  38 +++-
 src/backend/catalog/system_views.sql          |   2 +
 src/backend/port/sysv_shmem.c                 |   9 +
 src/backend/postmaster/postmaster.c           |   5 +
 src/backend/storage/ipc/dsm_impl.c            |  18 ++
 src/backend/storage/lmgr/proc.c               |  45 +
 src/backend/utils/activity/backend_status.c   | 183 ++
 src/backend/utils/adt/pgstatfuncs.c           |  16 +-
 src/backend/utils/hash/dynahash.c             |   3 +-
 src/backend/utils/init/miscinit.c             |   8 +
 src/backend/utils/misc/guc_tables.c           |  11 ++
 src/backend/utils/misc/postgresql.conf.sample |   3 +
 src/backend/utils/mmgr/aset.c                 |  33
 src/backend/utils/mmgr/generation.c           |  16 ++
 src/backend/utils/mmgr/slab.c                 |  15 +-
 src/include/catalog/pg_proc.dat               |   6 +-
 src/include/storage/proc.h                    |   7 +
 src/include/utils/backend_status.h            | 120 ++--
 src/test/regress/expected/rules.out           |   4 +-
 20 files changed, 537 insertions(+), 35 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index bcc49aec45..4c735e180f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2133,6 +2133,36 @@ include_dir 'conf.d'
+
+ max_total_backend_memory (integer)
+
+ max_total_backend_memory configuration parameter
+
+
+
+Specifies a limit to the amount of memory (MB) that may be allocated to
+backends in total (i.e. this is not a per user or per backend limit).
+If unset, or set to 0 it is disabled. At database startup
+max_total_backend_memory is reduced by shared_memory_size_mb
+(includes shared buffers and other memory required for initialization).
+Each backend process is initialized with a 1MB local allowance which
+also reduces max_total_bkend_mem_bytes_available. Keep this in mind
+when setting this value. A backend request that would exhaust the limit
+will be denied with an out of memory error causing that backend's
+current query/transaction to fail. Further requests will not be
+allocated until dropping below the limit. This limit does not affect
+auxiliary backend processes
+or the postmaster process.
+Backend memory allocations (allocated_bytes) are
+displayed in the
+pg_stat_memory_allocation
+view. Due to the dynamic nature of memory allocations, this limit is
+not exact.
+
+
+
+
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 70b3441412..704a75bd6e 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -5704,10 +5704,7 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
 Memory currently allocated to this backend in bytes. This is the balance
- of bytes allocated and freed by this backend. Dynamic shared memory
- allocations are included only in the value displayed for the backend that
- created them, they are not included in the value for backends that are
- attached to them to avoid double counting.
+ of bytes allocated and freed by this backend.
@@ -5824,6 +5821,39 @@ SELECT pid, wait_event_type, wait_event
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Updated patches attached.

pg-stat-activity-backend-memory-allocated

DSM allocations created by a process and not destroyed prior to its exit are considered long lived and are tracked in global_dsm_allocated_bytes.

Created 2 new system views (see below):

pg_stat_global_memory_allocation view displays datid, shared_memory_size, shared_memory_size_in_huge_pages, global_dsm_allocated_bytes. shared_memory_size and shared_memory_size_in_huge_pages display the calculated read only values for these GUCs.

pg_stat_memory_allocation view: migrated allocated_bytes out of the pg_stat_activity view into this view. pg_stat_memory_allocation also contains a breakdown of allocation by allocator type (aset, dsm, generation, slab). The view displays datid, pid, allocated_bytes, aset_allocated_bytes, dsm_allocated_bytes, generation_allocated_bytes, slab_allocated_bytes by process.

Reduced calls to initialize allocation counters by moving the initialization call into InitPostmasterChild.

postgres=# select * from pg_stat_global_memory_allocation;
 datid | shared_memory_size | shared_memory_size_in_huge_pages | global_dsm_allocated_bytes
-------+--------------------+----------------------------------+----------------------------
     5 | 192MB              |                               96 |                    1048576
(1 row)

postgres=# select * from pg_stat_memory_allocation;
 datid |  pid   | allocated_bytes | aset_allocated_bytes | dsm_allocated_bytes | generation_allocated_bytes | slab_allocated_bytes
-------+--------+-----------------+----------------------+---------------------+----------------------------+----------------------
       | 981842 |          771512 |               771512 |                   0 |                          0 |                    0
       | 981843 |          736696 |               736696 |                   0 |                          0 |                    0
     5 | 981913 |         4274792 |              4274792 |                   0 |                          0 |                    0
       | 981838 |          107216 |               107216 |                   0 |                          0 |                    0
       | 981837 |          123600 |               123600 |                   0 |                          0 |                    0
       | 981841 |          107216 |               107216 |                   0 |                          0 |                    0
(6 rows)

postgres=# select ps.datid, ps.pid, state, application_name, backend_type, pa.* from pg_stat_activity ps join pg_stat_memory_allocation pa on (pa.pid = ps.pid) order by dsm_allocated_bytes, pa.pid;
 datid |  pid   | state  | application_name |         backend_type         | datid |  pid   | allocated_bytes | aset_allocated_bytes | dsm_allocated_bytes | generation_allocated_bytes | slab_allocated_bytes
-------+--------+--------+------------------+------------------------------+-------+--------+-----------------+----------------------+---------------------+----------------------------+----------------------
       | 981837 |        |                  | checkpointer                 |       | 981837 |          123600 |               123600 |                   0 |                          0 |                    0
       | 981838 |        |                  | background writer            |       | 981838 |          107216 |               107216 |                   0 |                          0 |                    0
       | 981841 |        |                  | walwriter                    |       | 981841 |          107216 |               107216 |                   0 |                          0 |                    0
       | 981842 |        |                  | autovacuum launcher          |       | 981842 |          771512 |               771512 |                   0 |                          0 |                    0
       | 981843 |        |                  | logical replication launcher |       | 981843 |          736696 |               736696 |                   0 |                          0 |                    0
     5 | 981913 | active | psql             | client backend               |     5 | 981913 |         5390864 |              5382824 |                   0 |                       8040 |                    0
(6 rows)

dev-max-memory

Include shared_memory_size in max_total_backend_memory calculations. max_total_backend_memory is reduced by shared_memory_size at startup. The local allowance is refilled, when consumed, from the global max_total_bkend_mem_bytes_available.

pg_stat_global_memory_allocation view: add columns max_total_backend_memory_bytes, max_total_bkend_mem_bytes_available.
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On 2023-03-02 14:41:26 -0500, reid.thomp...@crunchydata.com wrote:
> Patch has been rebased to master.

Quite a few prior review comments seem to not have been addressed. There's not much point in posting new versions without that.

I think there's zero chance 0002 can make it into 16. If 0001 is cleaned up, I can see a path.
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Mon, 2023-02-13 at 16:26 -0800, Andres Freund wrote:
> Hi,
>
> The tests recently started to fail:
> https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest%2F42%2F3867
>
> I marked this as waiting on author.
>
> Greetings,
>
> Andres Freund

Patch has been rebased to master. The memory limiting portion (patch 0002-*) has been refactored to utilize a shared counter for total memory allocation along with backend-local allowances that are initialized at process startup and refilled from the central counter upon being used up. Freed memory is accumulated and returned to the shared counter upon meeting a threshold and/or upon process exit. At this point I arbitrarily picked 1MB as the initial allowance and return threshold.

Thanks,
Reid

From e044bacedab503d1cd732146e1b9947406191bb6 Mon Sep 17 00:00:00 2001
From: Reid Thompson
Date: Sat, 4 Jun 2022 22:23:59 -0400
Subject: [PATCH 2/2] Add the ability to limit the amount of memory that can be allocated to backends.

This builds on the work that adds backend memory allocated to pg_stat_activity.

Add GUC variable max_total_backend_memory. Specifies a limit to the amount of memory (in MB) that may be allocated to backends in total (i.e. this is not a per user or per backend limit). If unset, or set to 0, it is disabled. It is intended as a resource to help avoid the OOM killer on Linux and manage resources in general. A backend request that would push the total over the limit will be denied with an out of memory error, causing that backend's current query/transaction to fail. Due to the dynamic nature of memory allocations, this limit is not exact. If within 1.5MB of the limit and two backends request 1MB each at the same time, both may be allocated and exceed the limit. Further requests will not be allocated until dropping below the limit. Keep this in mind when setting this value. This limit does not affect auxiliary backend processes. Backend memory allocations are displayed in the pg_stat_activity view.
---
 doc/src/sgml/config.sgml                      |  26 ++
 src/backend/postmaster/autovacuum.c           |   8 +-
 src/backend/postmaster/postmaster.c           |  17 +-
 src/backend/postmaster/syslogger.c            |   4 +-
 src/backend/storage/ipc/dsm_impl.c            |  35 ++-
 src/backend/storage/lmgr/proc.c               |   3 +
 src/backend/utils/activity/backend_status.c   | 223 +-
 src/backend/utils/misc/guc_tables.c           |  11 +
 src/backend/utils/misc/postgresql.conf.sample |   3 +
 src/backend/utils/mmgr/aset.c                 |  43 +++-
 src/backend/utils/mmgr/generation.c           |  21 +-
 src/backend/utils/mmgr/slab.c                 |  21 +-
 src/include/storage/proc.h                    |   7 +
 src/include/utils/backend_status.h            | 222 +++--
 14 files changed, 560 insertions(+), 84 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e5c41cc6c6..1bff68b1ec 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2113,6 +2113,32 @@ include_dir 'conf.d'
+
+ max_total_backend_memory (integer)
+
+ max_total_backend_memory configuration parameter
+
+
+
+Specifies a limit to the amount of memory (MB) that may be allocated to
+backends in total (i.e. this is not a per user or per backend limit).
+If unset, or set to 0 it is disabled. A backend request that would
+push the total over the limit will be denied with an out of memory
+error causing that backend's current query/transaction to fail. Due to
+the dynamic nature of memory allocations, this limit is not exact. If
+within 1.5MB of the limit and two backends request 1MB each at the same
+time both may be allocated, and exceed the limit. Further requests will
+not be allocated until dropping below the limit. Keep this in mind when
+setting this value. This limit does not affect auxiliary backend
+processes . Backend memory
+allocations (allocated_bytes) are displayed in the
+pg_stat_activity
+view.
+
+
+
+
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 59c9bcf8c4..ee03d08dd9 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -407,8 +407,8 @@ StartAutoVacLauncher(void)
 #ifndef EXEC_BACKEND
 		case 0:
-			/* Zero allocated bytes to avoid double counting parent allocation */
-			pgstat_zero_my_allocated_bytes();
+			/* Init allocated bytes to avoid double counting parent allocation */
+			pgstat_init_allocated_bytes();
 			/* in postmaster child ... */
 			InitPostmasterChild();
@@ -1488,8 +1488,8 @@ StartAutoVacWorker(void)
 #ifndef EXEC_BACKEND
 		case 0:
-			/* Zero allocated bytes to avoid double counting parent allocation */
-			pgstat_zero_my_allocated_bytes();
+
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi,

On 2023-01-26 15:27:20 -0500, Reid Thompson wrote:
> Yes, just a rebase. There is still work to be done per earlier in the
> thread.

The tests recently started to fail:
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest%2F42%2F3867

> I do want to follow up and note re palloc/pfree vs malloc/free that the
> tracking code (0001-Add-tracking-...) is not tracking palloc/pfree but is
> explicitly tracking malloc/free. Not every palloc/pfree call executes the
> tracking code, only those where the path followed includes malloc() or
> free(). Routine palloc() calls fulfilled from the context's
> freelist/emptyblocks/freeblock/etc and pfree() calls not invoking free()
> avoid the tracking code.

Sure, but we create a lot of memory contexts, so that's not a whole lot of comfort.

I marked this as waiting on author.

Greetings,

Andres Freund
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Regarding the shared counter noted here,

> What you could do is to have a single, imprecise, shared counter for the total
> memory allocation, and have a backend-local "allowance". When the allowance is
> used up, refill it from the shared counter (a single atomic op).

Is there a preferred or suggested location to put variables like this? Perhaps a current variable to use as a reference?

Thanks,
Reid
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Mon, 2023-01-23 at 12:31 -0800, Andres Freund wrote:
> Hi,
>
> I think it's basically still waiting on author, until the O(N) cost is gone
> from the overflow limit check.
>
> Greetings,
>
> Andres Freund

Yes, just a rebase. There is still work to be done per earlier in the thread.

I do want to follow up and note, re palloc/pfree vs malloc/free, that the tracking code (0001-Add-tracking-...) is not tracking palloc/pfree but is explicitly tracking malloc/free. Not every palloc/pfree call executes the tracking code, only those where the path followed includes malloc() or free(). Routine palloc() calls fulfilled from the context's freelist/emptyblocks/freeblock/etc and pfree() calls not invoking free() avoid the tracking code.

Thanks,
Reid
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi,

On 2023-01-23 10:48:38 -0500, Reid Thompson wrote:
> On Thu, 2023-01-19 at 16:50 +0530, vignesh C wrote:
> > The patch does not apply on top of HEAD as in [1], please post a rebased
> > patch:
> >
> > Regards,
> > Vignesh
>
> rebased patch attached

I think it's basically still waiting on author, until the O(N) cost is gone from the overflow limit check.

Greetings,

Andres Freund
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Thu, 2023-01-19 at 16:50 +0530, vignesh C wrote:
> The patch does not apply on top of HEAD as in [1], please post a rebased
> patch:
>
> Regards,
> Vignesh

rebased patch attached

Thanks,
Reid

From b32a346d6e0e00c568e9a285ad15fc2703998c26 Mon Sep 17 00:00:00 2001
From: Reid Thompson
Date: Sat, 4 Jun 2022 22:23:59 -0400
Subject: [PATCH 2/2] Add the ability to limit the amount of memory that can be allocated to backends.

This builds on the work that adds backend memory allocated to pg_stat_activity.

Add GUC variable max_total_backend_memory. Specifies a limit to the amount of memory (in MB) that may be allocated to backends in total (i.e. this is not a per user or per backend limit). If unset, or set to 0 it is disabled. It is intended as a resource to help avoid the OOM killer on LINUX and manage resources in general. A backend request that would push the total over the limit will be denied with an out of memory error causing that backend's current query/transaction to fail. Due to the dynamic nature of memory allocations, this limit is not exact. If within 1.5MB of the limit and two backends request 1MB each at the same time both may be allocated, and exceed the limit. Further requests will not be allocated until dropping below the limit. Keep this in mind when setting this value. This limit does not affect auxiliary backend processes. Backend memory allocations are displayed in the pg_stat_activity view.
---
 doc/src/sgml/config.sgml                      |  26 +
 src/backend/storage/ipc/dsm_impl.c            |  12 ++
 src/backend/utils/activity/backend_status.c   | 108 ++
 src/backend/utils/misc/guc_tables.c           |  11 ++
 src/backend/utils/misc/postgresql.conf.sample |   3 +
 src/backend/utils/mmgr/aset.c                 |  17 +++
 src/backend/utils/mmgr/generation.c           |   9 ++
 src/backend/utils/mmgr/slab.c                 |   9 +-
 src/include/utils/backend_status.h            |   3 +
 9 files changed, 197 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f985afc009..51ed4623be 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2113,6 +2113,32 @@ include_dir 'conf.d'
+
+ max_total_backend_memory (integer)
+
+ max_total_backend_memory configuration parameter
+
+
+
+Specifies a limit to the amount of memory (MB) that may be allocated to
+backends in total (i.e. this is not a per user or per backend limit).
+If unset, or set to 0 it is disabled. A backend request that would
+push the total over the limit will be denied with an out of memory
+error causing that backend's current query/transaction to fail. Due to
+the dynamic nature of memory allocations, this limit is not exact. If
+within 1.5MB of the limit and two backends request 1MB each at the same
+time both may be allocated, and exceed the limit. Further requests will
+not be allocated until dropping below the limit. Keep this in mind when
+setting this value. This limit does not affect auxiliary backend
+processes . Backend memory
+allocations (allocated_bytes) are displayed in the
+pg_stat_activity
+view.
+
+
+
+
diff --git a/src/backend/storage/ipc/dsm_impl.c b/src/backend/storage/ipc/dsm_impl.c
index 22885c7bd2..f7047107d5 100644
--- a/src/backend/storage/ipc/dsm_impl.c
+++ b/src/backend/storage/ipc/dsm_impl.c
@@ -254,6 +254,10 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size,
 		return true;
 	}
+	/* Do not exceed maximum allowed memory allocation */
+	if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size))
+		return false;
+
 	/*
 	 * Create new segment or open an existing one for attach.
 	 *
@@ -525,6 +529,10 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size,
 	int			flags = IPCProtection;
 	size_t		segsize;
+	/* Do not exceed maximum allowed memory allocation */
+	if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size))
+		return false;
+
 	/*
 	 * Allocate the memory BEFORE acquiring the resource, so that we don't
 	 * leak the resource if memory allocation fails.
@@ -719,6 +727,10 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size,
 		return true;
 	}
+	/* Do not exceed maximum allowed memory allocation */
+	if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size))
+		return false;
+
 	/* Create new segment or open an existing one for attach. */
 	if (op == DSM_OP_CREATE)
 	{
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index 7baf2db57d..da2b5fb042 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -45,6 +45,9 @@
 bool		pgstat_track_activities = false;
 int			pgstat_track_activity_query_size =
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Fri, 6 Jan 2023 at 00:19, Reid Thompson wrote:
>
> On Tue, 2023-01-03 at 16:22 +0530, vignesh C wrote:
> >
> > The patch does not apply on top of HEAD as in [1], please post a
> > rebased patch:
> > ...
> > Regards,
> > Vignesh
> >
>
> Attached is rebased patch, with some updates related to committed changes.

The patch does not apply on top of HEAD as in [1], please post a rebased patch:

=== Applying patches on top of PostgreSQL commit ID 48880840f18cb75fcaecc77b5e7816b92c27157b ===
=== applying patch ./0001-Add-tracking-of-backend-memory-allocated-to-pg_stat_.patch
patching file src/test/regress/expected/rules.out
Hunk #2 FAILED at 1875.
Hunk #4 FAILED at 2090.
2 out of 4 hunks FAILED -- saving rejects to file src/test/regress/expected/rules.out.rej

[1] - http://cfbot.cputube.org/patch_41_3867.log

Regards,
Vignesh
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi,

On 2023-01-13 09:15:10 -0500, Reid Thompson wrote:
> On Mon, 2023-01-09 at 18:31 -0800, Andres Freund wrote:
> > > Dynamic shared memory allocations are included only in the value displayed
> > > for the backend that created them, they are not included in the value for
> > > backends that are attached to them to avoid double counting.
> >
> > As mentioned before, I don't think accounting DSM this way makes sense.
>
> Understood, previously you noted 'There are a few uses of DSMs that track
> shared resources, with the biggest likely being the stats for relations
> etc'. I'd like to come up with a solution to address this; identifying the
> long term allocations to shared state and accounting for them such that they
> don't get 'lost' when the allocating backend exits. Any guidance or
> direction would be appreciated.

Tracking it as backend memory usage doesn't seem helpful to me, particularly
because some of it is for server wide data tracking (pgstats, some caches).
But that doesn't mean you couldn't track and report it separately.

> > > +/* --
> > > + * pgstat_get_all_memory_allocated() -
> > > + *
> > > + * Return a uint64 representing the current shared memory allocated to all
> > > + * backends. This looks directly at the BackendStatusArray, and so will
> > > + * provide current information regardless of the age of our transaction's
> > > + * snapshot of the status array.
> > > + * In the future we will likely utilize additional values - perhaps limit
> > > + * backend allocation by user/role, etc.
> > > + * --
> >
> > I think it's completely unfeasible to execute something as expensive as
> > pgstat_get_all_backend_memory_allocated() on every allocation. Like,
> > seriously, no.
>
> Ok. Do we check every nth allocation/try to implement a scheme of checking
> more often as we get closer to the declared max_total_bkend_mem?

I think it's just not acceptable to do O(connections) work as part of
something as critical as memory allocation. Even if amortized imo.

What you could do is to have a single, imprecise, shared counter for the
total memory allocation, and have a backend-local "allowance". When the
allowance is used up, refill it from the shared counter (a single atomic op).

But honestly, I think we first need to have the accounting for a while before
it makes sense to go for the memory limiting patch. And I doubt a single GUC
will suffice to make this usable.

> > And we absolutely definitely shouldn't just add CHECK_FOR_INTERRUPT() calls
> > into the middle of allocator code.
>
> I'm open to guidance/suggestions/pointers to remedying these.

Well, just don't have the CHECK_FOR_INTERRUPT(). Nor the O(N) operation. You
also can't do the ereport(WARNING) there, that itself allocates memory, and
could lead to recursion in some edge cases.

Greetings,

Andres Freund
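The shared-counter-plus-allowance scheme Andres describes could be sketched roughly as follows. This is not from the patch: the names (`shared_remaining`, `local_allowance`, `reserve_backend_mem`) and the refill quantum are hypothetical, and a real implementation would use PostgreSQL's own atomics (`pg_atomic_fetch_sub_u64` etc.) rather than C11 `<stdatomic.h>`. The point is that the common path touches only backend-local state, and the single atomic op happens only when the allowance runs dry:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define ALLOWANCE_REFILL (1024 * 1024)	/* refill quantum: 1 MB, illustrative */

/* Shared, imprecise counter of memory still available to all backends. */
static _Atomic int64_t shared_remaining;

/* Backend-local allowance; consumed without touching shared state. */
static int64_t local_allowance;

/*
 * Try to account for an allocation of "size" bytes.  Most calls only
 * decrement the local allowance; the shared counter is touched (one
 * atomic op) only when the allowance is used up.
 */
static bool
reserve_backend_mem(int64_t size)
{
	if (local_allowance >= size)
	{
		local_allowance -= size;
		return true;
	}

	/* Allowance exhausted: take enough for the request plus a quantum. */
	int64_t		want = size - local_allowance + ALLOWANCE_REFILL;
	int64_t		got = atomic_fetch_sub(&shared_remaining, want);

	if (got < want)
	{
		/*
		 * Over the limit: give the reservation back and fail.  The counter
		 * can briefly go negative here, which is exactly the "imprecise"
		 * part of the scheme.
		 */
		atomic_fetch_add(&shared_remaining, want);
		return false;
	}

	local_allowance += want - size;
	return true;
}
```

A caller in the allocator would turn a `false` return into the out-of-memory error; pfree/deallocation would credit `local_allowance` and spill back into `shared_remaining` once it grows past some threshold, keeping the fast path free of atomics in both directions.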
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Mon, 2023-01-09 at 18:31 -0800, Andres Freund wrote: > Hi, > > On 2023-01-05 13:44:20 -0500, Reid Thompson wrote: > > This new field displays the current bytes of memory allocated to the > > backend process. It is updated as memory for the process is > > malloc'd/free'd. Memory allocated to items on the freelist is included in > > the displayed value. > > It doesn't actually malloc/free. It tracks palloc/pfree. I will update the message > > > Dynamic shared memory allocations are included only in the value displayed > > for the backend that created them, they are not included in the value for > > backends that are attached to them to avoid double counting. > > As mentioned before, I don't think accounting DSM this way makes sense. Understood, previously you noted 'There are a few uses of DSMs that track shared resources, with the biggest likely being the stats for relations etc'. I'd like to come up with a solution to address this; identifying the long term allocations to shared state and accounting for them such that they don't get 'lost' when the allocating backend exits. Any guidance or direction would be appreciated. > > --- a/src/backend/postmaster/autovacuum.c > > +++ b/src/backend/postmaster/autovacuum.c > > @@ -407,6 +407,9 @@ StartAutoVacLauncher(void) > > > > #ifndef EXEC_BACKEND > > case 0: > > + /* Zero allocated bytes to avoid double counting parent > > allocation */ > > + pgstat_zero_my_allocated_bytes(); > > + > > /* in postmaster child ... */ > > InitPostmasterChild(); > > > > > @@ -1485,6 +1488,9 @@ StartAutoVacWorker(void) > > > > #ifndef EXEC_BACKEND > > case 0: > > + /* Zero allocated bytes to avoid double counting parent > > allocation */ > > + pgstat_zero_my_allocated_bytes(); > > + > > /* in postmaster child ... 
*/ > > InitPostmasterChild(); > > > > diff --git a/src/backend/postmaster/postmaster.c > > b/src/backend/postmaster/postmaster.c > > index eac3450774..24278e5c18 100644 > > --- a/src/backend/postmaster/postmaster.c > > +++ b/src/backend/postmaster/postmaster.c > > @@ -4102,6 +4102,9 @@ BackendStartup(Port *port) > > { > > free(bn); > > > > + /* Zero allocated bytes to avoid double counting parent > > allocation */ > > + pgstat_zero_my_allocated_bytes(); > > + > > /* Detangle from postmaster */ > > InitPostmasterChild(); > > > It doesn't at all seem right to call pgstat_zero_my_allocated_bytes() here, > before even InitPostmasterChild() is called. Nor does it seem right to add the > call to so many places. > > Note that this is before we even delete postmaster's memory, see e.g.: > /* >* If the PostmasterContext is still around, recycle the space; we don't >* need it anymore after InitPostgres completes. Note this does not > trash >* *MyProcPort, because ConnCreate() allocated that space with malloc() >* ... else we'd need to copy the Port data first. Also, subsidiary > data >* such as the username isn't lost either; see ProcessStartupPacket(). >*/ > if (PostmasterContext) > { > MemoryContextDelete(PostmasterContext); > PostmasterContext = NULL; > } > > calling pgstat_zero_my_allocated_bytes() before we do this will lead to > undercounting memory usage, afaict. > OK - I'll trace back through these and see if I can better locate and reduce the number of invocations. > > +/* Enum helper for reporting memory allocated bytes */ > > +enum allocation_direction > > +{ > > + PG_ALLOC_DECREASE = -1, > > + PG_ALLOC_IGNORE, > > + PG_ALLOC_INCREASE, > > +}; > > What's the point of this? > > > > +/* -- > > + * pgstat_report_allocated_bytes() - > > + * > > + * Called to report change in memory allocated for this backend. > > + * > > + * my_allocated_bytes initially points to local memory, making it safe to > > call > > + * this before pgstats has been initialized. 
allocation_direction is a > > + * positive/negative multiplier enum defined above. > > + * -- > > + */ > > +static inline void > > +pgstat_report_allocated_bytes(int64 allocated_bytes, int > > allocation_direction) > > I don't think this should take allocation_direction as a parameter, I'd make > it two different functions. Originally it was two functions, a suggestion was made in the thread to maybe consolidate them to a single function with a direction indicator, hence the above. I'm fine with converting it back to separate functions. > > > + if (allocation_direction == PG_ALLOC_DECREASE && > > + pg_sub_u64_overflow(*my_allocated_bytes, allocated_bytes, > > )) > > + { > > + *my_allocated_bytes = 0; > > + ereport(LOG, > > + errmsg("Backend %d deallocated %lld bytes,
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi, On 2023-01-05 13:44:20 -0500, Reid Thompson wrote: > From 0a6b152e0559a25033bd7d43eb0959432e0d Mon Sep 17 00:00:00 2001 > From: Reid Thompson > Date: Thu, 11 Aug 2022 12:01:25 -0400 > Subject: [PATCH 1/2] Add tracking of backend memory allocated to > pg_stat_activity > > This new field displays the current bytes of memory allocated to the > backend process. It is updated as memory for the process is > malloc'd/free'd. Memory allocated to items on the freelist is included in > the displayed value. It doesn't actually malloc/free. It tracks palloc/pfree. > Dynamic shared memory allocations are included only in the value displayed > for the backend that created them, they are not included in the value for > backends that are attached to them to avoid double counting. As mentioned before, I don't think accounting DSM this way makes sense. > --- a/src/backend/postmaster/autovacuum.c > +++ b/src/backend/postmaster/autovacuum.c > @@ -407,6 +407,9 @@ StartAutoVacLauncher(void) > > #ifndef EXEC_BACKEND > case 0: > + /* Zero allocated bytes to avoid double counting parent > allocation */ > + pgstat_zero_my_allocated_bytes(); > + > /* in postmaster child ... */ > InitPostmasterChild(); > @@ -1485,6 +1488,9 @@ StartAutoVacWorker(void) > > #ifndef EXEC_BACKEND > case 0: > + /* Zero allocated bytes to avoid double counting parent > allocation */ > + pgstat_zero_my_allocated_bytes(); > + > /* in postmaster child ... 
*/ > InitPostmasterChild(); > > diff --git a/src/backend/postmaster/postmaster.c > b/src/backend/postmaster/postmaster.c > index eac3450774..24278e5c18 100644 > --- a/src/backend/postmaster/postmaster.c > +++ b/src/backend/postmaster/postmaster.c > @@ -4102,6 +4102,9 @@ BackendStartup(Port *port) > { > free(bn); > > + /* Zero allocated bytes to avoid double counting parent > allocation */ > + pgstat_zero_my_allocated_bytes(); > + > /* Detangle from postmaster */ > InitPostmasterChild(); It doesn't at all seem right to call pgstat_zero_my_allocated_bytes() here, before even InitPostmasterChild() is called. Nor does it seem right to add the call to so many places. Note that this is before we even delete postmaster's memory, see e.g.: /* * If the PostmasterContext is still around, recycle the space; we don't * need it anymore after InitPostgres completes. Note this does not trash * *MyProcPort, because ConnCreate() allocated that space with malloc() * ... else we'd need to copy the Port data first. Also, subsidiary data * such as the username isn't lost either; see ProcessStartupPacket(). */ if (PostmasterContext) { MemoryContextDelete(PostmasterContext); PostmasterContext = NULL; } calling pgstat_zero_my_allocated_bytes() before we do this will lead to undercounting memory usage, afaict. > +/* Enum helper for reporting memory allocated bytes */ > +enum allocation_direction > +{ > + PG_ALLOC_DECREASE = -1, > + PG_ALLOC_IGNORE, > + PG_ALLOC_INCREASE, > +}; What's the point of this? > +/* -- > + * pgstat_report_allocated_bytes() - > + * > + * Called to report change in memory allocated for this backend. > + * > + * my_allocated_bytes initially points to local memory, making it safe to > call > + * this before pgstats has been initialized. allocation_direction is a > + * positive/negative multiplier enum defined above. 
> + * -- > + */ > +static inline void > +pgstat_report_allocated_bytes(int64 allocated_bytes, int > allocation_direction) I don't think this should take allocation_direction as a parameter, I'd make it two different functions. > +{ > + uint64 temp; > + > + /* > + * Avoid *my_allocated_bytes unsigned integer overflow on > + * PG_ALLOC_DECREASE > + */ > + if (allocation_direction == PG_ALLOC_DECREASE && > + pg_sub_u64_overflow(*my_allocated_bytes, allocated_bytes, > )) > + { > + *my_allocated_bytes = 0; > + ereport(LOG, > + errmsg("Backend %d deallocated %lld bytes, > exceeding the %llu bytes it is currently reporting allocated. Setting > reported to 0.", > +MyProcPid, (long long) > allocated_bytes, > +(unsigned long long) > *my_allocated_bytes)); We certainly shouldn't have an ereport in here. This stuff really needs to be cheap. > + } > + else > + *my_allocated_bytes += (allocated_bytes) * allocation_direction; Superfluous parens? > +/* -- > + * pgstat_get_all_memory_allocated() - > + * > + * Return a uint64
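The two-function shape Andres suggests, with the underflow clamp but without the `ereport`, might look like the sketch below. The names are hypothetical, and `my_allocated_bytes` is a plain local pointer standing in for the real backend-status field:

```c
#include <assert.h>
#include <stdint.h>

static uint64_t allocated_bytes_storage = 0;
static uint64_t *my_allocated_bytes = &allocated_bytes_storage;

/* Report newly allocated memory for this backend. */
static inline void
pgstat_report_inc_allocated_bytes(uint64_t bytes)
{
	*my_allocated_bytes += bytes;
}

/* Report freed memory, clamping at zero instead of wrapping around. */
static inline void
pgstat_report_dec_allocated_bytes(uint64_t bytes)
{
	if (*my_allocated_bytes < bytes)
		*my_allocated_bytes = 0;	/* would underflow: clamp silently */
	else
		*my_allocated_bytes -= bytes;
}
```

Splitting increase and decrease into separate functions removes both the direction-multiplier enum and the branch on it, and clamping silently avoids the allocation that an `ereport` in this hot path would itself trigger.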
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Tue, 2023-01-03 at 16:22 +0530, vignesh C wrote: > > The patch does not apply on top of HEAD as in [1], please post a > rebased patch: > ... > Regards, > Vignesh > Attached is rebased patch, with some updates related to committed changes. Thanks, Reid -- Reid Thompson Senior Software Engineer Crunchy Data, Inc. reid.thomp...@crunchydata.com www.crunchydata.com From 69516942b71d5d41850fbc00b971db7476c7a01a Mon Sep 17 00:00:00 2001 From: Reid Thompson Date: Sat, 4 Jun 2022 22:23:59 -0400 Subject: [PATCH 2/2] Add the ability to limit the amount of memory that can be allocated to backends. This builds on the work that adds backend memory allocated to pg_stat_activity. Add GUC variable max_total_backend_memory. Specifies a limit to the amount of memory (in MB) that may be allocated to backends in total (i.e. this is not a per user or per backend limit). If unset, or set to 0 it is disabled. It is intended as a resource to help avoid the OOM killer on LINUX and manage resources in general. A backend request that would push the total over the limit will be denied with an out of memory error causing that backend's current query/transaction to fail. Due to the dynamic nature of memory allocations, this limit is not exact. If within 1.5MB of the limit and two backends request 1MB each at the same time both may be allocated, and exceed the limit. Further requests will not be allocated until dropping below the limit. Keep this in mind when setting this value. This limit does not affect auxiliary backend processes. Backend memory allocations are displayed in the pg_stat_activity view. 
--- doc/src/sgml/config.sgml | 26 + src/backend/storage/ipc/dsm_impl.c| 12 ++ src/backend/utils/activity/backend_status.c | 108 ++ src/backend/utils/misc/guc_tables.c | 11 ++ src/backend/utils/misc/postgresql.conf.sample | 3 + src/backend/utils/mmgr/aset.c | 17 +++ src/backend/utils/mmgr/generation.c | 9 ++ src/backend/utils/mmgr/slab.c | 9 +- src/include/utils/backend_status.h| 3 + 9 files changed, 197 insertions(+), 1 deletion(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 05b3862d09..0362f26451 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -2079,6 +2079,32 @@ include_dir 'conf.d' + + max_total_backend_memory (integer) + + max_total_backend_memory configuration parameter + + + + +Specifies a limit to the amount of memory (MB) that may be allocated to +backends in total (i.e. this is not a per user or per backend limit). +If unset, or set to 0 it is disabled. A backend request that would +push the total over the limit will be denied with an out of memory +error causing that backend's current query/transaction to fail. Due to +the dynamic nature of memory allocations, this limit is not exact. If +within 1.5MB of the limit and two backends request 1MB each at the same +time both may be allocated, and exceed the limit. Further requests will +not be allocated until dropping below the limit. Keep this in mind when +setting this value. This limit does not affect auxiliary backend +processes . Backend memory +allocations (allocated_bytes) are displayed in the +pg_stat_activity +view. 
+ + + + diff --git a/src/backend/storage/ipc/dsm_impl.c b/src/backend/storage/ipc/dsm_impl.c index 22885c7bd2..f7047107d5 100644 --- a/src/backend/storage/ipc/dsm_impl.c +++ b/src/backend/storage/ipc/dsm_impl.c @@ -254,6 +254,10 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size, return true; } + /* Do not exceed maximum allowed memory allocation */ + if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size)) + return false; + /* * Create new segment or open an existing one for attach. * @@ -525,6 +529,10 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size, int flags = IPCProtection; size_t segsize; + /* Do not exceed maximum allowed memory allocation */ + if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size)) + return false; + /* * Allocate the memory BEFORE acquiring the resource, so that we don't * leak the resource if memory allocation fails. @@ -719,6 +727,10 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size, return true; } + /* Do not exceed maximum allowed memory allocation */ + if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size)) + return false; + /* Create new segment or open an existing one for attach. */ if (op == DSM_OP_CREATE) { diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c index 7baf2db57d..da2b5fb042 100644 ---
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Fri, 9 Dec 2022 at 20:41, Reid Thompson wrote:
>
> On Tue, 2022-12-06 at 10:32 -0800, Andres Freund wrote:
> > Hi,
> >
> > On 2022-11-26 22:22:15 -0500, Reid Thompson wrote:
> > > rebased/patched to current master && current pg-stat-activity-
> > > backend-memory-allocated
> >
> > This version fails to build with msvc, and builds with warnings on other
> > platforms.
> > https://cirrus-ci.com/build/5410696721072128
> > msvc:
> >
> > Andres Freund
>
> updated patches

The patch does not apply on top of HEAD as in [1], please post a rebased patch:

=== Applying patches on top of PostgreSQL commit ID 92957ed98c5c565362ce665266132a7f08f6b0c0 ===
=== applying patch ./0001-Add-tracking-of-backend-memory-allocated-to-pg_stat_.patch
...
patching file src/backend/utils/mmgr/slab.c
Hunk #1 succeeded at 69 (offset 16 lines).
Hunk #2 succeeded at 414 (offset 175 lines).
Hunk #3 succeeded at 436 with fuzz 2 (offset 176 lines).
Hunk #4 FAILED at 286.
Hunk #5 succeeded at 488 (offset 186 lines).
Hunk #6 FAILED at 381.
Hunk #7 FAILED at 554.
3 out of 7 hunks FAILED -- saving rejects to file src/backend/utils/mmgr/slab.c.rej

[1] - http://cfbot.cputube.org/patch_41_3867.log

Regards,
Vignesh
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Tue, 2022-12-06 at 10:32 -0800, Andres Freund wrote: > Hi, > > On 2022-11-26 22:22:15 -0500, Reid Thompson wrote: > > rebased/patched to current master && current pg-stat-activity- > > backend-memory-allocated > > This version fails to build with msvc, and builds with warnings on > other > platforms. > https://cirrus-ci.com/build/5410696721072128 > msvc: > > Andres Freund updated patches -- Reid Thompson Senior Software Engineer Crunchy Data, Inc. reid.thomp...@crunchydata.com www.crunchydata.com From e48292eaf402bfe397f1c2fdc1b3efd8cd0a9137 Mon Sep 17 00:00:00 2001 From: Reid Thompson Date: Sat, 4 Jun 2022 22:23:59 -0400 Subject: [PATCH 2/2] Add the ability to limit the amount of memory that can be allocated to backends. This builds on the work that adds backend memory allocated to pg_stat_activity. Add GUC variable max_total_backend_memory. Specifies a limit to the amount of memory (in MB) that may be allocated to backends in total (i.e. this is not a per user or per backend limit). If unset, or set to 0 it is disabled. It is intended as a resource to help avoid the OOM killer on LINUX and manage resources in general. A backend request that would push the total over the limit will be denied with an out of memory error causing that backend's current query/transaction to fail. Due to the dynamic nature of memory allocations, this limit is not exact. If within 1.5MB of the limit and two backends request 1MB each at the same time both may be allocated, and exceed the limit. Further requests will not be allocated until dropping below the limit. Keep this in mind when setting this value. This limit does not affect auxiliary backend processes. Backend memory allocations are displayed in the pg_stat_activity view. 
--- doc/src/sgml/config.sgml | 26 + src/backend/storage/ipc/dsm_impl.c| 12 ++ src/backend/utils/activity/backend_status.c | 108 ++ src/backend/utils/misc/guc_tables.c | 11 ++ src/backend/utils/misc/postgresql.conf.sample | 3 + src/backend/utils/mmgr/aset.c | 17 +++ src/backend/utils/mmgr/generation.c | 9 ++ src/backend/utils/mmgr/slab.c | 8 ++ src/include/utils/backend_status.h| 3 + 9 files changed, 197 insertions(+) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index ff6fcd902a..2899f109ac 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -2079,6 +2079,32 @@ include_dir 'conf.d' + + max_total_backend_memory (integer) + + max_total_backend_memory configuration parameter + + + + +Specifies a limit to the amount of memory (MB) that may be allocated to +backends in total (i.e. this is not a per user or per backend limit). +If unset, or set to 0 it is disabled. A backend request that would +push the total over the limit will be denied with an out of memory +error causing that backend's current query/transaction to fail. Due to +the dynamic nature of memory allocations, this limit is not exact. If +within 1.5MB of the limit and two backends request 1MB each at the same +time both may be allocated, and exceed the limit. Further requests will +not be allocated until dropping below the limit. Keep this in mind when +setting this value. This limit does not affect auxiliary backend +processes . Backend memory +allocations (allocated_bytes) are displayed in the +pg_stat_activity +view. 
+ + + + diff --git a/src/backend/storage/ipc/dsm_impl.c b/src/backend/storage/ipc/dsm_impl.c index 36ef3e425e..58fb690c69 100644 --- a/src/backend/storage/ipc/dsm_impl.c +++ b/src/backend/storage/ipc/dsm_impl.c @@ -254,6 +254,10 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size, return true; } + /* Do not exceed maximum allowed memory allocation */ + if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size)) + return false; + /* * Create new segment or open an existing one for attach. * @@ -525,6 +529,10 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size, int flags = IPCProtection; size_t segsize; + /* Do not exceed maximum allowed memory allocation */ + if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size)) + return false; + /* * Allocate the memory BEFORE acquiring the resource, so that we don't * leak the resource if memory allocation fails. @@ -719,6 +727,10 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size, return true; } + /* Do not exceed maximum allowed memory allocation */ + if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size)) + return false; + /* Create new segment or open an existing one for attach. */ if (op == DSM_OP_CREATE) { diff --git
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi,

On 2022-11-26 22:22:15 -0500, Reid Thompson wrote:
> rebased/patched to current master && current
> pg-stat-activity-backend-memory-allocated

This version fails to build with msvc, and builds with warnings on other
platforms.
https://cirrus-ci.com/build/5410696721072128

msvc:
[20:26:51.286] c:\cirrus\src\include\utils/backend_status.h(40): error C2059: syntax error: 'constant'

mingw cross:
[20:26:26.358] from /usr/share/mingw-w64/include/winsock2.h:23,
[20:26:26.358] from ../../src/include/port/win32_port.h:60,
[20:26:26.358] from ../../src/include/port.h:24,
[20:26:26.358] from ../../src/include/c.h:1306,
[20:26:26.358] from ../../src/include/postgres.h:47,
[20:26:26.358] from controldata_utils.c:18:
[20:26:26.358] ../../src/include/utils/backend_status.h:40:2: error: expected identifier before numeric constant
[20:26:26.358]    40 | IGNORE,
[20:26:26.358]       | ^~
[20:26:26.358] In file included from ../../src/include/postgres.h:48,
[20:26:26.358] from controldata_utils.c:18:
[20:26:26.358] ../../src/include/utils/backend_status.h: In function ‘pgstat_report_allocated_bytes’:
[20:26:26.358] ../../src/include/utils/backend_status.h:365:12: error: format ‘%ld’ expects argument of type ‘long int’, but argument 3 has type ‘uint64’ {aka ‘long long unsigned int’} [-Werror=format=]
[20:26:26.358] 365 | errmsg("Backend %d deallocated %ld bytes, exceeding the %ld bytes it is currently reporting allocated. Setting reported to 0.",
[20:26:26.358]     | ^~~
[20:26:26.358] 366 | MyProcPid, allocated_bytes, *my_allocated_bytes));
[20:26:26.358]     | ~~~
[20:26:26.358]     | |
[20:26:26.358]     | uint64 {aka long long unsigned int}

Due to Windows having long be 32-bit, you need to use %lld. Our custom to
deal with that is to cast the argument to errmsg as long long unsigned and
use %llu.

Btw, given that the argument is uint64, it doesn't seem correct to use %ld,
that's signed. Not that it's going to matter, but ...

Greetings,

Andres Freund
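The portable pattern described above — cast the `uint64` to `unsigned long long` and print it with `%llu`, never `%ld`/`%lu`, since `long` is only 32-bit on Windows — looks like this small sketch, with `snprintf` standing in for `errmsg` and a hypothetical function name:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a uint64 portably: an explicit cast to unsigned long long plus
 * %llu works on every platform, including Windows where long is 32-bit.
 */
static void
format_allocated(char *buf, size_t buflen, uint64_t allocated_bytes)
{
	snprintf(buf, buflen,
			 "backend reports %llu bytes allocated",
			 (unsigned long long) allocated_bytes);
}
```

With a value above 2^32 this prints correctly everywhere, whereas `%ld` would truncate (and misinterpret the sign) on LLP64 platforms.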
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Thu, 2022-11-03 at 11:48 -0400, Reid Thompson wrote: > On Tue, 2022-10-25 at 11:49 -0400, Reid Thompson wrote: > > Rebased to current. Add a couple changes per conversation with D > Christensen (include units in field name, group field with > backend_xid > and backend_xmin fields in pg_stat_activity view, rather than between > query_id and query) > rebased/patched to current master && current pg-stat-activity-backend-memory-allocated -- Reid Thompson Senior Software Engineer Crunchy Data, Inc. reid.thomp...@crunchydata.com www.crunchydata.com From 1470f45e086bef0757cc262d10e08904e46b9a88 Mon Sep 17 00:00:00 2001 From: Reid Thompson Date: Sat, 4 Jun 2022 22:23:59 -0400 Subject: [PATCH 2/2] Add the ability to limit the amount of memory that can be allocated to backends. This builds on the work that adds backend memory allocated to pg_stat_activity. Add GUC variable max_total_backend_memory. Specifies a limit to the amount of memory (in MB) that may be allocated to backends in total (i.e. this is not a per user or per backend limit). If unset, or set to 0 it is disabled. It is intended as a resource to help avoid the OOM killer on LINUX and manage resources in general. A backend request that would push the total over the limit will be denied with an out of memory error causing that backend's current query/transaction to fail. Due to the dynamic nature of memory allocations, this limit is not exact. If within 1.5MB of the limit and two backends request 1MB each at the same time both may be allocated, and exceed the limit. Further requests will not be allocated until dropping below the limit. Keep this in mind when setting this value. This limit does not affect auxiliary backend processes. Backend memory allocations are displayed in the pg_stat_activity view. 
--- doc/src/sgml/config.sgml | 26 + src/backend/storage/ipc/dsm_impl.c| 12 ++ src/backend/utils/activity/backend_status.c | 108 ++ src/backend/utils/misc/guc_tables.c | 11 ++ src/backend/utils/misc/postgresql.conf.sample | 3 + src/backend/utils/mmgr/aset.c | 17 +++ src/backend/utils/mmgr/generation.c | 9 ++ src/backend/utils/mmgr/slab.c | 8 ++ src/include/utils/backend_status.h| 3 + 9 files changed, 197 insertions(+) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 24b1624bad..c2db3ace7a 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -2079,6 +2079,32 @@ include_dir 'conf.d' + + max_total_backend_memory (integer) + + max_total_backend_memory configuration parameter + + + + +Specifies a limit to the amount of memory (MB) that may be allocated to +backends in total (i.e. this is not a per user or per backend limit). +If unset, or set to 0 it is disabled. A backend request that would +push the total over the limit will be denied with an out of memory +error causing that backend's current query/transaction to fail. Due to +the dynamic nature of memory allocations, this limit is not exact. If +within 1.5MB of the limit and two backends request 1MB each at the same +time both may be allocated, and exceed the limit. Further requests will +not be allocated until dropping below the limit. Keep this in mind when +setting this value. This limit does not affect auxiliary backend +processes . Backend memory +allocations (allocated_bytes) are displayed in the +pg_stat_activity +view. 
+ + + + diff --git a/src/backend/storage/ipc/dsm_impl.c b/src/backend/storage/ipc/dsm_impl.c index 65d59fc43e..8d9df676af 100644 --- a/src/backend/storage/ipc/dsm_impl.c +++ b/src/backend/storage/ipc/dsm_impl.c @@ -254,6 +254,10 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size, return true; } + /* Do not exceed maximum allowed memory allocation */ + if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size)) + return false; + /* * Create new segment or open an existing one for attach. * @@ -525,6 +529,10 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size, int flags = IPCProtection; size_t segsize; + /* Do not exceed maximum allowed memory allocation */ + if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size)) + return false; + /* * Allocate the memory BEFORE acquiring the resource, so that we don't * leak the resource if memory allocation fails. @@ -719,6 +727,10 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size, return true; } + /* Do not exceed maximum allowed memory allocation */ + if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size)) + return false; + /* Create new segment or open an existing one for attach. */ if (op == DSM_OP_CREATE)
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Tue, 2022-10-25 at 11:49 -0400, Reid Thompson wrote: > Hi Arne, > > On Mon, 2022-10-24 at 15:27 +, Arne Roland wrote: > > Hello Reid, > > > > could you rebase the patch again? It doesn't apply currently > > (http://cfbot.cputube.org/patch_40_3867.log). Thanks! > > rebased patches attached. Rebased to current. Add a couple changes per conversation with D Christensen (include units in field name, group field with backend_xid and backend_xmin fields in pg_stat_activity view, rather than between query_id and query) -- Reid Thompson Senior Software Engineer Crunchy Data, Inc. reid.thomp...@crunchydata.com www.crunchydata.com From 9cf35c79be107feedb63f6f674ac9d2347d1875e Mon Sep 17 00:00:00 2001 From: Reid Thompson Date: Sat, 4 Jun 2022 22:23:59 -0400 Subject: [PATCH 2/2] Add the ability to limit the amount of memory that can be allocated to backends. This builds on the work that adds backend memory allocated to pg_stat_activity. Add GUC variable max_total_backend_memory. Specifies a limit to the amount of memory (in MB) that may be allocated to backends in total (i.e. this is not a per user or per backend limit). If unset, or set to 0 it is disabled. It is intended as a resource to help avoid the OOM killer on LINUX and manage resources in general. A backend request that would push the total over the limit will be denied with an out of memory error causing that backend's current query/transaction to fail. Due to the dynamic nature of memory allocations, this limit is not exact. If within 1.5MB of the limit and two backends request 1MB each at the same time both may be allocated, and exceed the limit. Further requests will not be allocated until dropping below the limit. Keep this in mind when setting this value. This limit does not affect auxiliary backend processes. Backend memory allocations are displayed in the pg_stat_activity view. 
--- doc/src/sgml/config.sgml | 26 src/backend/storage/ipc/dsm_impl.c| 12 ++ src/backend/utils/activity/backend_status.c | 111 +- src/backend/utils/misc/guc_tables.c | 11 ++ src/backend/utils/misc/postgresql.conf.sample | 3 + src/backend/utils/mmgr/aset.c | 17 +++ src/backend/utils/mmgr/generation.c | 9 ++ src/backend/utils/mmgr/slab.c | 8 ++ src/include/utils/backend_status.h| 2 + 9 files changed, 198 insertions(+), 1 deletion(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 559eb898a9..5762999fa5 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -2079,6 +2079,32 @@ include_dir 'conf.d' + + max_total_backend_memory (integer) + + max_total_backend_memory configuration parameter + + + + +Specifies a limit to the amount of memory (MB) that may be allocated to +backends in total (i.e. this is not a per user or per backend limit). +If unset, or set to 0 it is disabled. A backend request that would +push the total over the limit will be denied with an out of memory +error causing that backend's current query/transaction to fail. Due to +the dynamic nature of memory allocations, this limit is not exact. If +within 1.5MB of the limit and two backends request 1MB each at the same +time both may be allocated, and exceed the limit. Further requests will +not be allocated until dropping below the limit. Keep this in mind when +setting this value. This limit does not affect auxiliary backend +processes . Backend memory +allocations (backend_allocated_bytes) are displayed in +the pg_stat_activity +view. 
+
+
+
+
diff --git a/src/backend/storage/ipc/dsm_impl.c b/src/backend/storage/ipc/dsm_impl.c
index eae03159c3..aaf74e9486 100644
--- a/src/backend/storage/ipc/dsm_impl.c
+++ b/src/backend/storage/ipc/dsm_impl.c
@@ -254,6 +254,10 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size,
 		return true;
 	}
 
+	/* Do not exceed maximum allowed memory allocation */
+	if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size))
+		return false;
+
 	/*
 	 * Create new segment or open an existing one for attach.
 	 *
@@ -525,6 +529,10 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size,
 	int			flags = IPCProtection;
 	size_t		segsize;
 
+	/* Do not exceed maximum allowed memory allocation */
+	if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size))
+		return false;
+
 	/*
 	 * Allocate the memory BEFORE acquiring the resource, so that we don't
 	 * leak the resource if memory allocation fails.
@@ -719,6 +727,10 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size,
 		return true;
 	}
 
+	/* Do not exceed maximum allowed memory allocation */
+	if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size))
+
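The hunks above gate only DSM_OP_CREATE on a shared total. Because the check and the subsequent accounting are separate steps, two backends can both pass the check when only one request's worth of headroom remains, which is exactly why the documentation describes the limit as inexact. A minimal single-file sketch of that race window follows; all names and the 10MB figure are invented for illustration and are not the patch's code:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Shared counter of bytes reserved across all backends (illustrative). */
static _Atomic uint64_t total_allocated_bytes;
/* Illustrative 10MB cap, standing in for max_total_backend_memory. */
static uint64_t max_total_bytes = 10 * 1024 * 1024;

static bool
reserve_bytes(uint64_t request)
{
	/* Step 1: check. A second backend may also pass this check... */
	if (atomic_load(&total_allocated_bytes) + request > max_total_bytes)
		return false;
	/* Step 2: ...before either executes this add, so the total can
	 * briefly exceed the cap; later requests then see it and fail. */
	atomic_fetch_add(&total_allocated_bytes, request);
	return true;
}
```

A single atomic compare-exchange loop over check-and-add would close the window, at the cost of more contention on the shared counter.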
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi Arne, On Mon, 2022-10-24 at 15:27 +, Arne Roland wrote: > Hello Reid, > > could you rebase the patch again? It doesn't apply currently > (http://cfbot.cputube.org/patch_40_3867.log). Thanks! rebased patches attached. > You mention, that you want to prevent the compiler from getting > cute.I don't think this comments are exactly helpful in the current > state. I think probably fine to just omit them. I attempted to follow previous convention when adding code and these comments have been consistently applied throughout backend_status.c where a volatile pointer is being used. > I don't understand the purpose of the result variable in > exceeds_max_total_bkend_mem. What purpose does it serve? > > I really like the simplicity of the suggestion here to prevent oom. If max_total_backend_memory is configured, exceeds_max_total_bkend_mem() will return true if an allocation request will push total backend memory allocated over the configured value. exceeds_max_total_bkend_mem() is implemented in the various allocators along the lines of ...snip... /* Do not exceed maximum allowed memory allocation */ if (exceeds_max_total_bkend_mem('new request size')) return NULL; ...snip... Do not allocate the memory requested, return NULL instead. PG already had code in place to handle NULL returns from allocation requests. The allocation code in aset.c, slab.c, generation.c, dsm_impl.c utilizes exceeds_max_total_bkend_mem() max_total_backend_memory (integer) Specifies a limit to the amount of memory (MB) that may be allocated to backends in total (i.e. this is not a per user or per backend limit). If unset, or set to 0 it is disabled. A backend request that would push the total over the limit will be denied with an out of memory error causing that backend's current query/transaction to fail. Due to the dynamic nature of memory allocations, this limit is not exact. 
If within 1.5MB of the limit and two backends request 1MB each at the same time both may be allocated, and exceed the limit. Further requests will not be allocated until dropping below the limit. Keep this in mind when setting this value. This limit does not affect auxiliary backend processes Auxiliary process . Backend memory allocations (backend_mem_allocated) are displayed in the pg_stat_activity view. > I intent to play around with a lot of backends, once I get a rebased > patch. > > Regards > Arne From ab654a48ec7bfbc3bc377c5757a04f1756e72e79 Mon Sep 17 00:00:00 2001 From: Reid Thompson Date: Thu, 11 Aug 2022 12:01:25 -0400 Subject: [PATCH 1/2] Add tracking of backend memory allocated to pg_stat_activity This new field displays the current bytes of memory allocated to the backend process. It is updated as memory for the process is malloc'd/free'd. Memory allocated to items on the freelist is included in the displayed value. Dynamic shared memory allocations are included only in the value displayed for the backend that created them, they are not included in the value for backends that are attached to them to avoid double counting. On occasion, orphaned memory segments may be cleaned up on postmaster startup. This may result in decreasing the sum without a prior increment. We limit the floor of backend_mem_allocated to zero. Updated pg_stat_activity documentation for the new column. 
--- doc/src/sgml/monitoring.sgml| 12 +++ src/backend/catalog/system_views.sql| 1 + src/backend/storage/ipc/dsm_impl.c | 80 +++ src/backend/utils/activity/backend_status.c | 105 src/backend/utils/adt/pgstatfuncs.c | 4 +- src/backend/utils/mmgr/aset.c | 18 src/backend/utils/mmgr/generation.c | 15 +++ src/backend/utils/mmgr/slab.c | 21 src/include/catalog/pg_proc.dat | 6 +- src/include/utils/backend_status.h | 7 +- src/test/regress/expected/rules.out | 9 +- 11 files changed, 269 insertions(+), 9 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index e5d622d514..4983bbc814 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -947,6 +947,18 @@ postgres 27093 0.0 0.0 30096 2752 ?Ss 11:34 0:00 postgres: ser + + + backend_mem_allocated bigint + + + The byte count of memory allocated to this backend. Dynamic shared memory + allocations are included only in the value displayed for the backend that + created them, they are not included in the value for backends that are + attached to them to avoid double counting. + + + query text diff --git a/src/backend/catalog/system_views.sql
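Reid's description earlier in the thread — the allocators ask exceeds_max_total_bkend_mem() and simply return NULL, letting PostgreSQL's existing out-of-memory handling take over — can be sketched roughly as below. This is not the patch's code: the counter, the fixed byte limit, and limited_alloc() are stand-ins, and frees are not modeled:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

static size_t total_bkend_mem = 0;
static size_t max_total_bkend_mem = 8192;	/* bytes; illustrative only */

/* Would this request push the total over the configured limit? */
static bool
exceeds_max_total_bkend_mem(size_t request_size)
{
	/* 0 means the feature is disabled, mirroring the GUC semantics */
	if (max_total_bkend_mem == 0)
		return false;
	return total_bkend_mem + request_size > max_total_bkend_mem;
}

static void *
limited_alloc(size_t size)
{
	/* Do not exceed maximum allowed memory allocation */
	if (exceeds_max_total_bkend_mem(size))
		return NULL;	/* caller handles this like any malloc failure */
	total_bkend_mem += size;
	return malloc(size);
}
```

The appeal of the pattern is that callers already cope with NULL returns, so the limit piggybacks on existing error paths.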
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hello Reid, could you rebase the patch again? It doesn't apply currently (http://cfbot.cputube.org/patch_40_3867.log). Thanks! You mention that you want to prevent the compiler from getting cute. I don't think these comments are exactly helpful in the current state. I think it's probably fine to just omit them. I don't understand the purpose of the result variable in exceeds_max_total_bkend_mem. What purpose does it serve? I really like the simplicity of the suggestion here to prevent OOM. I intend to play around with a lot of backends, once I get a rebased patch. Regards Arne From: Reid Thompson Sent: Thursday, September 15, 2022 4:58:19 PM To: Ibrar Ahmed; pgsql-hackers@lists.postgresql.org Cc: reid.thomp...@crunchydata.com; Justin Pryzby Subject: Re: Add the ability to limit the amount of memory that can be allocated to backends. On Thu, 2022-09-15 at 12:07 +0400, Ibrar Ahmed wrote: > > The patch does not apply; please rebase the patch. > > patching file src/backend/utils/misc/guc.c > Hunk #1 FAILED at 3664. > 1 out of 1 hunk FAILED -- saving rejects to file > src/backend/utils/misc/guc.c.rej > > patching file src/backend/utils/misc/postgresql.conf.sample > rebased patches attached. Thanks, Reid
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Thu, 2022-09-15 at 12:07 +0400, Ibrar Ahmed wrote: > > The patch does not apply; please rebase the patch. > > patching file src/backend/utils/misc/guc.c > Hunk #1 FAILED at 3664. > 1 out of 1 hunk FAILED -- saving rejects to file > src/backend/utils/misc/guc.c.rej > > patching file src/backend/utils/misc/postgresql.conf.sample > rebased patches attached. Thanks, Reid From 0e7010c53508d5a396edd16fd9166abe431f5dbe Mon Sep 17 00:00:00 2001 From: Reid Thompson Date: Thu, 11 Aug 2022 12:01:25 -0400 Subject: [PATCH 1/2] Add tracking of backend memory allocated to pg_stat_activity This new field displays the current bytes of memory allocated to the backend process. It is updated as memory for the process is malloc'd/free'd. Memory allocated to items on the freelist is included in the displayed value. Dynamic shared memory allocations are included only in the value displayed for the backend that created them, they are not included in the value for backends that are attached to them to avoid double counting. On occasion, orphaned memory segments may be cleaned up on postmaster startup. This may result in decreasing the sum without a prior increment. We limit the floor of backend_mem_allocated to zero. Updated pg_stat_activity documentation for the new column. 
--- doc/src/sgml/monitoring.sgml| 12 +++ src/backend/catalog/system_views.sql| 1 + src/backend/storage/ipc/dsm_impl.c | 80 +++ src/backend/utils/activity/backend_status.c | 105 src/backend/utils/adt/pgstatfuncs.c | 4 +- src/backend/utils/mmgr/aset.c | 18 src/backend/utils/mmgr/generation.c | 15 +++ src/backend/utils/mmgr/slab.c | 21 src/include/catalog/pg_proc.dat | 6 +- src/include/utils/backend_status.h | 7 +- src/test/regress/expected/rules.out | 9 +- 11 files changed, 269 insertions(+), 9 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 1d9509a2f6..40ae638f25 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -947,6 +947,18 @@ postgres 27093 0.0 0.0 30096 2752 ?Ss 11:34 0:00 postgres: ser + + + backend_mem_allocated bigint + + + The byte count of memory allocated to this backend. Dynamic shared memory + allocations are included only in the value displayed for the backend that + created them, they are not included in the value for backends that are + attached to them to avoid double counting. 
+ + + query text diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 55f7ec79e0..a78750ab12 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -862,6 +862,7 @@ CREATE VIEW pg_stat_activity AS S.backend_xid, s.backend_xmin, S.query_id, +S.backend_mem_allocated, S.query, S.backend_type FROM pg_stat_get_activity(NULL) AS S diff --git a/src/backend/storage/ipc/dsm_impl.c b/src/backend/storage/ipc/dsm_impl.c index e1b90c5de4..3356bb65b5 100644 --- a/src/backend/storage/ipc/dsm_impl.c +++ b/src/backend/storage/ipc/dsm_impl.c @@ -66,6 +66,7 @@ #include "postmaster/postmaster.h" #include "storage/dsm_impl.h" #include "storage/fd.h" +#include "utils/backend_status.h" #include "utils/guc.h" #include "utils/memutils.h" @@ -232,6 +233,13 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size, name))); return false; } + /* + * Detach and destroy pass through here, only decrease the memory + * shown allocated in pg_stat_activity when the creator destroys the + * allocation. + */ + if (op == DSM_OP_DESTROY) + pgstat_report_backend_mem_allocated_decrease(*mapped_size); *mapped_address = NULL; *mapped_size = 0; if (op == DSM_OP_DESTROY && shm_unlink(name) != 0) @@ -332,6 +340,36 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size, name))); return false; } + + /* + * Attach and create pass through here, only update backend memory + * allocated in pg_stat_activity for the creator process. + */ + if (op == DSM_OP_CREATE) + { + /* + * Posix creation calls dsm_impl_posix_resize implying that resizing + * occurs or may be added in the future. As implemented + * dsm_impl_posix_resize utilizes fallocate or truncate, passing the + * whole new size as input, growing the allocation as needed (only + * truncate supports shrinking). We update by replacing the old + * allocation with the new. 
+	 */
+#if defined(HAVE_POSIX_FALLOCATE) && defined(__linux__)
+	/*
+	 * posix_fallocate does not shrink allocations, adjust only on
+	 * allocation increase.
+	 */
+	if (request_size > *mapped_size)
+	{
+		pgstat_report_backend_mem_allocated_decrease(*mapped_size);
+
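The accounting rule described in the hunk's comment — replace the old allocation figure with the new one, and only on growth, since posix_fallocate cannot shrink — might look like this in isolation. The tracker variable and function name are invented for the sketch:

```c
#include <assert.h>
#include <stddef.h>

/* Running total of tracked DSM bytes for this backend (illustrative). */
static size_t tracked_bytes = 0;

static void
account_dsm_grow(size_t *mapped_size, size_t request_size)
{
	/* posix_fallocate does not shrink allocations; adjust only on growth. */
	if (request_size > *mapped_size)
	{
		tracked_bytes -= *mapped_size;	/* drop the old figure...        */
		tracked_bytes += request_size;	/* ...and count the new, larger one */
		*mapped_size = request_size;
	}
}
```

Replacing rather than adding a delta keeps the counter correct when the same segment is grown repeatedly through the resize path.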
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Mon, Sep 12, 2022 at 8:30 PM Reid Thompson wrote: > On Fri, 2022-09-09 at 12:14 -0500, Justin Pryzby wrote: > > On Sat, Sep 03, 2022 at 11:40:03PM -0400, Reid Thompson wrote: > > > > > + 0, 0, INT_MAX, > > > > > + NULL, NULL, NULL > > > > I think this needs a maximum like INT_MAX/1024/1024 > > > > > > Is this noting that we'd set a ceiling of 2048MB? > > > > The reason is that you're later multiplying it by 1024*1024, so you > > need > > to limit it to avoid overflowing. Compare with > > min_dynamic_shared_memory, Log_RotationSize, maintenance_work_mem, > > autovacuum_work_mem. > > What I originally attempted to implement is: > GUC "max_total_backend_memory" max value as INT_MAX = 2147483647 MB > (2251799812636672 bytes). And the other variables and comparisons as > bytes represented as uint64 to avoid overflow. > > Is this invalid? > > > typo: Explicitely > > corrected > > > +errmsg("request will exceed postgresql.conf > > defined max_total_backend_memory limit (%lu > %lu)", > > > > I wouldn't mention postgresql.conf - it could be in > > postgresql.auto.conf, or an include file, or a -c parameter. > > Suggest: allocation would exceed max_total_backend_memory limit... > > > > updated > > > > > + ereport(LOG, errmsg("decrease reduces reported > > backend memory allocated below zero; setting reported to 0")); > > > > Suggest: deallocation would decrease backend memory below zero; > > updated > > > + {"max_total_backend_memory", PGC_SIGHUP, > > RESOURCES_MEM, > > > > > > > > Should this be PGC_SU_BACKEND to allow a superuser to set a higher > > limit (or no limit)? > > Sounds good to me. I'll update to that. > Would PGC_SUSET be too open? > > > There's compilation warning under mingw cross compile due to > > sizeof(long). See d914eb347 and other recent commits which I guess > > is > > the current way to handle this. 
> > http://cfbot.cputube.org/reid-thompson.html > > updated %lu to %llu and changed cast from uint64 to > unsigned long long in the ereport call > > > For performance test, you'd want to check what happens with a large > > number of max_connections (and maybe a large number of clients). TPS > > isn't the only thing that matters. For example, a utility command > > might > > sometimes do a lot of allocations (or deallocations), or a > > "parameterized nested loop" may loop over over many outer tuples and > > reset for each. There's also a lot of places that reset to a > > "per-tuple" context. I started looking at its performance, but > > nothing > > to show yet. > > Thanks > > > Would you keep people copied on your replies ("reply all") ? > > Otherwise > > I (at least) may miss them. I think that's what's typical on these > > lists (and the list tool is smart enough not to send duplicates to > > people who are direct recipients). > > Ok - will do, thanks. > > -- > Reid Thompson > Senior Software Engineer > Crunchy Data, Inc. > > reid.thomp...@crunchydata.com > www.crunchydata.com > > > The patch does not apply; please rebase the patch. patching file src/backend/utils/misc/guc.c Hunk #1 FAILED at 3664. 1 out of 1 hunk FAILED -- saving rejects to file src/backend/utils/misc/guc.c.rej patching file src/backend/utils/misc/postgresql.conf.sample -- Ibrar Ahmed
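The %lu-to-%llu change quoted above addresses the mingw warning Justin reported: on 64-bit Windows, "long" is 32 bits, so printing a uint64 with %lu truncates the value. A hedged sketch of the portable pattern, with an invented helper name (PostgreSQL itself has its own conventions for this, e.g. casting to unsigned long long or using 64-bit format macros):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit byte count portably: cast to unsigned long long and
 * use %llu, which is at least 64 bits on every platform, rather than
 * %lu, which truncates where long is 32 bits (e.g. 64-bit Windows).
 */
static int
format_bytes(char *buf, size_t buflen, uint64_t bytes)
{
	return snprintf(buf, buflen, "%llu bytes", (unsigned long long) bytes);
}
```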
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Fri, 2022-09-09 at 12:14 -0500, Justin Pryzby wrote: > On Sat, Sep 03, 2022 at 11:40:03PM -0400, Reid Thompson wrote: > > > > + 0, 0, INT_MAX, > > > > + NULL, NULL, NULL > > > I think this needs a maximum like INT_MAX/1024/1024 > > > > Is this noting that we'd set a ceiling of 2048MB? > > The reason is that you're later multiplying it by 1024*1024, so you > need > to limit it to avoid overflowing. Compare with > min_dynamic_shared_memory, Log_RotationSize, maintenance_work_mem, > autovacuum_work_mem. What I originally attempted to implement is: GUC "max_total_backend_memory" max value as INT_MAX = 2147483647 MB (2251799812636672 bytes). And the other variables and comparisons as bytes represented as uint64 to avoid overflow. Is this invalid? > typo: Explicitely corrected > + errmsg("request will exceed postgresql.conf > defined max_total_backend_memory limit (%lu > %lu)", > > I wouldn't mention postgresql.conf - it could be in > postgresql.auto.conf, or an include file, or a -c parameter. > Suggest: allocation would exceed max_total_backend_memory limit... > updated > > + ereport(LOG, errmsg("decrease reduces reported > backend memory allocated below zero; setting reported to 0")); > > Suggest: deallocation would decrease backend memory below zero; updated > + {"max_total_backend_memory", PGC_SIGHUP, > RESOURCES_MEM, > > > > Should this be PGC_SU_BACKEND to allow a superuser to set a higher > limit (or no limit)? Sounds good to me. I'll update to that. Would PGC_SUSET be too open? > There's compilation warning under mingw cross compile due to > sizeof(long). See d914eb347 and other recent commits which I guess > is > the current way to handle this. > http://cfbot.cputube.org/reid-thompson.html updated %lu to %llu and changed cast from uint64 to unsigned long long in the ereport call > For performance test, you'd want to check what happens with a large > number of max_connections (and maybe a large number of clients). TPS > isn't the only thing that matters. 
For example, a utility command > might > sometimes do a lot of allocations (or deallocations), or a > "parameterized nested loop" may loop over over many outer tuples and > reset for each. There's also a lot of places that reset to a > "per-tuple" context. I started looking at its performance, but > nothing > to show yet. Thanks > Would you keep people copied on your replies ("reply all") ? > Otherwise > I (at least) may miss them. I think that's what's typical on these > lists (and the list tool is smart enough not to send duplicates to > people who are direct recipients). Ok - will do, thanks. -- Reid Thompson Senior Software Engineer Crunchy Data, Inc. reid.thomp...@crunchydata.com www.crunchydata.com From 584a04f1b53948049e73165a4ffdd544c950ab0d Mon Sep 17 00:00:00 2001 From: Reid Thompson Date: Thu, 11 Aug 2022 12:01:25 -0400 Subject: [PATCH 1/2] Add tracking of backend memory allocated to pg_stat_activity This new field displays the current bytes of memory allocated to the backend process. It is updated as memory for the process is malloc'd/free'd. Memory allocated to items on the freelist is included in the displayed value. Dynamic shared memory allocations are included only in the value displayed for the backend that created them, they are not included in the value for backends that are attached to them to avoid double counting. On occasion, orphaned memory segments may be cleaned up on postmaster startup. This may result in decreasing the sum without a prior increment. We limit the floor of backend_mem_allocated to zero. Updated pg_stat_activity documentation for the new column. 
--- doc/src/sgml/monitoring.sgml| 12 +++ src/backend/catalog/system_views.sql| 1 + src/backend/storage/ipc/dsm_impl.c | 80 +++ src/backend/utils/activity/backend_status.c | 105 src/backend/utils/adt/pgstatfuncs.c | 4 +- src/backend/utils/mmgr/aset.c | 18 src/backend/utils/mmgr/generation.c | 15 +++ src/backend/utils/mmgr/slab.c | 21 src/include/catalog/pg_proc.dat | 6 +- src/include/utils/backend_status.h | 7 +- src/test/regress/expected/rules.out | 9 +- 11 files changed, 269 insertions(+), 9 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 1d9509a2f6..40ae638f25 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -947,6 +947,18 @@ postgres 27093 0.0 0.0 30096 2752 ?Ss 11:34 0:00 postgres: ser + + + backend_mem_allocated bigint + + + The byte count of memory allocated to this backend.
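The "floor at zero" behavior in the commit message above can be sketched like this (the names are assumed, not taken from the patch). The point is that a decrease may be reported for memory whose increase was never counted — e.g. orphaned DSM segments cleaned up at postmaster startup — so the decrement clamps instead of wrapping the unsigned counter:

```c
#include <assert.h>
#include <stdint.h>

/* Per-backend byte count surfaced in pg_stat_activity (illustrative). */
static uint64_t backend_mem_allocated = 0;

static void
report_mem_allocated_increase(uint64_t bytes)
{
	backend_mem_allocated += bytes;
}

static void
report_mem_allocated_decrease(uint64_t bytes)
{
	if (bytes > backend_mem_allocated)
		backend_mem_allocated = 0;	/* would go below zero: clamp */
	else
		backend_mem_allocated -= bytes;
}
```

Without the clamp, an unmatched decrease would wrap the counter to a huge value, which is far more misleading in a monitoring view than a transient zero.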
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Sat, Sep 03, 2022 at 11:40:03PM -0400, Reid Thompson wrote: > > > + 0, 0, INT_MAX, > > > + NULL, NULL, NULL > > I think this needs a maximum like INT_MAX/1024/1024 > > Is this noting that we'd set a ceiling of 2048MB? The reason is that you're later multiplying it by 1024*1024, so you need to limit it to avoid overflowing. Compare with min_dynamic_shared_memory, Log_RotationSize, maintenance_work_mem, autovacuum_work_mem. typo: Explicitely +errmsg("request will exceed postgresql.conf defined max_total_backend_memory limit (%lu > %lu)", I wouldn't mention postgresql.conf - it could be in postgresql.auto.conf, or an include file, or a -c parameter. Suggest: allocation would exceed max_total_backend_memory limit... + ereport(LOG, errmsg("decrease reduces reported backend memory allocated below zero; setting reported to 0")); Suggest: deallocation would decrease backend memory below zero; + {"max_total_backend_memory", PGC_SIGHUP, RESOURCES_MEM, Should this be PGC_SU_BACKEND to allow a superuser to set a higher limit (or no limit)? There's a compilation warning under mingw cross compile due to sizeof(long). See d914eb347 and other recent commits, which I guess is the current way to handle this. http://cfbot.cputube.org/reid-thompson.html For a performance test, you'd want to check what happens with a large number of max_connections (and maybe a large number of clients). TPS isn't the only thing that matters. For example, a utility command might sometimes do a lot of allocations (or deallocations), or a "parameterized nested loop" may loop over many outer tuples and reset for each. There's also a lot of places that reset to a "per-tuple" context. I started looking at its performance, but nothing to show yet. Would you keep people copied on your replies ("reply all")? Otherwise I (at least) may miss them. I think that's what's typical on these lists (and the list tool is smart enough not to send duplicates to people who are direct recipients). -- Justin
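Justin's INT_MAX/1024/1024 point can be made concrete: a GUC stored as an int in MB overflows 32-bit arithmetic when multiplied out to bytes, so either cap the GUC at INT_MAX/1024/1024 (2047) or widen before multiplying. A small sketch, with an invented function name:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Convert a megabyte-valued GUC (stored as int) to bytes without
 * overflow: the cast to int64_t happens BEFORE the multiply, so the
 * arithmetic is done in 64 bits.  Writing "guc_mb * 1024 * 1024" with
 * no cast would overflow int for any value above 2047.
 */
static int64_t
guc_mb_to_bytes(int guc_mb)
{
	return (int64_t) guc_mb * 1024 * 1024;
}
```

The alternative Justin compares against (min_dynamic_shared_memory and friends) is to set the GUC's declared maximum to INT_MAX/1024/1024 so the product can never overflow in the first place.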
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Greetings, * David Rowley (dgrowle...@gmail.com) wrote: > On Thu, 1 Sept 2022 at 04:52, Reid Thompson > wrote: > > Add the ability to limit the amount of memory that can be allocated to > > backends. > > Are you aware that relcache entries are stored in backend local memory > and that once we've added a relcache entry for a relation that we have > no current code which attempts to reduce the memory consumption used > by cache entries when there's memory pressure? Short answer to this is yes, and that's an issue, but it isn't this patch's problem to deal with- that's an issue that the relcache system needs to be changed to address. > It seems to me that if we had this feature as you propose that a > backend could hit the limit and stay there just from the memory > requirements of the relation cache after some number of tables have > been accessed from the given backend. It's not hard to imagine a > situation where the palloc() would start to fail during parse, which > might make it quite infuriating for anyone trying to do something > like: Agreed that this could happen but I don't imagine it to be super likely- and even if it does, this is probably a better position to be in as the backend could then be disconnected from and would then go away and its memory free'd, unlike the current OOM-killer situation where we crash and go through recovery. We should note this in the documentation though, sure, so that administrators understand how this can occur and can take action to address it. > I think a better solution to this problem would be to have "memory > grants", where we configure some amount of "pool" memory that backends > are allowed to use for queries. The planner would have to add the > expected number of work_mem that the given query is expected to use > and before that query starts, the executor would have to "checkout" > that amount of memory from the pool and return it when finished. 
If > there is not enough memory in the pool then the query would have to > wait until enough memory is available. This creates a deadlocking > hazard that the deadlock detector would need to be made aware of. Sure, that also sounds great and a query acceptance system would be wonderful. If someone is working on that with an expectation of it landing before v16, great. Otherwise, I don't see it as relevant to the question about if we should include this feature or not, and I'm not even sure that we'd refuse this feature even if we already had an acceptance system as a stop-gap should we guess wrong and not realize it until it's too late. Thanks, Stephen signature.asc Description: PGP signature
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Fri, 2022-09-02 at 09:30 +0200, Drouvot, Bertrand wrote: > Hi, > > I'm not sure we are choosing the right victims here (aka the ones > that are doing the request that will push the total over the limit). > > Imagine an extreme case where a single backend consumes say 99% of > the limit, shouldn't it be the one to be "punished"? (and somehow forced > to give the memory back). > > The problem that I see with the current approach is that a "bad" > backend could impact all the others and continue to do so. > > what about punishing, say, the highest consumer, what do you think? > (just speaking about the general idea here, not about the implementation) Initially, we believe that punishing the detector is reasonable if we can help administrators avoid the OOM killer/resource starvation. But we can and should expand on this idea. Another thought is, rather than just failing the query/transaction, to have the affected backend do a clean exit, freeing all its resources. -- Reid Thompson Senior Software Engineer Crunchy Data, Inc. reid.thomp...@crunchydata.com www.crunchydata.com
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Thu, 2022-09-01 at 11:48 +0900, Kyotaro Horiguchi wrote: > > > > The patch seems to limit both memory-context allocations and DSM > > allocations made by a specific process under the same budget. In the > > first place I don't think it's sensible to cap the amount of DSM > > allocations by a per-process budget. > > > > DSM is used by the pgstats subsystem. There can be cases where pgstat > > complains of denial of a DSM allocation after the budget has been > > exhausted by memory-context allocations, or every command complains > > of denial of a memory-context allocation once the per-process > > budget is exhausted by DSM allocations. That doesn't seem > > reasonable. > > regards. It's intended as a mechanism for administrators to limit total PostgreSQL memory consumption to avoid the OOM killer causing a crash and restart, or to ensure that resources are available for other processes on shared hosts, etc. It limits all types of allocations in order to accomplish this. Our documentation will note this, so that administrators who need to set it are aware that it can affect all non-auxiliary processes and what the effect is.
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Wed, 2022-08-31 at 12:34 -0500, Justin Pryzby wrote: > You should name the patches with different prefixes, like > 001,002,003 Otherwise, cfbot may try to apply them in the wrong > order. > git format-patch is the usual tool for that. Thanks for the pointer. My experience with git in the past has been minimal and basic. > > + Specifies a limit to the amount of memory (MB) that may be > > allocated to > > MB are just the default unit, right ? > The user should be allowed to write max_total_backend_memory='2GB' Correct. Default units are MB. Other unit types are converted to MB. > > + causing that backends current query/transaction to fail. > > backend's > > + allocated exceeding the limit. Further requests will not > > allocated, and exceed the limit > > > + if (MyAuxProcType != NotAnAuxProcess) > The double negative is confusing, so could use a comment. > > + elog(WARNING, > I think it should be ereport() rather than elog(), which is > internal-only, and not-translated. Corrected/added the above items. Attached patches with the corrections. > > + 0, 0, INT_MAX, > > + NULL, NULL, NULL > I think this needs a maximum like INT_MAX/1024/1024 Is this noting that we'd set a ceiling of 2048MB? > > + for (i = 1; i <= NumBackendStatSlots; i++) > > + { > > It's looping over every backend for each allocation. > Do you know if there's any performance impact of that ? I'm not very familiar with how to test performance impact; I'm open to suggestions. I have performed the below pgbench tests and noted the basic tps differences in the table.
Test 1: branch master
CFLAGS="-I/usr/include/python3.8/ " /home/rthompso/src/git/postgres/configure --silent --prefix=/home/rthompso/src/git/postgres/install/master --with-openssl --with-tcl --with-tclconfig=/usr/lib/tcl8.6 --with-perl --with-libxml --with-libxslt --with-python --with-gssapi --with-systemd --with-ldap --enable-nls
make -s -j12 && make -s install
initdb, default postgresql.conf settings
init pgbench: pgbench -U rthompso -p 5433 -h localhost -i -s 50 testpgbench
10 iterations: for ctr in {1..10}; do { time pgbench -p 5433 -h localhost -c 10 -j 10 -t 5 testpgbench; } 2>&1 | tee -a pgstatsResultsNoLimitSet; done

Test 2: branch pg-stat-activity-backend-memory-allocated
CFLAGS="-I/usr/include/python3.8/ " /home/rthompso/src/git/postgres/configure --silent --prefix=/home/rthompso/src/git/postgres/install/pg-stats-memory/ --with-openssl --with-tcl --with-tclconfig=/usr/lib/tcl8.6 --with-perl --with-libxml --with-libxslt --with-python --with-gssapi --with-systemd --with-ldap --enable-nls
make -s -j12 && make -s install
initdb, default postgresql.conf settings
init pgbench: pgbench -U rthompso -p 5433 -h localhost -i -s 50 testpgbench
10 iterations: for ctr in {1..10}; do { time pgbench -p 5433 -h localhost -c 10 -j 10 -t 5 testpgbench; } 2>&1 | tee -a pgstatsResultsPg-stats-memory; done

Test 3: branch dev-max-memory
CFLAGS="-I/usr/include/python3.8/ " /home/rthompso/src/git/postgres/configure --silent --prefix=/home/rthompso/src/git/postgres/install/dev-max-memory/ --with-openssl --with-tcl --with-tclconfig=/usr/lib/tcl8.6 --with-perl --with-libxml --with-libxslt --with-python --with-gssapi --with-systemd --with-ldap --enable-nls
make -s -j12 && make -s install
initdb, default postgresql.conf settings
init pgbench: pgbench -U rthompso -p 5433 -h localhost -i -s 50 testpgbench
10 iterations: for ctr in {1..10}; do { time pgbench -p 5433 -h localhost -c 10 -j 10 -t 5 testpgbench; } 2>&1 | tee -a pgstatsResultsDev-max-memory; done

Test 4: branch dev-max-memory
CFLAGS="-I/usr/include/python3.8/ " /home/rthompso/src/git/postgres/configure --silent --prefix=/home/rthompso/src/git/postgres/install/dev-max-memory/ --with-openssl --with-tcl --with-tclconfig=/usr/lib/tcl8.6 --with-perl --with-libxml --with-libxslt --with-python --with-gssapi --with-systemd --with-ldap --enable-nls
make -s -j12 && make -s install
initdb, non-default postgresql.conf setting for max_total_backend_memory = 100MB
init pgbench: pgbench -U rthompso -p 5433 -h localhost -i -s 50 testpgbench
10 iterations: for ctr in {1..10}; do { time pgbench -p 5433 -h localhost -c 10 -j 10 -t 5 testpgbench; } 2>&1 | tee -a pgstatsResultsDev-max-memory100MB; done

Laptop: 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz, 8 cores / 16 threads, 32GB RAM, SSD drive

Averages from the 10 runs and tps difference over the 10 runs

| Test Run | Master | Track Memory Allocated | Diff from Master | Max Mem off | Diff from Master | Max Mem 100MB | Diff from Master |
|----------|--------|------------------------|------------------|-------------|------------------|---------------|------------------|
| Set 1    | Test 1 | Test 2                 |                  | Test 3      |                  | Test 4        |
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Thu, 1 Sept 2022 at 04:52, Reid Thompson wrote:
> Add the ability to limit the amount of memory that can be allocated to
> backends.

Are you aware that relcache entries are stored in backend local memory, and that once we've added a relcache entry for a relation we have no current code which attempts to reduce the memory consumed by cache entries when there's memory pressure? It seems to me that if we had this feature as you propose, a backend could hit the limit and stay there just from the memory requirements of the relation cache after some number of tables have been accessed from the given backend. It's not hard to imagine a situation where palloc() would start to fail during parse, which might make it quite infuriating for anyone trying to do something like:

SET max_total_backend_memory TO 0;

or

ALTER SYSTEM SET max_total_backend_memory TO 0;

I think a better solution to this problem would be to have "memory grants", where we configure some amount of "pool" memory that backends are allowed to use for queries. The planner would have to add up the number of work_mem allocations that the given query is expected to use, and before that query starts, the executor would have to "check out" that amount of memory from the pool and return it when finished. If there is not enough memory in the pool then the query would have to wait until enough memory is available. This creates a deadlocking hazard that the deadlock detector would need to be made aware of.

I know Thomas Munro has mentioned this "memory grant" or "memory pool" feature to me previously and I think he even has some work-in-progress code for it. It's a very tricky problem, however: aside from the deadlocking issue, it requires working out how much memory a given plan will use concurrently. That's not as simple as counting the nodes that use work_mem and summing those up. There is some discussion about the feature in [1]. I was unable to find what Thomas mentioned on the list about this. I've included him here in case he has any extra information to share.

David

[1] https://www.postgresql.org/message-id/flat/20220713222342.GE18011%40telsasoft.com#b4f526aa8f2c893567c1ecf069f9e6c7
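The grant/checkout scheme described above can be sketched as a compare-and-swap loop over a single pool counter. This is only an illustration of the idea, not code from any patch in this thread: the names (`mem_pool_try_checkout`, `POOL_BYTES`) are invented, C11 atomics stand in for PostgreSQL's shared-memory primitives, and exhaustion fails immediately here, whereas the proposal would have the query wait (with the deadlock detector made aware of that wait).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Total "pool" memory that queries may collectively reserve (illustrative). */
#define POOL_BYTES ((uint64_t) 64 * 1024 * 1024)

/* Bytes currently checked out by running queries; would live in shared memory. */
static _Atomic uint64_t pool_used;

/*
 * Try to reserve `want` bytes from the pool before a query starts.  The CAS
 * loop prevents two backends from both claiming the last bytes of the pool.
 * Returns false when the pool is exhausted; a real implementation would
 * instead sleep until memory is returned to the pool.
 */
static bool
mem_pool_try_checkout(uint64_t want)
{
	uint64_t	used = atomic_load(&pool_used);

	for (;;)
	{
		if (used + want > POOL_BYTES)
			return false;		/* would overcommit the pool */
		if (atomic_compare_exchange_weak(&pool_used, &used, used + want))
			return true;		/* reservation succeeded */
		/* CAS failed: `used` now holds the fresh value; retry */
	}
}

/* Return a reservation when the query finishes. */
static void
mem_pool_return(uint64_t got)
{
	atomic_fetch_sub(&pool_used, got);
}
```

Under this scheme the executor would check out roughly (number of work_mem-using nodes) * work_mem before running the plan and return it at query end, though, as noted above, the concurrent peak of a real plan is harder to estimate than that.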
Re: Add the ability to limit the amount of memory that can be allocated to backends.
At Wed, 31 Aug 2022 12:50:19 -0400, Reid Thompson wrote in
> Hi Hackers,
>
> Add the ability to limit the amount of memory that can be allocated to
> backends.

The patch seems to limit both memory-context allocations and DSM allocations made by a given process under the same budget. In the first place, I don't think it's sensible to cap the amount of DSM allocations with a per-process budget. DSM is used by the pgstats subsystem. There can be cases where pgstat complains about a denied DSM allocation after the budget has been exhausted by memory-context allocations, or where every command complains about a denied memory-context allocation once the per-process budget has been exhausted by DSM allocations. That doesn't seem reasonable.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Wed, Aug 31, 2022 at 12:50:19PM -0400, Reid Thompson wrote:
> Hi Hackers,
>
> Add the ability to limit the amount of memory that can be allocated to
> backends.
>
> This builds on the work that adds backend memory allocated to
> pg_stat_activity
> https://www.postgresql.org/message-id/67bb5c15c0489cb499723b0340f16e10c22485ec.camel%40crunchydata.com
> Both patches are attached.

You should name the patches with different prefixes, like 001,002,003. Otherwise, cfbot may try to apply them in the wrong order. git format-patch is the usual tool for that.

> +Specifies a limit to the amount of memory (MB) that may be allocated to

MB is just the default unit, right? The user should be allowed to write max_total_backend_memory='2GB'.

> +backends in total (i.e. this is not a per user or per backend limit).
> +If unset, or set to 0 it is disabled. A backend request that would push
> +the total over the limit will be denied with an out of memory error
> +causing that backends current query/transaction to fail. Due to the dynamic

backend's

> +nature of memory allocations, this limit is not exact. If within 1.5MB of
> +the limit and two backends request 1MB each at the same time both may be
> +allocated exceeding the limit. Further requests will not be allocated until

allocated, and exceed the limit

> +bool
> +exceeds_max_total_bkend_mem(uint64 allocation_request)
> +{
> +	bool		result = false;
> +
> +	if (MyAuxProcType != NotAnAuxProcess)
> +		return result;

The double negative is confusing, so could use a comment.

> +	/* Convert max_total_bkend_mem to bytes for comparison */
> +	if (max_total_bkend_mem &&
> +		pgstat_get_all_backend_memory_allocated() +
> +		allocation_request > (uint64) max_total_bkend_mem * 1024 * 1024)
> +	{
> +		/*
> +		 * Explicitely identify the OOM being a result of this
> +		 * configuration parameter vs a system failure to allocate OOM.
> +		 */
> +		elog(WARNING,
> +			 "request will exceed postgresql.conf defined max_total_backend_memory limit (%lu > %lu)",
> +			 pgstat_get_all_backend_memory_allocated() +
> +			 allocation_request, (uint64) max_total_bkend_mem * 1024 * 1024);

I think it should be ereport() rather than elog(), which is internal-only and not translated.

> +	{"max_total_backend_memory", PGC_SIGHUP, RESOURCES_MEM,
> +		gettext_noop("Restrict total backend memory allocations to this max."),
> +		gettext_noop("0 turns this feature off."),
> +		GUC_UNIT_MB
> +	},
> +	&max_total_bkend_mem,
> +	0, 0, INT_MAX,
> +	NULL, NULL, NULL

I think this needs a maximum like INT_MAX/1024/1024.

> +uint64
> +pgstat_get_all_backend_memory_allocated(void)
> +{
...
> +	for (i = 1; i <= NumBackendStatSlots; i++)
> +	{

It's looping over every backend for each allocation. Do you know if there's any performance impact of that? I think it may be necessary to track the current allocation size in shared memory (with atomic increments?). Maybe decrements would need to be exactly accounted for, or otherwise Assert() that the value is not negative. I don't know how expensive it'd be to have conditionals for each decrement, but maybe the value would only be decremented at strategic times, like at transaction commit or backend shutdown.

--
Justin
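The shared-counter approach suggested above might look roughly like the following. This is a hypothetical sketch, not code from the patch: the names are invented, C11 atomics stand in for PostgreSQL's pg_atomic_* API, and a plain assert() stands in for Assert().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Running total of bytes allocated across all backends; one shared counter
 * replaces the per-slot scan in pgstat_get_all_backend_memory_allocated(). */
static _Atomic uint64_t total_backend_mem;

/* Stand-in for the GUC, converted to bytes; 0 disables the limit. */
static uint64_t max_total_backend_mem = (uint64_t) 10 * 1024 * 1024;

/*
 * Optimistic accounting: add first, then undo if the limit was exceeded.
 * A single fetch-add is O(1) per allocation, at the cost of a brief
 * transient overshoot visible to concurrent readers.
 */
static bool
reserve_backend_mem(uint64_t nbytes)
{
	uint64_t	new_total = atomic_fetch_add(&total_backend_mem, nbytes) + nbytes;

	if (max_total_backend_mem != 0 && new_total > max_total_backend_mem)
	{
		atomic_fetch_sub(&total_backend_mem, nbytes);	/* undo the add */
		return false;
	}
	return true;
}

static void
release_backend_mem(uint64_t nbytes)
{
	uint64_t	old = atomic_fetch_sub(&total_backend_mem, nbytes);

	/* Decrements must balance increments; catch underflow in assert builds. */
	assert(old >= nbytes);
}
```

Batching the decrements (e.g. only at transaction commit or backend exit, as floated above) would cheapen the common path further, in exchange for the counter overstating the true total between those points.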