Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hello Anton,

14.03.2024 23:36, Anton A. Melnikov wrote:
> On 13.03.2024 10:41, Anton A. Melnikov wrote:
>> Here is a version updated for the current master.
> During patch updating I mistakenly added double counting of deallocated
> blocks. That's why the tests in the patch tester failed.
> Fixed it and squashed fix 0002 with 0001.
> Here is fixed version.

Please try the following with the patches applied:
echo "shared_buffers = '1MB'
max_total_backend_memory = '10MB'
" > /tmp/extra.config
CPPFLAGS="-Og" ./configure --enable-tap-tests --enable-debug --enable-cassert ...
TEMP_CONFIG=/tmp/extra.config make check

It fails for me as follows:
...
# postmaster did not respond within 60 seconds, examine ".../src/test/regress/log/postmaster.log" for the reason
...
src/test/regress/log/postmaster.log contains:
...
TRAP: failed Assert("ret != NULL"), File: "mcxt.c", Line: 1327, PID: 4109270
TRAP: failed Assert("ret != NULL"), File: "mcxt.c", Line: 1327, PID: 4109271
postgres: autovacuum launcher (ExceptionalCondition+0x69)[0x55ce441fcc6e]
postgres: autovacuum launcher (palloc0+0x0)[0x55ce4422eb67]
postgres: logical replication launcher (ExceptionalCondition+0x69)[0x55ce441fcc6e]
postgres: autovacuum launcher (InitDeadLockChecking+0xa6)[0x55ce4408a6f0]
postgres: logical replication launcher (palloc0+0x0)[0x55ce4422eb67]
postgres: logical replication launcher (InitDeadLockChecking+0x45)[0x55ce4408a68f]
postgres: autovacuum launcher (InitProcess+0x600)[0x55ce4409c6f2]
postgres: logical replication launcher (InitProcess+0x600)[0x55ce4409c6f2]
postgres: autovacuum launcher (+0x44b4e2)[0x55ce43ff24e2]
...
grep TRAP src/test/regress/log/postmaster.log | wc -l
445

Best regards,
Alexander
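A hypothetical sketch (not the patch's actual code) of why startup can trip `Assert("ret != NULL")` here: with max_total_backend_memory configured below what the server needs during initialization, a limit-aware allocator returns NULL in the malloc style, and assert-enabled builds trap on the NULL instead of reporting an out-of-memory error. Names and the budget value are illustrative assumptions.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the remaining global memory budget after shared memory
 * and other startup allocations are accounted for. */
static size_t budget = 1024;

/* Limit-aware zeroing allocator: returns NULL once the budget is
 * exhausted, mirroring malloc-style out-of-memory behavior. */
static void *limited_alloc0(size_t size)
{
    if (size > budget)
        return NULL;            /* limit exceeded: no special handling */
    budget -= size;
    return calloc(1, size);
}
```

Code that follows such an allocation with Assert(ret != NULL), as mcxt.c does in assert-enabled builds, traps exactly as shown in the log above.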
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On 13.03.2024 10:41, Anton A. Melnikov wrote:
> Here is a version updated for the current master.

During patch updating I mistakenly added double counting of deallocated
blocks. That's why the tests in the patch tester failed.
Fixed it and squashed fix 0002 with 0001.
Here is fixed version.

With the best wishes!

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From 7a0925a38acfb9a945087318a5d91fae4680db0e Mon Sep 17 00:00:00 2001
From: Tomas Vondra
Date: Tue, 26 Dec 2023 17:54:40 +0100
Subject: [PATCH 1/2] Add tracking of backend memory allocated

Add tracking of backend memory allocated in total and by allocation type
(aset, dsm, generation, slab) by process.

allocated_bytes tracks the current bytes of memory allocated to the
backend process. aset_allocated_bytes, dsm_allocated_bytes,
generation_allocated_bytes and slab_allocated_bytes track the
allocation by type for the backend process. They are updated for the
process as memory is malloc'd/freed. Memory allocated to items on the
freelist is included.

Dynamic shared memory allocations are included only in the value
displayed for the backend that created them; they are not included in
the value for backends that are attached to them, to avoid double
counting. DSM allocations that are not destroyed by the creating
process prior to its exit are considered long lived and are tracked in
a global counter global_dsm_allocated_bytes. We limit the floor of
allocation counters to zero.

Created views pg_stat_global_memory_allocation and
pg_stat_memory_allocation for access to these trackers.
---
 doc/src/sgml/monitoring.sgml                | 246
 src/backend/catalog/system_views.sql        |  34 +++
 src/backend/storage/ipc/dsm.c               |  11 +-
 src/backend/storage/ipc/dsm_impl.c          |  78 +++
 src/backend/storage/lmgr/proc.c             |   1 +
 src/backend/utils/activity/backend_status.c | 114 +
 src/backend/utils/adt/pgstatfuncs.c         |  84 +++
 src/backend/utils/init/miscinit.c           |   3 +
 src/backend/utils/mmgr/aset.c               |  17 ++
 src/backend/utils/mmgr/generation.c         |  13 ++
 src/backend/utils/mmgr/slab.c               |  22 ++
 src/include/catalog/pg_proc.dat             |  17 ++
 src/include/storage/proc.h                  |   2 +
 src/include/utils/backend_status.h          | 144 +++-
 src/test/regress/expected/rules.out         |  27 +++
 src/test/regress/expected/stats.out         |  36 +++
 src/test/regress/sql/stats.sql              |  20 ++
 17 files changed, 867 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 8736eac284..41d788be45 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4599,6 +4599,252 @@ description | Waiting for a newly initialized WAL file to reach durable storage

[Excerpt from the documentation added for the pg_stat_memory_allocation view:]

The pg_stat_memory_allocation view will have one row per server process,
showing information related to the current memory allocation of that
process in total and by allocator type. Due to the dynamic nature of
memory allocations the allocated bytes values may not be exact but
should be sufficient for the intended purposes. Dynamic shared memory
allocations are included only in the value displayed for the backend
that created them; they are not included in the value for backends that
are attached to them, to avoid double counting. Use pg_size_pretty to
make these values more easily readable.
pg_stat_memory_allocation View (Column / Type / Description):

 datid oid
   OID of the database this backend is connected to

 pid integer
   Process ID of this backend

 allocated_bytes bigint
   Memory currently allocated to this backend in bytes. This is the
   balance of bytes allocated and freed by this backend. Dynamic shared
   memory allocations are included only in the value displayed for the
   backend that created them; they are not included in the value for
   backends that are attached to them, to avoid double counting.

 aset_allocated_bytes bigint
   Memory currently allocated to this backend in bytes via the
   allocation set allocator.

 dsm_allocated_bytes bigint
   Memory currently allocated to this backend in bytes via the dynamic
   shared memory allocator. Upon process exit, dsm allocations that
   have not been freed are considered long lived and added to
   global_dsm_allocated_bytes.
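The commit message notes that the allocation counters are floored at zero. A minimal sketch of that update rule, with illustrative struct and function names (not the patch's actual identifiers):

```c
#include <stdint.h>
#include <assert.h>

/* Per-backend, per-allocator counters in the spirit of the view's
 * columns; only two are shown here for brevity. */
typedef struct BackendMemCounters
{
    int64_t allocated_bytes;        /* total for the backend */
    int64_t aset_allocated_bytes;   /* allocation set allocator */
} BackendMemCounters;

static void counter_add(int64_t *counter, int64_t bytes)
{
    *counter += bytes;
}

/* Decrease a counter but never let it go below zero, matching the
 * commit message's "limit the floor of allocation counters to zero". */
static void counter_sub(int64_t *counter, int64_t bytes)
{
    *counter = (*counter > bytes) ? *counter - bytes : 0;
}
```

The floor keeps a stray over-subtraction (e.g. freeing memory whose allocation predates counter initialization) from driving a counter negative.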
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On 12.03.2024 16:30, Aleksander Alekseev wrote:
> Just wanted to let you know that v20231226 doesn't apply. The patch
> needs love from somebody interested in it.

Thanks for pointing to this!
Here is a version updated for the current master.

With the best regards,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From e31012217fb63f57fe16d7232fec27bb6cd45597 Mon Sep 17 00:00:00 2001
From: Tomas Vondra
Date: Tue, 26 Dec 2023 17:54:40 +0100
Subject: [PATCH 1/3] Add tracking of backend memory allocated

Add tracking of backend memory allocated in total and by allocation type
(aset, dsm, generation, slab) by process.

allocated_bytes tracks the current bytes of memory allocated to the
backend process. aset_allocated_bytes, dsm_allocated_bytes,
generation_allocated_bytes and slab_allocated_bytes track the
allocation by type for the backend process. They are updated for the
process as memory is malloc'd/freed. Memory allocated to items on the
freelist is included.

Dynamic shared memory allocations are included only in the value
displayed for the backend that created them; they are not included in
the value for backends that are attached to them, to avoid double
counting. DSM allocations that are not destroyed by the creating
process prior to its exit are considered long lived and are tracked in
a global counter global_dsm_allocated_bytes. We limit the floor of
allocation counters to zero.

Created views pg_stat_global_memory_allocation and
pg_stat_memory_allocation for access to these trackers.
---
 doc/src/sgml/monitoring.sgml                | 246
 src/backend/catalog/system_views.sql        |  34 +++
 src/backend/storage/ipc/dsm.c               |  11 +-
 src/backend/storage/ipc/dsm_impl.c          |  78 +++
 src/backend/storage/lmgr/proc.c             |   1 +
 src/backend/utils/activity/backend_status.c | 114 +
 src/backend/utils/adt/pgstatfuncs.c         |  84 +++
 src/backend/utils/init/miscinit.c           |   3 +
 src/backend/utils/mmgr/aset.c               |  17 ++
 src/backend/utils/mmgr/generation.c         |  14 ++
 src/backend/utils/mmgr/slab.c               |  22 ++
 src/include/catalog/pg_proc.dat             |  17 ++
 src/include/storage/proc.h                  |   2 +
 src/include/utils/backend_status.h          | 144 +++-
 src/test/regress/expected/rules.out         |  27 +++
 src/test/regress/expected/stats.out         |  36 +++
 src/test/regress/sql/stats.sql              |  20 ++
 17 files changed, 868 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 8aca08140e..5f827fe0b0 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4598,6 +4598,252 @@ description | Waiting for a newly initialized WAL file to reach durable storage

[Excerpt from the documentation added for the pg_stat_memory_allocation view:]

The pg_stat_memory_allocation view will have one row per server process,
showing information related to the current memory allocation of that
process in total and by allocator type. Due to the dynamic nature of
memory allocations the allocated bytes values may not be exact but
should be sufficient for the intended purposes. Dynamic shared memory
allocations are included only in the value displayed for the backend
that created them; they are not included in the value for backends that
are attached to them, to avoid double counting. Use pg_size_pretty to
make these values more easily readable.
pg_stat_memory_allocation View (Column / Type / Description):

 datid oid
   OID of the database this backend is connected to

 pid integer
   Process ID of this backend

 allocated_bytes bigint
   Memory currently allocated to this backend in bytes. This is the
   balance of bytes allocated and freed by this backend. Dynamic shared
   memory allocations are included only in the value displayed for the
   backend that created them; they are not included in the value for
   backends that are attached to them, to avoid double counting.

 aset_allocated_bytes bigint
   Memory currently allocated to this backend in bytes via the
   allocation set allocator.

 dsm_allocated_bytes bigint
   Memory currently allocated to this backend in bytes via the dynamic
   shared memory allocator. Upon process exit, dsm allocations that
   have not been freed are considered long lived and added to
   global_dsm_allocated_bytes found in the
   pg_stat_global_memory_allocation view.
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi,

> I took a closer look at this patch over the last couple days, and I did
> a bunch of benchmarks to see how expensive the accounting is. The good
> news is that I haven't observed any overhead - I did two simple tests,
> that I think would be particularly sensitive to this:
>
> [...]

Just wanted to let you know that v20231226 doesn't apply. The patch
needs love from somebody interested in it.

Best regards,
Aleksander Alekseev (wearing a co-CFM hat)
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi,

I wanted to take a look at the patch, and I noticed it's broken since
3d51cb5197 renamed a couple pgstat functions in August. I plan to maybe
do some benchmarks etc., preferably on current master, so here's a
version fixing that minor bitrot.

As for the patch, I only skimmed through the thread so far, to get some
idea of what the approach and goals are, etc. I didn't look at the code
yet, so I can't comment on that. However, at pgconf.eu a couple of weeks
ago I had quite a few discussions about how such a "backend memory
limit" could/should work in principle, and I've been thinking about ways
to implement this. So let me share some thoughts about how this patch
aligns with that ...

(FWIW it's not my intent to hijack or derail this patch in any way, but
there are a couple of things I think we should do differently.)

I'm 100% on board with having a memory limit "above" work_mem. It's
really annoying that we have no way to restrict the amount of memory a
backend can allocate for complex queries, etc.

But I find it a bit strange that we aim to introduce a "global" memory
limit for all backends combined first. I'm not against having that too,
but it's not the feature I usually wish to have. I need some protection
against runaway backends that happen to allocate a lot of memory.
Similarly, I'd like to be able to have different limits depending on
what the backend does - a backend doing OLAP may naturally need more
memory, while a backend doing OLTP may have a much tighter limit.

But with a single global limit none of this is possible. It may help
reduce the risk of unexpected OOM issues (not 100%, but useful), but it
can't limit the impact to the one backend - if memory starts running
out, it will affect all other backends a bit randomly (depending on the
order in which the backends happen to allocate memory). And it does not
consider what workloads the backends execute.

Let me propose a slightly different architecture that I imagined while
thinking about this.
It's not radically different from what the patch does, but it focuses on
the local accounting first. I believe it's possible to extend this to
enforce the global limit too. FWIW I haven't tried implementing this - I
don't want to "hijack" this thread and do my own thing. I can take a
stab at a PoC if needed.

Firstly, I'm not quite happy with how all the memory contexts have to do
their own version of the accounting and memory checking. I think we
should move that into a new abstraction which I call a "memory pool".
It's very close to a "memory context" but it only deals with allocating
blocks, not the chunks requested by palloc() etc. So when someone does
palloc(), that may call AllocSetAlloc(). And instead of doing malloc()
that would do MemoryPoolAlloc(blksize), and then that would do all the
accounting and checks, and then do malloc().

This may sound like an unnecessary indirection, but the idea is that a
single memory pool would back many memory contexts (perhaps all of them
for a given backend). In principle we might even establish separate
memory pools for different parts of the memory context hierarchy, but
I'm not sure we need that. I can imagine the pool could also cache
blocks for cases when we create and destroy contexts very often, but
glibc already does that for us, I believe.

For me, the accounting and memory checking is the primary goal. I wonder
if we considered this context/pool split while working on the accounting
for hash aggregate, but I think we were too attached to doing all of it
in the memory context hierarchy.

Of course, this memory pool is per backend, and so would be the memory
accounting and limit enforced by it. But I can imagine extending it to
do a global limit similar to what the current patch does - using a
counter in shared memory, or something. I haven't reviewed what the
overhead is or how it handles cases when a backend terminates in some
unexpected way. But whatever the current patch does, a memory pool could
do too.
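The memory pool indirection described above could be sketched like this. The API, struct, and limit semantics are hypothetical - the email only sketches the idea - but the shape is: every block-level allocation for a backend funnels through one pool that does the accounting and the limit check before calling malloc().

```c
#include <stdlib.h>
#include <stdint.h>
#include <assert.h>

/* One pool backing many memory contexts for a backend. */
typedef struct MemoryPool
{
    int64_t allocated;  /* bytes of blocks currently allocated */
    int64_t limit;      /* backend-local limit; 0 means unlimited */
} MemoryPool;

/* What AllocSetAlloc() would call instead of malloc() directly:
 * account and enforce the limit at block granularity. */
static void *MemoryPoolAlloc(MemoryPool *pool, size_t blksize)
{
    if (pool->limit > 0 &&
        pool->allocated + (int64_t) blksize > pool->limit)
        return NULL;                /* per-backend limit exceeded */

    void *block = malloc(blksize);
    if (block != NULL)
        pool->allocated += blksize; /* all accounting lives here */
    return block;
}

static void MemoryPoolFree(MemoryPool *pool, void *block, size_t blksize)
{
    free(block);
    pool->allocated -= blksize;
}
```

Because every context shares the pool, the per-backend total is always in one place, instead of each context type reimplementing the bookkeeping.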
Secondly, I think there's an annoying issue with the accounting at the
block level - it makes it problematic to use low limit values. We double
the block size, so we may quickly end up with a block size of a couple
MBs, which means the accounting granularity gets very coarse.

I think it'd be useful to introduce a "backpressure" between the memory
pool and the memory context, depending on how close we are to the limit.
For example, if the limit is 128MB and the backend has allocated 16MB so
far, we're pretty far away from the limit. So if the backend requests an
8MB block, that's fine and the memory pool should malloc() that. But if
we've already allocated 100MB, maybe we should be careful and not allow
8MB blocks - the memory pool should be allowed to override this and
return just a 1MB block.

Sure, this would have to be optional, and not all places can accept a
smaller block than requested (when the chunk would not fit into the
smaller block). It would require a suitable memory pool API and more
work in the memory contexts,
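The block-size backpressure described above could look roughly like this sketch. The thresholds (grant the request when at least 4x the requested size remains, then halve toward a minimum) are illustrative assumptions; the email doesn't specify a policy.

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Decide how large a block the pool should actually hand out, given
 * how close the backend is to its limit. Far from the limit, grant
 * the request; near it, fall back toward the minimum block size so
 * accounting granularity stays fine. */
static size_t clamp_block_size(int64_t allocated, int64_t limit,
                               size_t requested, size_t minimum)
{
    int64_t remaining = limit - allocated;

    /* Plenty of headroom: grant the request as-is. */
    if (remaining >= (int64_t)(4 * requested))
        return requested;

    /* Close to the limit: halve until the block fits comfortably. */
    size_t granted = requested;
    while (granted > minimum && (int64_t) granted * 4 > remaining)
        granted /= 2;
    return granted < minimum ? minimum : granted;
}
```

With the email's numbers: at 16MB of 128MB an 8MB request is granted whole; at 100MB the same request is clamped down, so each doubling step near the limit stays cheap to account for.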
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On 12/26/23 11:49, Anton A. Melnikov wrote:
> Hello!
>
> Earlier in this thread, the pgbench results were published, where with
> a strong memory limit of 100MB a significant, about 10%, decrease in
> TPS was observed [1].
>
> Using a dedicated server with 12GB RAM and the methodology described
> in [3], I performed five series of measurements for the patches from
> [2].

Can you share some info about the hardware? For example the CPU model,
number of cores, and so on. 12GB RAM is not quite huge, so presumably it
was a small machine.

> The series were like this:
> 1) unpatched 16th version at REL_16_BETA1 (e0b82fc8e83), as close to
> [2] in time.
> 2) patched REL_16_BETA1 at e0b82fc8e83 with undefined
> max_total_backend_memory GUC (with default value = 0).
> 3) patched REL_16_BETA1 with max_total_backend_memory = 16GB
> 4) the same with max_total_backend_memory = 8GB
> 5) and again with max_total_backend_memory = 200MB

OK

> Measurements with max_total_backend_memory = 100MB could not be
> carried out; with limit 100MB the server gave an error on startup:
> FATAL: configured max_total_backend_memory 100MB is <=
> shared_memory_size 143MB
> So I used 200MB to retain all other GUCs the same.

I'm not very familiar with the patch yet, but this seems a bit strange.
Why should shared_buffers be included in this limit?

> Pgbench gave the following results:
> 1) and 2) almost the same: ~6350 TPS. See orange and green
> distributions on the attached graph.png respectively.
> 3) and 4) identical to each other (~6315 TPS) and a bit slower than 1)
> and 2) by ~0.6%. See blue and yellow distributions respectively.
> 5) is slightly slower (~6285 TPS) than 3) and 4) by another 0.5%
> (grey distribution).
> The standard error in all series was ~0.2%. There is raw data in
> raw_data.txt.

I think 6350 is a pretty terrible number, especially for scale 8, which
is maybe 150MB of data.
I think that's a pretty clear sign the system was hitting some other
bottleneck, which can easily mask regressions in the memory allocation
code. AFAICS the pgbench runs were regular r/w benchmarks, so I'd bet it
was hitting I/O, and possibly even subject to some random effects at
that level.

I think what would be interesting are runs with

  pgbench -M prepared -S -c $N -j $N

i.e. read-only tests (to not hit I/O), with $N being sufficiently large
to maybe also show some concurrency/locking bottlenecks, etc.

I may do some benchmarks if I happen to find a bit of time, but maybe
you could collect such numbers too? The other benchmark that might be
interesting is more OLAP, with low concurrency but backends allocating a
lot of memory.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Add the ability to limit the amount of memory that can be allocated to backends.
hi.

+static void checkAllocations();
should be "static void checkAllocations(void);"?

PgStatShared_Memtrack has a lock, but it seems not to be initialized,
and not used. Can you expand on it?

So in the view pg_stat_global_memory_tracking, the column
"total_memory_reserved" is a point in time - the total memory the whole
server has reserved/malloced? Will it change every time you call it?

The function pg_stat_get_global_memory_tracking has provolatile => 's'.
Should it be a VOLATILE function?

pg_stat_get_memory_reservation and pg_stat_get_global_memory_tracking
should be proretset => 'f'.

+{ oid => '9891',
+  descr => 'statistics: memory utilized by current backend',
+  proname => 'pg_get_backend_memory_allocation', prorows => '1', proisstrict => 'f',
+  proretset => 't', provolatile => 's', proparallel => 'r',

You declared
+void pgstat_backend_memory_reservation_cb(void);
but there seems to be no definition.

This part is unnecessary since you already declared these in
src/include/catalog/pg_proc.dat?
+/* SQL Callable functions */
+extern Datum pg_stat_get_memory_reservation(PG_FUNCTION_ARGS);
+extern Datum pg_get_backend_memory_allocation(PG_FUNCTION_ARGS);
+extern Datum pg_stat_get_global_memory_tracking(PG_FUNCTION_ARGS);

The last sentence is just a plain link, no explanation. Is something
missing?
+ Reports how much memory remains available to the server. If a
+ backend process attempts to allocate more memory than remains,
+ the process will fail with an out of memory error, resulting in
+ cancellation of the process's active query/transaction.
+ If memory is not being limited (ie. max_total_memory is zero or not set),
+ this column returns NULL.

+static_shared_memory bigint
+ Reports how much static shared memory (non-DSM shared memory) is being
+ used by the server. Static shared memory is configured by the postmaster
+ at server startup.
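On the first review point above: in C (before C23), an empty parameter list in a declaration means "unspecified arguments", so the compiler will accept a call that passes arguments without a diagnostic; `(void)` means "exactly zero arguments" and makes such a call a compile-time error. A small illustration, with made-up function names:

```c
#include <assert.h>

static int calls = 0;

/* Empty parens: the declaration says nothing about the arguments,
 * so loose_check(123) would compile on many compilers. */
static void loose_check()
{
    calls++;
}

/* (void): the prototype promises zero arguments, so passing any
 * argument is rejected at compile time. */
static void strict_check(void)
{
    calls++;
}
```

This is why the prototype style `checkAllocations(void)` is preferred in the PostgreSQL tree.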
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi,

On 2023-11-07 15:55:48 -0500, Stephen Frost wrote:
> * Andres Freund (and...@anarazel.de) wrote:
> > On 2023-11-06 13:02:50 -0500, Stephen Frost wrote:
> > > > > The max_total_memory limit is checked whenever the global
> > > > > counters are updated. There is no special error handling if a
> > > > > memory allocation exceeds the global limit. That allocation
> > > > > returns a NULL for malloc style allocations or an ENOMEM for
> > > > > shared memory allocations. Postgres has existing mechanisms for
> > > > > dealing with out of memory conditions.
> > > >
> > > > I still think it's extremely unwise to do tracking of memory and
> > > > limiting of memory in one patch. You should work towards an
> > > > acceptable patch that just tracks memory usage in as simple and
> > > > low overhead a way as possible. Then we later can build on that.
> > >
> > > Frankly, while tracking is interesting, the limiting is the feature
> > > that's needed more urgently imv.
> >
> > I agree that we need limiting, but that the tracking needs to be very
> > robust for that to be usable.
>
> Is there an issue with the tracking in the patch that you saw? That's
> certainly an area that we've tried hard to get right and to match up to
> numbers from the rest of the system, such as the memory context system.

There's some details I am pretty sure aren't right - the DSM tracking
piece seems bogus to me. But beyond that: I don't know. There's enough
other stuff in the patch that it's hard to focus on that aspect. That's
why I'd like to merge a patch doing just that, so we actually can
collect numbers. If any of the developers of the patch had focused on
polishing that part instead of focusing on the limiting, it'd have been
ready to be merged a while ago, maybe even in 16. I think the limiting
piece is unlikely to be ready for 17.

Greetings,

Andres Freund
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> On 2023-11-06 13:02:50 -0500, Stephen Frost wrote:
> > > > The max_total_memory limit is checked whenever the global
> > > > counters are updated. There is no special error handling if a
> > > > memory allocation exceeds the global limit. That allocation
> > > > returns a NULL for malloc style allocations or an ENOMEM for
> > > > shared memory allocations. Postgres has existing mechanisms for
> > > > dealing with out of memory conditions.
> > >
> > > I still think it's extremely unwise to do tracking of memory and
> > > limiting of memory in one patch. You should work towards an
> > > acceptable patch that just tracks memory usage in as simple and low
> > > overhead a way as possible. Then we later can build on that.
> >
> > Frankly, while tracking is interesting, the limiting is the feature
> > that's needed more urgently imv.
>
> I agree that we need limiting, but that the tracking needs to be very
> robust for that to be usable.

Is there an issue with the tracking in the patch that you saw? That's
certainly an area that we've tried hard to get right and to match up to
numbers from the rest of the system, such as the memory context system.

> > We could possibly split it up but there's an awful lot of the same
> > code that would need to be changed and that seems less than ideal.
> > Still, we'll look into this.
>
> Shrug. IMO keeping them together just makes it very likely that
> neither goes in.

I'm happy to hear your support for the limiting part of this - that's
encouraging.

> > > > For sanity checking, pgstat now includes the
> > > > pg_backend_memory_allocation view showing memory allocations made
> > > > by the backend process. This view includes a scan of the top
> > > > memory context, so it compares memory allocations reported
> > > > through pgstat with actual allocations. The two should match.
> > >
> > > Can't you just do this using the existing
> > > pg_backend_memory_contexts view?
> >
> > Not and get a number that you can compare to the local backend number
> > due to the query itself happening and performing allocations and
> > creating new contexts. We wanted to be able to show that we are
> > accounting correctly and exactly matching to what the memory context
> > system is tracking.
>
> I think creating a separate view for this will be confusing for users,
> without really much to show for it. Excluding the current query would
> be useful for other cases as well, why don't we provide a way to do
> that with pg_backend_memory_contexts?

Both of these feel very much like power-user views, so I'm not terribly
concerned about users getting confused. That said, we could possibly
drop this as a view and just have the functions which are then used in
the regression tests to catch things should the numbers start to
diverge.

Having a way to get the memory contexts which don't include the
currently running query might be interesting too but is rather
independent of what this patch is trying to do. The only reason we
collected up the memory-context info is as a cross-check to the
tracking that we're doing, and while the existing memory-context view
is just fine for a lot of other things, it doesn't work for that
specific need.

Thanks,

Stephen
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi,

On 2023-11-06 13:02:50 -0500, Stephen Frost wrote:
> > > The max_total_memory limit is checked whenever the global counters
> > > are updated. There is no special error handling if a memory
> > > allocation exceeds the global limit. That allocation returns a NULL
> > > for malloc style allocations or an ENOMEM for shared memory
> > > allocations. Postgres has existing mechanisms for dealing with out
> > > of memory conditions.
> >
> > I still think it's extremely unwise to do tracking of memory and
> > limiting of memory in one patch. You should work towards an
> > acceptable patch that just tracks memory usage in as simple and low
> > overhead a way as possible. Then we later can build on that.
>
> Frankly, while tracking is interesting, the limiting is the feature
> that's needed more urgently imv.

I agree that we need limiting, but that the tracking needs to be very
robust for that to be usable.

> We could possibly split it up but there's an awful lot of the same
> code that would need to be changed and that seems less than ideal.
> Still, we'll look into this.

Shrug. IMO keeping them together just makes it very likely that neither
goes in.

> > > For sanity checking, pgstat now includes the
> > > pg_backend_memory_allocation view showing memory allocations made
> > > by the backend process. This view includes a scan of the top memory
> > > context, so it compares memory allocations reported through pgstat
> > > with actual allocations. The two should match.
> >
> > Can't you just do this using the existing pg_backend_memory_contexts
> > view?
>
> Not and get a number that you can compare to the local backend number
> due to the query itself happening and performing allocations and
> creating new contexts. We wanted to be able to show that we are
> accounting correctly and exactly matching to what the memory context
> system is tracking.

I think creating a separate view for this will be confusing for users,
without really much to show for it. Excluding the current query would be
useful for other cases as well, why don't we provide a way to do that
with pg_backend_memory_contexts?

Greetings,

Andres Freund
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Greetings, * Andres Freund (and...@anarazel.de) wrote: > On 2023-10-31 17:11:26 +, John Morris wrote: > > Postgres memory reservations come from multiple sources. > > > > * Malloc calls made by the Postgres memory allocators. > > * Static shared memory created by the postmaster at server startup, > > * Dynamic shared memory created by the backends. > > * A fixed amount (1Mb) of “initial” memory reserved whenever a process > > starts up. > > > > Each process also maintains an accurate count of its actual memory > > allocations. The process-private variable “my_memory” holds the total > > allocations for that process. Since there can be no contention, each process > > updates its own counters very efficiently. > > I think this will introduce measurable overhead in low concurrency cases and > very substantial overhead / contention when there is a decent amount of > concurrency. This makes all memory allocations > 1MB contend on a single > atomic. Massive amount of energy have been spent writing multi-threaded > allocators that have far less contention than this - the current state is to > never contend on shared resources on any reasonably common path. This gives > away one of the few major advantages our process model has away. We could certainly adjust the size of each reservation to reduce the frequency of having to hit the atomic. Specific suggestions about how to benchmark and see the regression that's being worried about here would be great. Certainly my hope has generally been that when we do a larger allocation, it's because we're about to go do a bunch of work, meaning that hopefully the time spent updating the atomic is minor overall. > The patch doesn't just introduce contention when limiting is enabled - it > introduces it even when memory usage is just tracked. It makes absolutely no > sense to have a single contended atomic in that case - just have a per-backend > variable in shared memory that's updated. 
It's *WAY* cheaper to compute the > overall memory usage during querying than to keep a running total in shared > memory. Agreed that we should avoid the contention when limiting isn't being used, certainly easy to do so, and had actually intended to but that seems to have gotten lost along the way. Will fix. Other than that change inside update_global_reservation though, the code for reporting per-backend memory usage and querying it does work as you're outlining above inside the stats system. That said- I just want to confirm that you would agree that querying the amount of memory used by every backend, to add it all up to enforce an overall limit, surely isn't something we're going to want to do during an allocation and that having a global atomic for that is better, right? > > Pgstat now includes global memory counters. These shared memory counters > > represent the sum of all reservations made by all Postgres processes. For > > efficiency, these global counters are only updated when new reservations > > exceed a threshold, currently 1 Mb for each process. Consequently, the > > global reservation counters are approximate totals which may differ from the > > actual allocation totals by up to 1 Mb per process. > > I see that you added them to the "cumulative" stats system - that doesn't > immediately makes sense to me - what you're tracking here isn't an > accumulating counter, it's something showing the current state, right? Yes, this is current state, not an accumulation. > > The max_total_memory limit is checked whenever the global counters are > > updated. There is no special error handling if a memory allocation exceeds > > the global limit. That allocation returns a NULL for malloc style > > allocations or an ENOMEM for shared memory allocations. Postgres has > > existing mechanisms for dealing with out of memory conditions. > > I still think it's extremely unwise to do tracking of memory and limiting of > memory in one patch. 
You should work towards an acceptable patch that just > tracks memory usage in an as simple and low overhead way as possible. Then we > later can build on that. Frankly, while tracking is interesting, the limiting is the feature that's needed more urgently imv. We could possibly split it up but there's an awful lot of the same code that would need to be changed and that seems less than ideal. Still, we'll look into this. > > For sanity checking, pgstat now includes the pg_backend_memory_allocation > > view showing memory allocations made by the backend process. This view > > includes a scan of the top memory context, so it compares memory allocations > > reported through pgstat with actual allocations. The two should match. > > Can't you just do this using the existing pg_backend_memory_contexts view? Not and get a number that you can compare to the local backend number due to the query itself happening and performing allocations and creating new contexts. We wanted to be able to show that we are accounting
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi, On 2023-10-31 17:11:26 +, John Morris wrote: > Postgres memory reservations come from multiple sources. > > * Malloc calls made by the Postgres memory allocators. > * Static shared memory created by the postmaster at server startup, > * Dynamic shared memory created by the backends. > * A fixed amount (1Mb) of “initial” memory reserved whenever a process > starts up. > > Each process also maintains an accurate count of its actual memory > allocations. The process-private variable “my_memory” holds the total > allocations for that process. Since there can be no contention, each process > updates its own counters very efficiently. I think this will introduce measurable overhead in low concurrency cases and very substantial overhead / contention when there is a decent amount of concurrency. This makes all memory allocations > 1MB contend on a single atomic. Massive amounts of energy have been spent writing multi-threaded allocators that have far less contention than this - the current state is to never contend on shared resources on any reasonably common path. This gives away one of the few major advantages our process model has. The patch doesn't just introduce contention when limiting is enabled - it introduces it even when memory usage is just tracked. It makes absolutely no sense to have a single contended atomic in that case - just have a per-backend variable in shared memory that's updated. It's *WAY* cheaper to compute the overall memory usage during querying than to keep a running total in shared memory. > Pgstat now includes global memory counters. These shared memory counters > represent the sum of all reservations made by all Postgres processes. For > efficiency, these global counters are only updated when new reservations > exceed a threshold, currently 1 Mb for each process. Consequently, the > global reservation counters are approximate totals which may differ from the > actual allocation totals by up to 1 Mb per process. 
I see that you added them to the "cumulative" stats system - that doesn't immediately make sense to me - what you're tracking here isn't an accumulating counter, it's something showing the current state, right? > The max_total_memory limit is checked whenever the global counters are > updated. There is no special error handling if a memory allocation exceeds > the global limit. That allocation returns a NULL for malloc style > allocations or an ENOMEM for shared memory allocations. Postgres has > existing mechanisms for dealing with out of memory conditions. I still think it's extremely unwise to do tracking of memory and limiting of memory in one patch. You should work towards an acceptable patch that just tracks memory usage in an as simple and low overhead way as possible. Then we later can build on that. > For sanity checking, pgstat now includes the pg_backend_memory_allocation > view showing memory allocations made by the backend process. This view > includes a scan of the top memory context, so it compares memory allocations > reported through pgstat with actual allocations. The two should match. Can't you just do this using the existing pg_backend_memory_contexts view? > Performance-wise, there was no measurable impact with either pgbench or a > simple “SELECT * from series” query. That seems unsurprising - allocations aren't a major part of the work there, you'd have to regress by a lot for memory allocator changes to show a significant performance decrease. 
> diff --git a/src/test/regress/expected/opr_sanity.out > b/src/test/regress/expected/opr_sanity.out > index 7a6f36a6a9..6c813ec465 100644 > --- a/src/test/regress/expected/opr_sanity.out > +++ b/src/test/regress/expected/opr_sanity.out > @@ -468,9 +468,11 @@ WHERE proallargtypes IS NOT NULL AND >ARRAY(SELECT proallargtypes[i] > FROM generate_series(1, array_length(proallargtypes, 1)) g(i) > WHERE proargmodes IS NULL OR proargmodes[i] IN ('i', 'b', 'v')); > - oid | proname | proargtypes | proallargtypes | proargmodes > --+-+-++- > -(0 rows) > + oid | proname | proargtypes | proallargtypes > |proargmodes > +--+--+-+---+--- > + 9890 | pg_stat_get_memory_reservation | | > {23,23,20,20,20,20,20,20} | {i,o,o,o,o,o,o,o} > + 9891 | pg_get_backend_memory_allocation | | > {23,23,20,20,20,20,20}| {i,o,o,o,o,o,o} > +(2 rows) This indicates that your pg_proc entries are broken, they need to fixed rather than allowed here. Greetings, Andres Freund
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi, On 2023-10-24 09:39:42 +0700, Andrei Lepikhov wrote: > On 20/10/2023 19:39, Stephen Frost wrote: > Greetings, > > * Andrei Lepikhov (a.lepik...@postgrespro.ru) wrote: > > > The only issue I worry about is the uncertainty and clutter that can be > > > created by this feature. In the worst case, when we have a complex error > > > stack (including the extension's CATCH sections, exceptions in stored > > > procedures, etc.), the backend will throw the memory limit error > > > repeatedly. > > > > I'm not seeing what additional uncertainty or clutter there is- this is, > > again, exactly the same as what happens today on a system with > > overcommit disabled and I don't feel like we get a lot of complaints > > about this today. > > Maybe I missed something or see this feature from an alternate point of view > (as an extension developer), but overcommit is more useful so far: it kills > a process. In the case of postgres it doesn't just kill one process, it leads to *all* connections being terminated. > It means that after restart, the backend/background worker will have an > initial internal state. With this limit enabled, we need to remember that > each function call can cause an error, and we have to remember it using > static PG_CATCH sections where we must rearrange local variables to the > initial (?) state. So, it complicates development. You need to be aware of errors being thrown regardless of this feature, as out-of-memory errors can be encountered today already. There also are many other kinds of errors that can be thrown. Greetings, Andres Freund
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On 20/10/2023 19:39, Stephen Frost wrote: Greetings, * Andrei Lepikhov (a.lepik...@postgrespro.ru) wrote: The only issue I worry about is the uncertainty and clutter that can be created by this feature. In the worst case, when we have a complex error stack (including the extension's CATCH sections, exceptions in stored procedures, etc.), the backend will throw the memory limit error repeatedly. I'm not seeing what additional uncertainty or clutter there is- this is, again, exactly the same as what happens today on a system with overcommit disabled and I don't feel like we get a lot of complaints about this today. Maybe I missed something or see this feature from an alternate point of view (as an extension developer), but overcommit is more useful so far: it kills a process. It means that after restart, the backend/background worker will have an initial internal state. With this limit enabled, we need to remember that each function call can cause an error, and we have to remember it using static PG_CATCH sections where we must rearrange local variables to the initial (?) state. So, it complicates development. Of course, this limit is a good feature, but from my point of view, it would be better to kill a memory-consuming backend instead of throwing an error. At least for now, we don't have a technique to repeat query planning with chances to build a more effective plan. -- regards, Andrei Lepikhov Postgres Professional
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Greetings, * Andrei Lepikhov (a.lepik...@postgrespro.ru) wrote: > The only issue I worry about is the uncertainty and clutter that can be > created by this feature. In the worst case, when we have a complex error > stack (including the extension's CATCH sections, exceptions in stored > procedures, etc.), the backend will throw the memory limit error repeatedly. I'm not seeing what additional uncertainty or clutter there is- this is, again, exactly the same as what happens today on a system with overcommit disabled and I don't feel like we get a lot of complaints about this today. > Of course, one failed backend looks better than a surprisingly killed > postmaster, but the mix of different error reports and details looks > terrible and challenging to debug in the case of trouble. So, may we throw a > FATAL error if we reach this limit while handling an exception? I don't see why we'd do that when we can do better- we just fail whatever the ongoing query or transaction is and allow further requests on the same connection. We already support exactly that and it works really rather well and I don't see why we'd throw that away because there's a different way to get an OOM error. If you want to make the argument that we should throw FATAL on OOM when handling an exception, that's something you could argue independently of this effort already today, but I don't think you'll get agreement that it's an improvement. Thanks, Stephen signature.asc Description: PGP signature
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On 20/10/2023 05:06, Stephen Frost wrote: Greetings, * Andrei Lepikhov (a.lepik...@postgrespro.ru) wrote: On 19/10/2023 02:00, Stephen Frost wrote: * Andrei Lepikhov (a.lepik...@postgrespro.ru) wrote: On 29/9/2023 09:52, Andrei Lepikhov wrote: On 22/5/2023 22:59, reid.thomp...@crunchydata.com wrote: Attach patches updated to master. Pulled from patch 2 back to patch 1 a change that was also pertinent to patch 1. +1 to the idea, have doubts on the implementation. I have a question. I see the feature triggers ERROR on the exceeding of the memory limit. The superior PG_CATCH() section will handle the error. As I see, many such sections use memory allocations. What if some routine, like the CopyErrorData(), exceeds the limit, too? In this case, we could repeat the error until the top PG_CATCH(). Is this correct behaviour? Maybe to check in the exceeds_max_total_bkend_mem() for recursion and allow error handlers to slightly exceed this hard limit? By the patch in attachment I try to show which sort of problems I'm worrying about. In some PG_CATCH() sections we do CopyErrorData (allocate some memory) before aborting the transaction. So, the allocation error can move us out of this section before aborting. We await for soft ERROR message but will face more hard consequences. While it's an interesting idea to consider making exceptions to the limit, and perhaps we'll do that (or have some kind of 'reserve' for such cases), this isn't really any different than today, is it? We might have a malloc() failure in the main path, end up in PG_CATCH() and then try to do a CopyErrorData() and have another malloc() failure. If we can rearrange the code to make this less likely to happen, by doing a bit more work to free() resources used in the main path before trying to do new allocations, then, sure, let's go ahead and do that, but that's independent from this effort. I agree that rearranging efforts can be made independently. 
The code in the letter above was shown just as a demo of the case I'm worried about. IMO, the thing that should be implemented here is a recursion level for the memory limit. If processing the error, we fall into recursion with this limit - we should ignore it. I imagine custom extensions that use PG_CATCH() and allocate some data there. At least we can raise the level of error to FATAL. Ignoring such would defeat much of the point of this effort- which is to get to a position where we can say with some confidence that we're not going to go over some limit that the user has set and therefore not allow ourselves to end up getting OOM killed. These are all the same issues that already exist today on systems which don't allow overcommit too, there isn't anything new here in regards to these risks, so I'm not really keen to complicate this to deal with issues that are already there. Perhaps once we've got the basics in place then we could consider reserving some space for handling such cases.. but I don't think it'll actually be very clean and what if we have an allocation that goes beyond what that reserved space is anyway? Then we're in the same spot again where we have the choice of either failing the allocation in a less elegant way than we might like to handle that error, or risk getting outright kill'd by the kernel. Of those choices, sure seems like failing the allocation is the better way to go. I've got your point. The only issue I worry about is the uncertainty and clutter that can be created by this feature. In the worst case, when we have a complex error stack (including the extension's CATCH sections, exceptions in stored procedures, etc.), the backend will throw the memory limit error repeatedly. Of course, one failed backend looks better than a surprisingly killed postmaster, but the mix of different error reports and details looks terrible and challenging to debug in the case of trouble. 
So, may we throw a FATAL error if we reach this limit while handling an exception? -- regards, Andrey Lepikhov Postgres Professional
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Greetings, * Andres Freund (and...@anarazel.de) wrote: > On 2023-10-19 18:06:10 -0400, Stephen Frost wrote: > > Ignoring such would defeat much of the point of this effort- which is to > > get to a position where we can say with some confidence that we're not > > going to go over some limit that the user has set and therefore not > > allow ourselves to end up getting OOM killed. > > I think that is a good medium to long term goal. I do however think that we'd > be better off merging the visibility of memory allocations soon-ish and > implement the limiting later. There's a lot of hairy details to get right for > the latter, and even just having visibility will be a huge improvement. I agree that having the visibility will be a great improvement and perhaps could go in separately, but I don't know that I agree that the limits are going to be that much of an issue. In any case, there's been work ongoing on this and that'll be posted soon. I was just trying to address the general comment raised in this sub-thread here. > I think even patch 1 is doing too much at once. I doubt the DSM stuff is > quite right. Getting DSM right has certainly been tricky, along with other things, but we've been working towards, and continue to work towards, getting everything to line up nicely between memory context allocations of various types and the amounts which are being seen as malloc'd/free'd. Parts of this have also been reworked to allow us to see per-backend reservations as well as total reserved and to get those numbers able to be matched up inside of a given transaction using the statistics system. > I'm unconvinced it's a good idea to split the different types of memory > contexts out. That just exposes too much implementation detail stuff without a > good reason. DSM needs to be independent anyway ... 
as for the others, perhaps we could combine them, though that's pretty easily done later and for now it's been useful to see them split out as we've been working on the patch. > I think the overhead even just the tracking implies right now is likely too > high and needs to be optimized. It should be a single math operation, not > tracking things in multiple fields. I don't think pg_sub_u64_overflow() should > be in the path either, that suddenly adds conditional branches. You really > ought to look at the difference in assembly for the hot functions. This has been improved in the most recent work and we'll have that posted soon, probably best to hold off from larger review of this right now- as mentioned, I was just trying to address the specific question in this sub-thread since a new patch is coming soon. Thanks, Stephen signature.asc Description: PGP signature
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi, On 2023-10-19 18:06:10 -0400, Stephen Frost wrote: > Ignoring such would defeat much of the point of this effort- which is to > get to a position where we can say with some confidence that we're not > going to go over some limit that the user has set and therefore not > allow ourselves to end up getting OOM killed. I think that is a good medium to long term goal. I do however think that we'd be better off merging the visibility of memory allocations soon-ish and implement the limiting later. There's a lot of hairy details to get right for the latter, and even just having visibility will be a huge improvement. I think even patch 1 is doing too much at once. I doubt the DSM stuff is quite right. I'm unconvinced it's a good idea to split the different types of memory contexts out. That just exposes too much implementation detail stuff without a good reason. I think the overhead even just the tracking implies right now is likely too high and needs to be optimized. It should be a single math operation, not tracking things in multiple fields. I don't think pg_sub_u64_overflow() should be in the path either, that suddenly adds conditional branches. You really ought to look at the difference in assembly for the hot functions. Greetings, Andres Freund
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Greetings, * Andrei Lepikhov (a.lepik...@postgrespro.ru) wrote: > On 19/10/2023 02:00, Stephen Frost wrote: > > * Andrei Lepikhov (a.lepik...@postgrespro.ru) wrote: > > > On 29/9/2023 09:52, Andrei Lepikhov wrote: > > > > On 22/5/2023 22:59, reid.thomp...@crunchydata.com wrote: > > > > > Attach patches updated to master. > > > > > Pulled from patch 2 back to patch 1 a change that was also pertinent > > > > > to patch 1. > > > > +1 to the idea, have doubts on the implementation. > > > > > > > > I have a question. I see the feature triggers ERROR on the exceeding of > > > > the memory limit. The superior PG_CATCH() section will handle the error. > > > > As I see, many such sections use memory allocations. What if some > > > > routine, like the CopyErrorData(), exceeds the limit, too? In this case, > > > > we could repeat the error until the top PG_CATCH(). Is this correct > > > > behaviour? Maybe to check in the exceeds_max_total_bkend_mem() for > > > > recursion and allow error handlers to slightly exceed this hard limit? > > > > > By the patch in attachment I try to show which sort of problems I'm > > > worrying > > > about. In some PG_CATCH() sections we do CopyErrorData (allocate some > > > memory) before aborting the transaction. So, the allocation error can move > > > us out of this section before aborting. We await for soft ERROR message > > > but > > > will face more hard consequences. > > > > While it's an interesting idea to consider making exceptions to the > > limit, and perhaps we'll do that (or have some kind of 'reserve' for > > such cases), this isn't really any different than today, is it? We > > might have a malloc() failure in the main path, end up in PG_CATCH() and > > then try to do a CopyErrorData() and have another malloc() failure. 
> > > > If we can rearrange the code to make this less likely to happen, by > > doing a bit more work to free() resources used in the main path before > > trying to do new allocations, then, sure, let's go ahead and do that, > > but that's independent from this effort. > > I agree that rearranging efforts can be made independently. The code in the > letter above was shown just as a demo of the case I'm worried about. > IMO, the thing that should be implemented here is a recursion level for the > memory limit. If processing the error, we fall into recursion with this > limit - we should ignore it. > I imagine custom extensions that use PG_CATCH() and allocate some data > there. At least we can raise the level of error to FATAL. Ignoring such would defeat much of the point of this effort- which is to get to a position where we can say with some confidence that we're not going to go over some limit that the user has set and therefore not allow ourselves to end up getting OOM killed. These are all the same issues that already exist today on systems which don't allow overcommit too, there isn't anything new here in regards to these risks, so I'm not really keen to complicate this to deal with issues that are already there. Perhaps once we've got the basics in place then we could consider reserving some space for handling such cases.. but I don't think it'll actually be very clean and what if we have an allocation that goes beyond what that reserved space is anyway? Then we're in the same spot again where we have the choice of either failing the allocation in a less elegant way than we might like to handle that error, or risk getting outright kill'd by the kernel. Of those choices, sure seems like failing the allocation is the better way to go. Thanks, Stephen signature.asc Description: PGP signature
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On 19/10/2023 02:00, Stephen Frost wrote: Greetings, * Andrei Lepikhov (a.lepik...@postgrespro.ru) wrote: On 29/9/2023 09:52, Andrei Lepikhov wrote: On 22/5/2023 22:59, reid.thomp...@crunchydata.com wrote: Attach patches updated to master. Pulled from patch 2 back to patch 1 a change that was also pertinent to patch 1. +1 to the idea, have doubts on the implementation. I have a question. I see the feature triggers ERROR on the exceeding of the memory limit. The superior PG_CATCH() section will handle the error. As I see, many such sections use memory allocations. What if some routine, like the CopyErrorData(), exceeds the limit, too? In this case, we could repeat the error until the top PG_CATCH(). Is this correct behaviour? Maybe to check in the exceeds_max_total_bkend_mem() for recursion and allow error handlers to slightly exceed this hard limit? By the patch in attachment I try to show which sort of problems I'm worrying about. In some PG_CATCH() sections we do CopyErrorData (allocate some memory) before aborting the transaction. So, the allocation error can move us out of this section before aborting. We await for soft ERROR message but will face more hard consequences. While it's an interesting idea to consider making exceptions to the limit, and perhaps we'll do that (or have some kind of 'reserve' for such cases), this isn't really any different than today, is it? We might have a malloc() failure in the main path, end up in PG_CATCH() and then try to do a CopyErrorData() and have another malloc() failure. If we can rearrange the code to make this less likely to happen, by doing a bit more work to free() resources used in the main path before trying to do new allocations, then, sure, let's go ahead and do that, but that's independent from this effort. I agree that rearranging efforts can be made independently. The code in the letter above was shown just as a demo of the case I'm worried about. 
IMO, the thing that should be implemented here is a recursion level for the memory limit. If processing the error, we fall into recursion with this limit - we should ignore it. I imagine custom extensions that use PG_CATCH() and allocate some data there. At least we can raise the level of error to FATAL. -- regards, Andrey Lepikhov Postgres Professional
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Greetings, * Andrei Lepikhov (a.lepik...@postgrespro.ru) wrote: > On 29/9/2023 09:52, Andrei Lepikhov wrote: > > On 22/5/2023 22:59, reid.thomp...@crunchydata.com wrote: > > > Attach patches updated to master. > > > Pulled from patch 2 back to patch 1 a change that was also pertinent > > > to patch 1. > > +1 to the idea, have doubts on the implementation. > > > > I have a question. I see the feature triggers ERROR on the exceeding of > > the memory limit. The superior PG_CATCH() section will handle the error. > > As I see, many such sections use memory allocations. What if some > > routine, like the CopyErrorData(), exceeds the limit, too? In this case, > > we could repeat the error until the top PG_CATCH(). Is this correct > > behaviour? Maybe to check in the exceeds_max_total_bkend_mem() for > > recursion and allow error handlers to slightly exceed this hard limit? > By the patch in attachment I try to show which sort of problems I'm worrying > about. In some PG_CATCH() sections we do CopyErrorData (allocate some > memory) before aborting the transaction. So, the allocation error can move > us out of this section before aborting. We await for soft ERROR message but > will face more hard consequences. While it's an interesting idea to consider making exceptions to the limit, and perhaps we'll do that (or have some kind of 'reserve' for such cases), this isn't really any different than today, is it? We might have a malloc() failure in the main path, end up in PG_CATCH() and then try to do a CopyErrorData() and have another malloc() failure. If we can rearrange the code to make this less likely to happen, by doing a bit more work to free() resources used in the main path before trying to do new allocations, then, sure, let's go ahead and do that, but that's independent from this effort. Thanks! Stephen signature.asc Description: PGP signature
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On 29/9/2023 09:52, Andrei Lepikhov wrote: On 22/5/2023 22:59, reid.thomp...@crunchydata.com wrote: Attach patches updated to master. Pulled from patch 2 back to patch 1 a change that was also pertinent to patch 1. +1 to the idea, have doubts on the implementation. I have a question. I see the feature triggers ERROR on the exceeding of the memory limit. The superior PG_CATCH() section will handle the error. As I see, many such sections use memory allocations. What if some routine, like the CopyErrorData(), exceeds the limit, too? In this case, we could repeat the error until the top PG_CATCH(). Is this correct behaviour? Maybe to check in the exceeds_max_total_bkend_mem() for recursion and allow error handlers to slightly exceed this hard limit? By the patch in attachment I try to show which sort of problems I'm worrying about. In some PG_CATCH() sections we do CopyErrorData (allocate some memory) before aborting the transaction. So, the allocation error can move us out of this section before aborting. We await for soft ERROR message but will face more hard consequences. -- regards, Andrey Lepikhov Postgres Professional diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c index 33975687b3..3f992b8d92 100644 --- a/src/backend/executor/spi.c +++ b/src/backend/executor/spi.c @@ -291,10 +291,7 @@ _SPI_commit(bool chain) { ErrorData *edata; - /* Save error info in caller's context */ MemoryContextSwitchTo(oldcontext); - edata = CopyErrorData(); - FlushErrorState(); /* * Abort the failed transaction. If this fails too, we'll just @@ -302,6 +299,10 @@ _SPI_commit(bool chain) */ AbortCurrentTransaction(); + /* Save error info in caller's context */ + edata = CopyErrorData(); + FlushErrorState(); + /* ... 
and start a new one */ StartTransactionCommand(); if (chain) @@ -383,10 +384,7 @@ _SPI_rollback(bool chain) { ErrorData *edata; - /* Save error info in caller's context */ MemoryContextSwitchTo(oldcontext); - edata = CopyErrorData(); - FlushErrorState(); /* * Try again to abort the failed transaction. If this fails too, @@ -395,6 +393,10 @@ _SPI_rollback(bool chain) */ AbortCurrentTransaction(); + /* Save error info in caller's context */ + edata = CopyErrorData(); + FlushErrorState(); + /* ... and start a new one */ StartTransactionCommand(); if (chain) diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c index 12edc5772a..f9cf599026 100644 --- a/src/backend/replication/logical/reorderbuffer.c +++ b/src/backend/replication/logical/reorderbuffer.c @@ -2565,7 +2565,7 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, PG_CATCH(); { MemoryContext ecxt = MemoryContextSwitchTo(ccxt); - ErrorData *errdata = CopyErrorData(); + ErrorData *errdata; /* TODO: Encapsulate cleanup from the PG_TRY and PG_CATCH blocks */ if (iterstate) @@ -2579,6 +2579,8 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn, */ AbortCurrentTransaction(); + errdata = CopyErrorData(); + /* make sure there's no cache pollution */ ReorderBufferExecuteInvalidations(txn->ninvalidations, txn->invalidations);
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On 22/5/2023 22:59, reid.thomp...@crunchydata.com wrote: Attach patches updated to master. Pulled from patch 2 back to patch 1 a change that was also pertinent to patch 1. +1 to the idea, have doubts on the implementation. I have a question. I see the feature triggers ERROR on the exceeding of the memory limit. The superior PG_CATCH() section will handle the error. As I see, many such sections use memory allocations. What if some routine, like the CopyErrorData(), exceeds the limit, too? In this case, we could repeat the error until the top PG_CATCH(). Is this correct behaviour? Maybe to check in the exceeds_max_total_bkend_mem() for recursion and allow error handlers to slightly exceed this hard limit? Also, the patch needs to be rebased. -- regards, Andrey Lepikhov Postgres Professional
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Mon, 2023-05-22 at 08:42 -0400, reid.thomp...@crunchydata.com wrote: More followup to the above. > > I experimented on my system regarding > "The simple query select * from generate_series(0, 1000) shows roughly > 18.9 % degradation on my test server." > > My laptop: > 32GB ram > 11th Gen Intel(R) Core(TM) i7-11850H 8 cores/16 threads @ 2.50GHz (Max Turbo > Frequency. 4.80 GHz ; Cache. 24 MB) > SSD -> Model: KXG60ZNV1T02 NVMe KIOXIA 1024GB (nvme) Hi. Ran through a few more tests on my system varying the initial_allocation_allowance and allocation_allowance_refill_qty from the current 1MB to 2, 4, 6, 8, 10 MB. Also realized that in my last tests/email I had posted percent difference rather than percent change. Turns out for the numbers that were being compared they're essentially the same, but I'm providing both for this set of tests. Ten runs for each comparison. Compared dev-max-memory set, dev-max-memory unset, master, and pg-stat-activity-backend-memory-allocated against master at each allocation value. Again, the test invokes psql -At -d postgres $connstr -P pager=off -c 'select * from generate_series(0, 1000)' 100 times on each of the 2 instances and calculates the AVG time and SD for the 100 runs. It then uses the AVG from each instance to calculate the percentage difference/change. These tests contain one code change not yet pushed to pgsql-hackers. In AllocSetReset() do not enter pgstat_report_allocated_bytes_decrease if no memory has been freed. Will format and post some pgbench test results in a separate email. 
Percent difference:

Results: difference-dev-max-memory-set VS master

 run | 1MB alloc  | 2MB alloc   | 4MB alloc    | 6MB alloc   | 8MB alloc   | 10MB alloc
-----+------------+-------------+--------------+-------------+-------------+------------
   1 | 4.2263%    | 3.03961%    | 0.0585808%   | 2.92451%    | 3.34694%    | 2.67771%
   2 | 3.55709%   | 3.92339%    | 2.29144%     | 3.2156%     | 2.06153%    | 2.86217%
   3 | 2.04389%   | 2.91866%    | 3.73463%     | 2.86161%    | 3.60992%    | 3.07293%
   4 | 3.1306%    | 3.64773%    | 2.38063%     | 1.84845%    | 4.87375%    | 4.16953%
   5 | 3.12556%   | 3.34537%    | 2.99052%     | 2.60538%    | 2.14825%    | 1.95454%
   6 | 2.20615%   | 2.12861%    | 2.85282%     | 2.43336%    | 2.31389%    | 3.21563%
   7 | 1.9954%    | 3.61371%    | 3.35543%     | 3.49821%    | 3.41526%    | 8.25753%
   8 | 2.46845%   | 2.57784%    | 3.13067%     | 3.67681%    | 2.89139%    | 3.6067%
   9 | 3.60092%   | 2.16164%    | 3.9976%      | 2.6144%     | 4.27892%    | 2.68998%
  10 | 2.55454%   | 2.39073%    | 3.09631%     | 3.24292%    | 1.9107%     | 1.76182%
 sum | 28.9089/10 | 29.74729/10 | 27.888631/10 | 28.92125/10 | 30.85055/10 | 34.26854/10
 avg | 2.89089    | 2.974729    | 2.7888631    | 2.892125    | 3.085055    | 3.426854

Results: difference-dev-max-memory-unset VS master

 run | 1MB alloc  | 2MB alloc   | 4MB alloc    | 6MB alloc   | 8MB alloc   | 10MB alloc
-----+------------+-------------+--------------+-------------+-------------+------------
   1 | 3.96616%   | 3.05528%    | 0.563267%    | 1.12075%    | 3.52398%    | 3.25641%
   2 | 3.11387%   | 3.12499%    | 1.1133%      | 4.86997%    | 2.11481%    | 1.11668%
   3 | 3.14506%   | 2.06193%    | 3.36034%     | 2.80644%    | 2.37822%    | 3.07669%
   4 | 2.81052%   | 3.18499%    | 2.70705%     | 2.27847%    | 2.78506%
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Mon, 2023-05-22 at 08:42 -0400, reid.thomp...@crunchydata.com wrote:
> On Wed, 2023-05-17 at 23:07 -0400, reid.thomp...@crunchydata.com wrote:
> > Thanks for the feedback.
> >
> > I'm planning to look at this.
> >
> > Is your benchmark something that I could utilize? I.e. is it a set of
> > scripts or a standard test from somewhere that I can duplicate?
> >
> > Thanks,
> > Reid

Attach patches updated to master. Pulled from patch 2 back to patch 1 a change that was also pertinent to patch 1.

From e6f8499e0270f2291494260bc341e8ad1411c2ae Mon Sep 17 00:00:00 2001
From: Reid Thompson
Date: Thu, 11 Aug 2022 12:01:25 -0400
Subject: [PATCH 1/2] Add tracking of backend memory allocated

Add tracking of backend memory allocated in total and by allocation type (aset, dsm, generation, slab) by process.

allocated_bytes tracks the current bytes of memory allocated to the backend process. aset_allocated_bytes, dsm_allocated_bytes, generation_allocated_bytes and slab_allocated_bytes track the allocation by type for the backend process. They are updated for the process as memory is malloc'd/freed. Memory allocated to items on the freelist is included.

Dynamic shared memory allocations are included only in the value displayed for the backend that created them; they are not included in the value for backends that are attached to them, to avoid double counting. DSM allocations that are not destroyed by the creating process prior to its exit are considered long lived and are tracked in a global counter global_dsm_allocated_bytes. We limit the floor of allocation counters to zero.

Created views pg_stat_global_memory_allocation and pg_stat_memory_allocation for access to these trackers.
---
 doc/src/sgml/monitoring.sgml                | 246 +
 src/backend/catalog/system_views.sql        |  34 +++
 src/backend/storage/ipc/dsm.c               |  11 +-
 src/backend/storage/ipc/dsm_impl.c          |  78 +++
 src/backend/storage/lmgr/proc.c             |   1 +
 src/backend/utils/activity/backend_status.c | 114 +
 src/backend/utils/adt/pgstatfuncs.c         |  84 +++
 src/backend/utils/init/miscinit.c           |   3 +
 src/backend/utils/mmgr/aset.c               |  17 ++
 src/backend/utils/mmgr/generation.c         |  15 ++
 src/backend/utils/mmgr/slab.c               |  22 ++
 src/include/catalog/pg_proc.dat             |  17 ++
 src/include/storage/proc.h                  |   2 +
 src/include/utils/backend_status.h          | 144 +++-
 src/test/regress/expected/rules.out         |  27 +++
 src/test/regress/expected/stats.out         |  36 +++
 src/test/regress/sql/stats.sql              |  20 ++
 17 files changed, 869 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index df5242fa80..cfc221fb2e 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -5757,6 +5757,252 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
+
+ pg_stat_memory_allocation
+
+
+ pg_stat_memory_allocation
+
+
+
+ The pg_stat_memory_allocation view will have one
+ row per server process, showing information related to the current memory
+ allocation of that process in total and by allocator type. Due to the
+ dynamic nature of memory allocations the allocated bytes values may not be
+ exact but should be sufficient for the intended purposes. Dynamic shared
+ memory allocations are included only in the value displayed for the backend
+ that created them, they are not included in the value for backends that are
+ attached to them to avoid double counting. Use
+ pg_size_pretty described in
+ to make these values more easily
+ readable.
+
+
+
+ pg_stat_memory_allocation View
+
+
+
+
+ Column Type
+
+
+ Description
+
+
+
+
+
+
+
+ datid oid
+
+
+ OID of the database this backend is connected to
+
+
+
+
+
+ pid integer
+
+
+ Process ID of this backend
+
+
+
+
+
+ allocated_bytes bigint
+
+
+ Memory currently allocated to this backend in bytes. This is the balance
+ of bytes allocated and freed by this backend. Dynamic shared memory
+ allocations are included only in the value displayed for the backend that
+ created them, they are not included in the value for backends that are
+ attached to them to avoid double counting.
+
+
+
+
+
+ aset_allocated_bytes bigint
+
+
+ Memory currently allocated to this backend in bytes via the allocation
+ set allocator.
+
+
+
+
+
+ dsm_allocated_bytes bigint
+
+
+ Memory currently allocated to this backend in bytes via the dynamic
+ shared memory allocator. Upon process exit, dsm allocations that have
+
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Wed, 2023-05-17 at 23:07 -0400, reid.thomp...@crunchydata.com wrote:
> Thanks for the feedback.
>
> I'm planning to look at this.
>
> Is your benchmark something that I could utilize? I.e. is it a set of
> scripts or a standard test from somewhere that I can duplicate?
>
> Thanks,
> Reid

Hi Arne,

Followup to the above. I experimented on my system regarding "The simple query select * from generate_series(0, 1000) shows roughly 18.9 % degradation on my test server."

My laptop:
32GB RAM
11th Gen Intel(R) Core(TM) i7-11850H, 8 cores/16 threads @ 2.50GHz (Max Turbo Frequency 4.80 GHz; Cache 24 MB)
SSD -> Model: KXG60ZNV1T02 NVMe KIOXIA 1024GB (nvme)

I updated to latest master and rebased my patch branches. I wrote a script to check out, build, install, init, and start up master, patch 1, patch 1+2, patch 1+2 as master, pg-stats-memory, dev-max-memory, and dev-max-memory-unset, configured with

    ../../configure --silent --prefix=/home/rthompso/src/git/postgres/install/${dir} --with-openssl --with-tcl --with-tclconfig=/usr/lib/tcl8.6 --with-perl --with-libxml --with-libxslt --with-python --with-gssapi --with-systemd --with-ldap --enable-nls

where $dir in master, pg-stats-memory, dev-max-memory, dev-max-memory-unset.

The only change made to the default postgresql.conf was to have the script add to the dev-max-memory instance the line "max_total_backend_memory = 2048" before startup. I did find one change in patch 2 that I pushed back into patch 1; this should only impact the pg-stats-memory instance. My .psqlrc turns timing on.

I created a script where I can pass two instances to be compared. It invokes

    psql -At -d postgres $connstr -P pager=off -c 'select * from generate_series(0, 1000)'

100 times on each of the 2 instances and calculates the AVG time and SD for the 100 runs. It then uses the AVG from each instance to calculate the percentage difference.

Depending on the instance, my results differ from master from negligible to ~5.5%.
Comparing master to itself had up to a ~2% variation. See below.

12 runs comparing dev-max-memory 2048 VS master. Shows ~3% to 5.5% variation.

 run | dev-max-memory 2048 | master  | difference
-----+---------------------+---------+-----------
   1 | 1307.14             | 1240.74 | 5.21218%
   2 | 1315.99             | 1245.64 | 5.4926%
   3 | 1317.39             | 1265.33 | 4.03141%
   4 | 1313.52             | 1256.69 | 4.42221%
   5 | 1329.98             | 1253.75 | 5.90077%
   6 | 1314.47             | 1245.6  | 5.38032%
   7 | 1309.7              | 1258.55 | 3.98326%
   8 | 1322.16             | 1248.94 | 5.69562%
   9 | 1320.15             | 1261.41 | 4.55074%
  10 | 1345.22             | 1280.96 | 4.8938%
  11 | 1296.03             | 1257.06 | 3.05277%
  12 | 1319.5              | 1252.34 | 5.22272%

12 runs showing dev-max-memory-unset VS master. Shows ~2.5% to 5% variation.

 run | dev-max-memory unset | master  | difference
-----+----------------------+---------+-----------
   1 | 1300.93              | 1235.12 | 5.18996%
   2 | 1293.57              | 1263.93 | 2.31789%
   3 | 1303.05              | 1258.11 | 3.50935%

Calculate average runtime percentage difference between VER dev-max-memory unset
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Wed, 2023-04-19 at 23:28 +, Arne Roland wrote:
> Thank you! I just tried our benchmark and got a performance degradation
> of around 28 %, which is way better than the last patch.
>
> The simple query select * from generate_series(0, 1000) shows
> roughly 18.9 % degradation on my test server.
>
> By raising initial_allocation_allowance and
> allocation_allowance_refill_qty I can get it to 16 % degradation.
> So most of the degradation seems to be independent from raising
> the allowance.
>
> I think we probably should investigate this further.
>
> Regards
> Arne

Hi Arne,

Thanks for the feedback.

I'm planning to look at this.

Is your benchmark something that I could utilize? I.e. is it a set of scripts or a standard test from somewhere that I can duplicate?

Thanks,
Reid
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Thank you! I just tried our benchmark and got a performance degradation of around 28 %, which is way better than the last patch.

The simple query select * from generate_series(0, 1000) shows roughly 18.9 % degradation on my test server.

By raising initial_allocation_allowance and allocation_allowance_refill_qty I can get it to 16 % degradation. So most of the degradation seems to be independent from raising the allowance.

I think we probably should investigate this further.

Regards
Arne
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Updated patches attached. Rebased to current master. Added additional columns to pg_stat_global_memory_allocation to summarize backend allocations by type. Updated documentation. Corrected some issues noted in review by John Morris. Added code re EXEC_BACKEND for dev-max-memory branch.

From 34514ae2bebe5e3ab2a0b5b680d3932b5e7706ee Mon Sep 17 00:00:00 2001
From: Reid Thompson
Date: Sat, 4 Jun 2022 22:23:59 -0400
Subject: [PATCH 2/2] Add the ability to limit the amount of memory that can be allocated to backends.

This builds on the work that adds backend memory allocated tracking.

Add GUC variable max_total_backend_memory. Specifies a limit to the amount of memory (in MB) that may be allocated to backends in total (i.e. this is not a per user or per backend limit). If unset, or set to 0, it is disabled. It is intended as a resource to help avoid the OOM killer on Linux and manage resources in general. A backend request that would exhaust max_total_backend_memory will be denied with an out of memory error, causing that backend's current query/transaction to fail. Further requests will not be allocated until dropping below the limit. Keep this in mind when setting this value. Due to the dynamic nature of memory allocations, this limit is not exact. This limit does not affect auxiliary backend processes.

Backend memory allocations are displayed in the pg_stat_memory_allocation and pg_stat_global_memory_allocation views.
---
 doc/src/sgml/config.sgml                      |  30 +++
 doc/src/sgml/monitoring.sgml                  |  38 +++-
 src/backend/catalog/system_views.sql          |   2 +
 src/backend/port/sysv_shmem.c                 |   9 +
 src/backend/postmaster/postmaster.c           |   5 +
 src/backend/storage/ipc/dsm_impl.c            |  18 ++
 src/backend/storage/lmgr/proc.c               |  45 +
 src/backend/utils/activity/backend_status.c   | 183 ++
 src/backend/utils/adt/pgstatfuncs.c           |  16 +-
 src/backend/utils/hash/dynahash.c             |   3 +-
 src/backend/utils/init/miscinit.c             |   8 +
 src/backend/utils/misc/guc_tables.c           |  11 ++
 src/backend/utils/misc/postgresql.conf.sample |   3 +
 src/backend/utils/mmgr/aset.c                 |  33
 src/backend/utils/mmgr/generation.c           |  16 ++
 src/backend/utils/mmgr/slab.c                 |  15 +-
 src/include/catalog/pg_proc.dat               |   6 +-
 src/include/storage/proc.h                    |   7 +
 src/include/utils/backend_status.h            | 120 ++--
 src/test/regress/expected/rules.out           |   4 +-
 20 files changed, 537 insertions(+), 35 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index bcc49aec45..4c735e180f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2133,6 +2133,36 @@ include_dir 'conf.d'
+
+ max_total_backend_memory (integer)
+
+ max_total_backend_memory configuration parameter
+
+
+
+Specifies a limit to the amount of memory (MB) that may be allocated to
+backends in total (i.e. this is not a per user or per backend limit).
+If unset, or set to 0 it is disabled. At database startup
+max_total_backend_memory is reduced by shared_memory_size_mb
+(includes shared buffers and other memory required for initialization).
+Each backend process is initialized with a 1MB local allowance which
+also reduces max_total_bkend_mem_bytes_available. Keep this in mind
+when setting this value. A backend request that would exhaust the limit
+will be denied with an out of memory error causing that backend's
+current query/transaction to fail. Further requests will not be
+allocated until dropping below the limit. This limit does not affect
+auxiliary backend processes
+or the postmaster process.
+Backend memory allocations (allocated_bytes) are
+displayed in the
+pg_stat_memory_allocation
+view. Due to the dynamic nature of memory allocations, this limit is
+not exact.
+
+
+
+
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 70b3441412..704a75bd6e 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -5704,10 +5704,7 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
 Memory currently allocated to this backend in bytes. This is the balance
- of bytes allocated and freed by this backend. Dynamic shared memory
- allocations are included only in the value displayed for the backend that
- created them, they are not included in the value for backends that are
- attached to them to avoid double counting.
+ of bytes allocated and freed by this backend.
@@ -5824,6 +5821,39 @@ SELECT pid, wait_event_type, wait_event
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Updated patches attached.

pg-stat-activity-backend-memory-allocated

DSM allocations created by a process and not destroyed prior to its exit are considered long lived and are tracked in global_dsm_allocated_bytes.

Created 2 new system views (see below):

pg_stat_global_memory_allocation view displays datid, shared_memory_size, shared_memory_size_in_huge_pages, global_dsm_allocated_bytes. shared_memory_size and shared_memory_size_in_huge_pages display the calculated read only values for these GUCs.

pg_stat_memory_allocation view: migrated allocated_bytes out of the pg_stat_activity view into this view. pg_stat_memory_allocation also contains a breakdown of allocation by allocator type (aset, dsm, generation, slab). The view displays datid, pid, allocated_bytes, aset_allocated_bytes, dsm_allocated_bytes, generation_allocated_bytes, slab_allocated_bytes by process.

Reduced calls to initialize allocation counters by moving the initialization call into InitPostmasterChild.

postgres=# select * from pg_stat_global_memory_allocation;
 datid | shared_memory_size | shared_memory_size_in_huge_pages | global_dsm_allocated_bytes
-------+--------------------+----------------------------------+----------------------------
     5 | 192MB              |                               96 |                    1048576
(1 row)

postgres=# select * from pg_stat_memory_allocation;
 datid |  pid   | allocated_bytes | aset_allocated_bytes | dsm_allocated_bytes | generation_allocated_bytes | slab_allocated_bytes
-------+--------+-----------------+----------------------+---------------------+----------------------------+----------------------
       | 981842 |          771512 |               771512 |                   0 |                          0 |                    0
       | 981843 |          736696 |               736696 |                   0 |                          0 |                    0
     5 | 981913 |         4274792 |              4274792 |                   0 |                          0 |                    0
       | 981838 |          107216 |               107216 |                   0 |                          0 |                    0
       | 981837 |          123600 |               123600 |                   0 |                          0 |                    0
       | 981841 |          107216 |               107216 |                   0 |                          0 |                    0
(6 rows)

postgres=# select ps.datid, ps.pid, state, application_name, backend_type, pa.* from pg_stat_activity ps join pg_stat_memory_allocation pa on (pa.pid = ps.pid) order by dsm_allocated_bytes, pa.pid;
 datid |  pid   | state  | application_name |         backend_type         | datid |  pid   | allocated_bytes | aset_allocated_bytes | dsm_allocated_bytes | generation_allocated_bytes | slab_allocated_bytes
-------+--------+--------+------------------+------------------------------+-------+--------+-----------------+----------------------+---------------------+----------------------------+----------------------
       | 981837 |        |                  | checkpointer                 |       | 981837 |          123600 |               123600 |                   0 |                          0 |                    0
       | 981838 |        |                  | background writer            |       | 981838 |          107216 |               107216 |                   0 |                          0 |                    0
       | 981841 |        |                  | walwriter                    |       | 981841 |          107216 |               107216 |                   0 |                          0 |                    0
       | 981842 |        |                  | autovacuum launcher          |       | 981842 |          771512 |               771512 |                   0 |                          0 |                    0
       | 981843 |        |                  | logical replication launcher |       | 981843 |          736696 |               736696 |                   0 |                          0 |                    0
     5 | 981913 | active | psql             | client backend               |     5 | 981913 |         5390864 |              5382824 |                   0 |                       8040 |                    0
(6 rows)

dev-max-memory

Include shared_memory_size in max_total_backend_memory calculations. max_total_backend_memory is reduced by shared_memory_size at startup. The local allowance is refilled, when consumed, from the global max_total_bkend_mem_bytes_available.

pg_stat_global_memory_allocation view: add columns max_total_backend_memory_bytes, max_total_bkend_mem_bytes_available.
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On 2023-03-02 14:41:26 -0500, reid.thomp...@crunchydata.com wrote:
> Patch has been rebased to master.

Quite a few prior review comments seem to not have been addressed. There's not much point in posting new versions without that.

I think there's zero chance 0002 can make it into 16. If 0001 is cleaned up, I can see a path.
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Mon, 2023-02-13 at 16:26 -0800, Andres Freund wrote:
> Hi,
>
> The tests recently started to fail:
> https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest%2F42%2F3867
>
> I marked this as waiting on author.
>
> Greetings,
>
> Andres Freund

Patch has been rebased to master. The memory limiting portion (patch 0002-*) has been refactored to utilize a shared counter for total memory allocation along with backend-local allowances that are initialized at process startup and refilled from the central counter upon being used up. Freed memory is accumulated and returned to the shared counter upon meeting a threshold and/or upon process exit. At this point I arbitrarily picked 1MB as the initial allowance and return threshold.

Thanks,
Reid

From e044bacedab503d1cd732146e1b9947406191bb6 Mon Sep 17 00:00:00 2001
From: Reid Thompson
Date: Sat, 4 Jun 2022 22:23:59 -0400
Subject: [PATCH 2/2] Add the ability to limit the amount of memory that can be allocated to backends.

This builds on the work that adds backend memory allocated to pg_stat_activity.

Add GUC variable max_total_backend_memory. Specifies a limit to the amount of memory (in MB) that may be allocated to backends in total (i.e. this is not a per user or per backend limit). If unset, or set to 0, it is disabled. It is intended as a resource to help avoid the OOM killer on Linux and manage resources in general. A backend request that would push the total over the limit will be denied with an out of memory error, causing that backend's current query/transaction to fail. Due to the dynamic nature of memory allocations, this limit is not exact. If within 1.5MB of the limit and two backends request 1MB each at the same time, both may be allocated and exceed the limit. Further requests will not be allocated until dropping below the limit. Keep this in mind when setting this value. This limit does not affect auxiliary backend processes. Backend memory allocations are displayed in the pg_stat_activity view.
---
 doc/src/sgml/config.sgml                      |  26 ++
 src/backend/postmaster/autovacuum.c           |   8 +-
 src/backend/postmaster/postmaster.c           |  17 +-
 src/backend/postmaster/syslogger.c            |   4 +-
 src/backend/storage/ipc/dsm_impl.c            |  35 ++-
 src/backend/storage/lmgr/proc.c               |   3 +
 src/backend/utils/activity/backend_status.c   | 223 +-
 src/backend/utils/misc/guc_tables.c           |  11 +
 src/backend/utils/misc/postgresql.conf.sample |   3 +
 src/backend/utils/mmgr/aset.c                 |  43 +++-
 src/backend/utils/mmgr/generation.c           |  21 +-
 src/backend/utils/mmgr/slab.c                 |  21 +-
 src/include/storage/proc.h                    |   7 +
 src/include/utils/backend_status.h            | 222 +++--
 14 files changed, 560 insertions(+), 84 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e5c41cc6c6..1bff68b1ec 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2113,6 +2113,32 @@ include_dir 'conf.d'
+
+ max_total_backend_memory (integer)
+
+ max_total_backend_memory configuration parameter
+
+
+
+Specifies a limit to the amount of memory (MB) that may be allocated to
+backends in total (i.e. this is not a per user or per backend limit).
+If unset, or set to 0 it is disabled. A backend request that would
+push the total over the limit will be denied with an out of memory
+error causing that backend's current query/transaction to fail. Due to
+the dynamic nature of memory allocations, this limit is not exact. If
+within 1.5MB of the limit and two backends request 1MB each at the same
+time both may be allocated, and exceed the limit. Further requests will
+not be allocated until dropping below the limit. Keep this in mind when
+setting this value. This limit does not affect auxiliary backend
+processes . Backend memory
+allocations (allocated_bytes) are displayed in the
+pg_stat_activity
+view.
+
+
+
+
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 59c9bcf8c4..ee03d08dd9 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -407,8 +407,8 @@ StartAutoVacLauncher(void)
 #ifndef EXEC_BACKEND
 		case 0:
-			/* Zero allocated bytes to avoid double counting parent allocation */
-			pgstat_zero_my_allocated_bytes();
+			/* Init allocated bytes to avoid double counting parent allocation */
+			pgstat_init_allocated_bytes();
 			/* in postmaster child ... */
 			InitPostmasterChild();
@@ -1488,8 +1488,8 @@ StartAutoVacWorker(void)
 #ifndef EXEC_BACKEND
 		case 0:
-			/* Zero allocated bytes to avoid double counting parent allocation */
-			pgstat_zero_my_allocated_bytes();
+
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi,

On 2023-01-26 15:27:20 -0500, Reid Thompson wrote:
> Yes, just a rebase. There is still work to be done per earlier in the
> thread.

The tests recently started to fail:
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest%2F42%2F3867

> I do want to follow up and note re palloc/pfree vs malloc/free that the
> tracking code (0001-Add-tracking-...) is not tracking palloc/pfree but is
> explicitly tracking malloc/free. Not every palloc/pfree call executes the
> tracking code, only those where the path followed includes malloc() or
> free(). Routine palloc() calls fulfilled from the context's
> freelist/emptyblocks/freeblock/etc and pfree() calls not invoking free()
> avoid the tracking code.

Sure, but we create a lot of memory contexts, so that's not a whole lot of comfort.

I marked this as waiting on author.

Greetings,

Andres Freund
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Regarding the shared counter noted here,

> What you could do is to have a single, imprecise, shared counter for the total
> memory allocation, and have a backend-local "allowance". When the allowance is
> used up, refill it from the shared counter (a single atomic op).

Is there a preferred or suggested location to put variables like this? Perhaps a current variable to use as a reference?

Thanks,
Reid
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Mon, 2023-01-23 at 12:31 -0800, Andres Freund wrote:
> Hi,
>
> I think it's basically still waiting on author, until the O(N) cost is gone
> from the overflow limit check.
>
> Greetings,
>
> Andres Freund

Yes, just a rebase. There is still work to be done per earlier in the thread.

I do want to follow up and note, re palloc/pfree vs malloc/free, that the tracking code (0001-Add-tracking-...) is not tracking palloc/pfree but is explicitly tracking malloc/free. Not every palloc/pfree call executes the tracking code, only those where the path followed includes malloc() or free(). Routine palloc() calls fulfilled from the context's freelist/emptyblocks/freeblock/etc and pfree() calls not invoking free() avoid the tracking code.

Thanks,
Reid
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi,

On 2023-01-23 10:48:38 -0500, Reid Thompson wrote:
> On Thu, 2023-01-19 at 16:50 +0530, vignesh C wrote:
> > The patch does not apply on top of HEAD as in [1], please post a rebased
> > patch:
> >
> > Regards,
> > Vignesh
>
> rebased patch attached

I think it's basically still waiting on author, until the O(N) cost is gone from the overflow limit check.

Greetings,

Andres Freund
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Thu, 2023-01-19 at 16:50 +0530, vignesh C wrote:
> The patch does not apply on top of HEAD as in [1], please post a rebased
> patch:
>
> Regards,
> Vignesh

rebased patch attached

Thanks,
Reid

From b32a346d6e0e00c568e9a285ad15fc2703998c26 Mon Sep 17 00:00:00 2001
From: Reid Thompson
Date: Sat, 4 Jun 2022 22:23:59 -0400
Subject: [PATCH 2/2] Add the ability to limit the amount of memory that can be allocated to backends.

This builds on the work that adds backend memory allocated to pg_stat_activity.

Add GUC variable max_total_backend_memory. Specifies a limit to the amount of memory (in MB) that may be allocated to backends in total (i.e. this is not a per user or per backend limit). If unset, or set to 0 it is disabled. It is intended as a resource to help avoid the OOM killer on LINUX and manage resources in general. A backend request that would push the total over the limit will be denied with an out of memory error causing that backend's current query/transaction to fail. Due to the dynamic nature of memory allocations, this limit is not exact. If within 1.5MB of the limit and two backends request 1MB each at the same time both may be allocated, and exceed the limit. Further requests will not be allocated until dropping below the limit. Keep this in mind when setting this value. This limit does not affect auxiliary backend processes. Backend memory allocations are displayed in the pg_stat_activity view.
---
 doc/src/sgml/config.sgml                      |  26 +
 src/backend/storage/ipc/dsm_impl.c            |  12 ++
 src/backend/utils/activity/backend_status.c   | 108 ++
 src/backend/utils/misc/guc_tables.c           |  11 ++
 src/backend/utils/misc/postgresql.conf.sample |   3 +
 src/backend/utils/mmgr/aset.c                 |  17 +++
 src/backend/utils/mmgr/generation.c           |   9 ++
 src/backend/utils/mmgr/slab.c                 |   9 +-
 src/include/utils/backend_status.h            |   3 +
 9 files changed, 197 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f985afc009..51ed4623be 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2113,6 +2113,32 @@ include_dir 'conf.d'
+
+ max_total_backend_memory (integer)
+
+ max_total_backend_memory configuration parameter
+
+
+
+Specifies a limit to the amount of memory (MB) that may be allocated to
+backends in total (i.e. this is not a per user or per backend limit).
+If unset, or set to 0 it is disabled. A backend request that would
+push the total over the limit will be denied with an out of memory
+error causing that backend's current query/transaction to fail. Due to
+the dynamic nature of memory allocations, this limit is not exact. If
+within 1.5MB of the limit and two backends request 1MB each at the same
+time both may be allocated, and exceed the limit. Further requests will
+not be allocated until dropping below the limit. Keep this in mind when
+setting this value. This limit does not affect auxiliary backend
+processes . Backend memory
+allocations (allocated_bytes) are displayed in the
+pg_stat_activity
+view.
+
+
+
+
diff --git a/src/backend/storage/ipc/dsm_impl.c b/src/backend/storage/ipc/dsm_impl.c
index 22885c7bd2..f7047107d5 100644
--- a/src/backend/storage/ipc/dsm_impl.c
+++ b/src/backend/storage/ipc/dsm_impl.c
@@ -254,6 +254,10 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size,
 		return true;
 	}
+	/* Do not exceed maximum allowed memory allocation */
+	if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size))
+		return false;
+
 	/*
 	 * Create new segment or open an existing one for attach.
 	 *
@@ -525,6 +529,10 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size,
 	int			flags = IPCProtection;
 	size_t		segsize;
+	/* Do not exceed maximum allowed memory allocation */
+	if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size))
+		return false;
+
 	/*
 	 * Allocate the memory BEFORE acquiring the resource, so that we don't
 	 * leak the resource if memory allocation fails.
@@ -719,6 +727,10 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size,
 		return true;
 	}
+	/* Do not exceed maximum allowed memory allocation */
+	if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size))
+		return false;
+
 	/* Create new segment or open an existing one for attach. */
 	if (op == DSM_OP_CREATE)
 	{
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index 7baf2db57d..da2b5fb042 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -45,6 +45,9 @@
 bool		pgstat_track_activities = false;
 int			pgstat_track_activity_query_size =
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Fri, 6 Jan 2023 at 00:19, Reid Thompson wrote:
>
> On Tue, 2023-01-03 at 16:22 +0530, vignesh C wrote:
> >
> > The patch does not apply on top of HEAD as in [1], please post a
> > rebased patch:
> > ...
> > Regards,
> > Vignesh
> >
>
> Attached is rebased patch, with some updates related to committed changes.

The patch does not apply on top of HEAD as in [1], please post a rebased patch:

=== Applying patches on top of PostgreSQL commit ID 48880840f18cb75fcaecc77b5e7816b92c27157b ===
=== applying patch ./0001-Add-tracking-of-backend-memory-allocated-to-pg_stat_.patch
patching file src/test/regress/expected/rules.out
Hunk #2 FAILED at 1875.
Hunk #4 FAILED at 2090.
2 out of 4 hunks FAILED -- saving rejects to file src/test/regress/expected/rules.out.rej

[1] - http://cfbot.cputube.org/patch_41_3867.log

Regards,
Vignesh
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi,

On 2023-01-13 09:15:10 -0500, Reid Thompson wrote:
> On Mon, 2023-01-09 at 18:31 -0800, Andres Freund wrote:
> > > Dynamic shared memory allocations are included only in the value displayed
> > > for the backend that created them, they are not included in the value for
> > > backends that are attached to them to avoid double counting.
> >
> > As mentioned before, I don't think accounting DSM this way makes sense.
>
> Understood, previously you noted 'There are a few uses of DSMs that track
> shared resources, with the biggest likely being the stats for relations
> etc'. I'd like to come up with a solution to address this; identifying the
> long term allocations to shared state and accounting for them such that they
> don't get 'lost' when the allocating backend exits. Any guidance or
> direction would be appreciated.

Tracking it as backend memory usage doesn't seem helpful to me, particularly
because some of it is for server wide data tracking (pgstats, some caches).
But that doesn't mean you couldn't track and report it separately.

> > > +/* --
> > > + * pgstat_get_all_memory_allocated() -
> > > + *
> > > + * Return a uint64 representing the current shared memory allocated to all
> > > + * backends. This looks directly at the BackendStatusArray, and so will
> > > + * provide current information regardless of the age of our transaction's
> > > + * snapshot of the status array.
> > > + * In the future we will likely utilize additional values - perhaps limit
> > > + * backend allocation by user/role, etc.
> > > + * --
> >
> > I think it's completely unfeasible to execute something as expensive as
> > pgstat_get_all_backend_memory_allocated() on every allocation. Like,
> > seriously, no.
>
> Ok. Do we check every nth allocation/try to implement a scheme of checking
> more often as we get closer to the declared max_total_bkend_mem?

I think it's just not acceptable to do O(connections) work as part of
something as critical as memory allocation. Even if amortized imo.

What you could do is to have a single, imprecise, shared counter for the
total memory allocation, and have a backend-local "allowance". When the
allowance is used up, refill it from the shared counter (a single atomic op).

But honestly, I think we first need to have the accounting for a while before
it makes sense to go for the memory limiting patch. And I doubt a single GUC
will suffice to make this usable.

> > And we absolutely definitely shouldn't just add CHECK_FOR_INTERRUPT() calls
> > into the middle of allocator code.
>
> I'm open to guidance/suggestions/pointers to remedying these.

Well, just don't have the CHECK_FOR_INTERRUPT(). Nor the O(N) operation. You
also can't do the ereport(WARNING) there, that itself allocates memory, and
could lead to recursion in some edge cases.

Greetings,

Andres Freund
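The shared-counter-plus-allowance scheme Andres describes could be sketched roughly as follows. This is not from the patch: the names (`shared_remaining`, `local_allowance`, `reserve_backend_mem`) and the refill quantum are hypothetical, and a real implementation would use PostgreSQL's own atomics (`pg_atomic_fetch_sub_u64` etc.) rather than C11 `<stdatomic.h>`. The point is that the common path touches only backend-local state, and the single atomic op happens only when the allowance runs dry:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define ALLOWANCE_REFILL (1024 * 1024)	/* refill quantum: 1 MB, illustrative */

/* Shared, imprecise counter of memory still available to all backends. */
static _Atomic int64_t shared_remaining;

/* Backend-local allowance; consumed without touching shared state. */
static int64_t local_allowance;

/*
 * Try to account for an allocation of "size" bytes.  Most calls only
 * decrement the local allowance; the shared counter is touched (one
 * atomic op) only when the allowance is used up.
 */
static bool
reserve_backend_mem(int64_t size)
{
	if (local_allowance >= size)
	{
		local_allowance -= size;
		return true;
	}

	/* Allowance exhausted: take enough for the request plus a quantum. */
	int64_t		want = size - local_allowance + ALLOWANCE_REFILL;
	int64_t		got = atomic_fetch_sub(&shared_remaining, want);

	if (got < want)
	{
		/*
		 * Over the limit: give the reservation back and fail.  The counter
		 * can briefly go negative here, which is exactly the "imprecise"
		 * part of the scheme.
		 */
		atomic_fetch_add(&shared_remaining, want);
		return false;
	}

	local_allowance += want - size;
	return true;
}
```

A caller in the allocator would turn a `false` return into the out-of-memory error; pfree/deallocation would credit `local_allowance` and spill back into `shared_remaining` once it grows past some threshold, keeping the fast path free of atomics in both directions.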
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Mon, 2023-01-09 at 18:31 -0800, Andres Freund wrote: > Hi, > > On 2023-01-05 13:44:20 -0500, Reid Thompson wrote: > > This new field displays the current bytes of memory allocated to the > > backend process. It is updated as memory for the process is > > malloc'd/free'd. Memory allocated to items on the freelist is included in > > the displayed value. > > It doesn't actually malloc/free. It tracks palloc/pfree. I will update the message > > > Dynamic shared memory allocations are included only in the value displayed > > for the backend that created them, they are not included in the value for > > backends that are attached to them to avoid double counting. > > As mentioned before, I don't think accounting DSM this way makes sense. Understood, previously you noted 'There are a few uses of DSMs that track shared resources, with the biggest likely being the stats for relations etc'. I'd like to come up with a solution to address this; identifying the long term allocations to shared state and accounting for them such that they don't get 'lost' when the allocating backend exits. Any guidance or direction would be appreciated. > > --- a/src/backend/postmaster/autovacuum.c > > +++ b/src/backend/postmaster/autovacuum.c > > @@ -407,6 +407,9 @@ StartAutoVacLauncher(void) > > > > #ifndef EXEC_BACKEND > > case 0: > > + /* Zero allocated bytes to avoid double counting parent > > allocation */ > > + pgstat_zero_my_allocated_bytes(); > > + > > /* in postmaster child ... */ > > InitPostmasterChild(); > > > > > @@ -1485,6 +1488,9 @@ StartAutoVacWorker(void) > > > > #ifndef EXEC_BACKEND > > case 0: > > + /* Zero allocated bytes to avoid double counting parent > > allocation */ > > + pgstat_zero_my_allocated_bytes(); > > + > > /* in postmaster child ... 
*/ > > InitPostmasterChild(); > > > > diff --git a/src/backend/postmaster/postmaster.c > > b/src/backend/postmaster/postmaster.c > > index eac3450774..24278e5c18 100644 > > --- a/src/backend/postmaster/postmaster.c > > +++ b/src/backend/postmaster/postmaster.c > > @@ -4102,6 +4102,9 @@ BackendStartup(Port *port) > > { > > free(bn); > > > > + /* Zero allocated bytes to avoid double counting parent > > allocation */ > > + pgstat_zero_my_allocated_bytes(); > > + > > /* Detangle from postmaster */ > > InitPostmasterChild(); > > > It doesn't at all seem right to call pgstat_zero_my_allocated_bytes() here, > before even InitPostmasterChild() is called. Nor does it seem right to add the > call to so many places. > > Note that this is before we even delete postmaster's memory, see e.g.: > /* >* If the PostmasterContext is still around, recycle the space; we don't >* need it anymore after InitPostgres completes. Note this does not > trash >* *MyProcPort, because ConnCreate() allocated that space with malloc() >* ... else we'd need to copy the Port data first. Also, subsidiary > data >* such as the username isn't lost either; see ProcessStartupPacket(). >*/ > if (PostmasterContext) > { > MemoryContextDelete(PostmasterContext); > PostmasterContext = NULL; > } > > calling pgstat_zero_my_allocated_bytes() before we do this will lead to > undercounting memory usage, afaict. > OK - I'll trace back through these and see if I can better locate and reduce the number of invocations. > > +/* Enum helper for reporting memory allocated bytes */ > > +enum allocation_direction > > +{ > > + PG_ALLOC_DECREASE = -1, > > + PG_ALLOC_IGNORE, > > + PG_ALLOC_INCREASE, > > +}; > > What's the point of this? > > > > +/* -- > > + * pgstat_report_allocated_bytes() - > > + * > > + * Called to report change in memory allocated for this backend. > > + * > > + * my_allocated_bytes initially points to local memory, making it safe to > > call > > + * this before pgstats has been initialized. 
allocation_direction is a > > + * positive/negative multiplier enum defined above. > > + * -- > > + */ > > +static inline void > > +pgstat_report_allocated_bytes(int64 allocated_bytes, int > > allocation_direction) > > I don't think this should take allocation_direction as a parameter, I'd make > it two different functions. Originally it was two functions, a suggestion was made in the thread to maybe consolidate them to a single function with a direction indicator, hence the above. I'm fine with converting it back to separate functions. > > > + if (allocation_direction == PG_ALLOC_DECREASE && > > + pg_sub_u64_overflow(*my_allocated_bytes, allocated_bytes, > > )) > > + { > > + *my_allocated_bytes = 0; > > + ereport(LOG, > > + errmsg("Backend %d deallocated %lld bytes,
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi, On 2023-01-05 13:44:20 -0500, Reid Thompson wrote: > From 0a6b152e0559a25033bd7d43eb0959432e0d Mon Sep 17 00:00:00 2001 > From: Reid Thompson > Date: Thu, 11 Aug 2022 12:01:25 -0400 > Subject: [PATCH 1/2] Add tracking of backend memory allocated to > pg_stat_activity > > This new field displays the current bytes of memory allocated to the > backend process. It is updated as memory for the process is > malloc'd/free'd. Memory allocated to items on the freelist is included in > the displayed value. It doesn't actually malloc/free. It tracks palloc/pfree. > Dynamic shared memory allocations are included only in the value displayed > for the backend that created them, they are not included in the value for > backends that are attached to them to avoid double counting. As mentioned before, I don't think accounting DSM this way makes sense. > --- a/src/backend/postmaster/autovacuum.c > +++ b/src/backend/postmaster/autovacuum.c > @@ -407,6 +407,9 @@ StartAutoVacLauncher(void) > > #ifndef EXEC_BACKEND > case 0: > + /* Zero allocated bytes to avoid double counting parent > allocation */ > + pgstat_zero_my_allocated_bytes(); > + > /* in postmaster child ... */ > InitPostmasterChild(); > @@ -1485,6 +1488,9 @@ StartAutoVacWorker(void) > > #ifndef EXEC_BACKEND > case 0: > + /* Zero allocated bytes to avoid double counting parent > allocation */ > + pgstat_zero_my_allocated_bytes(); > + > /* in postmaster child ... 
*/ > InitPostmasterChild(); > > diff --git a/src/backend/postmaster/postmaster.c > b/src/backend/postmaster/postmaster.c > index eac3450774..24278e5c18 100644 > --- a/src/backend/postmaster/postmaster.c > +++ b/src/backend/postmaster/postmaster.c > @@ -4102,6 +4102,9 @@ BackendStartup(Port *port) > { > free(bn); > > + /* Zero allocated bytes to avoid double counting parent > allocation */ > + pgstat_zero_my_allocated_bytes(); > + > /* Detangle from postmaster */ > InitPostmasterChild(); It doesn't at all seem right to call pgstat_zero_my_allocated_bytes() here, before even InitPostmasterChild() is called. Nor does it seem right to add the call to so many places. Note that this is before we even delete postmaster's memory, see e.g.: /* * If the PostmasterContext is still around, recycle the space; we don't * need it anymore after InitPostgres completes. Note this does not trash * *MyProcPort, because ConnCreate() allocated that space with malloc() * ... else we'd need to copy the Port data first. Also, subsidiary data * such as the username isn't lost either; see ProcessStartupPacket(). */ if (PostmasterContext) { MemoryContextDelete(PostmasterContext); PostmasterContext = NULL; } calling pgstat_zero_my_allocated_bytes() before we do this will lead to undercounting memory usage, afaict. > +/* Enum helper for reporting memory allocated bytes */ > +enum allocation_direction > +{ > + PG_ALLOC_DECREASE = -1, > + PG_ALLOC_IGNORE, > + PG_ALLOC_INCREASE, > +}; What's the point of this? > +/* -- > + * pgstat_report_allocated_bytes() - > + * > + * Called to report change in memory allocated for this backend. > + * > + * my_allocated_bytes initially points to local memory, making it safe to > call > + * this before pgstats has been initialized. allocation_direction is a > + * positive/negative multiplier enum defined above. 
> + * -- > + */ > +static inline void > +pgstat_report_allocated_bytes(int64 allocated_bytes, int > allocation_direction) I don't think this should take allocation_direction as a parameter, I'd make it two different functions. > +{ > + uint64 temp; > + > + /* > + * Avoid *my_allocated_bytes unsigned integer overflow on > + * PG_ALLOC_DECREASE > + */ > + if (allocation_direction == PG_ALLOC_DECREASE && > + pg_sub_u64_overflow(*my_allocated_bytes, allocated_bytes, > )) > + { > + *my_allocated_bytes = 0; > + ereport(LOG, > + errmsg("Backend %d deallocated %lld bytes, > exceeding the %llu bytes it is currently reporting allocated. Setting > reported to 0.", > +MyProcPid, (long long) > allocated_bytes, > +(unsigned long long) > *my_allocated_bytes)); We certainly shouldn't have an ereport in here. This stuff really needs to be cheap. > + } > + else > + *my_allocated_bytes += (allocated_bytes) * allocation_direction; Superfluous parens? > +/* -- > + * pgstat_get_all_memory_allocated() - > + * > + * Return a uint64
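The two-function shape Andres suggests, with the underflow clamp but without the `ereport`, might look like the sketch below. The names are hypothetical, and `my_allocated_bytes` is a plain local pointer standing in for the real backend-status field:

```c
#include <assert.h>
#include <stdint.h>

static uint64_t allocated_bytes_storage = 0;
static uint64_t *my_allocated_bytes = &allocated_bytes_storage;

/* Report newly allocated memory for this backend. */
static inline void
pgstat_report_inc_allocated_bytes(uint64_t bytes)
{
	*my_allocated_bytes += bytes;
}

/* Report freed memory, clamping at zero instead of wrapping around. */
static inline void
pgstat_report_dec_allocated_bytes(uint64_t bytes)
{
	if (*my_allocated_bytes < bytes)
		*my_allocated_bytes = 0;	/* would underflow: clamp silently */
	else
		*my_allocated_bytes -= bytes;
}
```

Splitting increase and decrease into separate functions removes both the direction-multiplier enum and the branch on it, and clamping silently avoids the allocation that an `ereport` in this hot path would itself trigger.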
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Tue, 2023-01-03 at 16:22 +0530, vignesh C wrote: > > The patch does not apply on top of HEAD as in [1], please post a > rebased patch: > ... > Regards, > Vignesh > Attached is rebased patch, with some updates related to committed changes. Thanks, Reid -- Reid Thompson Senior Software Engineer Crunchy Data, Inc. reid.thomp...@crunchydata.com www.crunchydata.com From 69516942b71d5d41850fbc00b971db7476c7a01a Mon Sep 17 00:00:00 2001 From: Reid Thompson Date: Sat, 4 Jun 2022 22:23:59 -0400 Subject: [PATCH 2/2] Add the ability to limit the amount of memory that can be allocated to backends. This builds on the work that adds backend memory allocated to pg_stat_activity. Add GUC variable max_total_backend_memory. Specifies a limit to the amount of memory (in MB) that may be allocated to backends in total (i.e. this is not a per user or per backend limit). If unset, or set to 0 it is disabled. It is intended as a resource to help avoid the OOM killer on LINUX and manage resources in general. A backend request that would push the total over the limit will be denied with an out of memory error causing that backend's current query/transaction to fail. Due to the dynamic nature of memory allocations, this limit is not exact. If within 1.5MB of the limit and two backends request 1MB each at the same time both may be allocated, and exceed the limit. Further requests will not be allocated until dropping below the limit. Keep this in mind when setting this value. This limit does not affect auxiliary backend processes. Backend memory allocations are displayed in the pg_stat_activity view. 
--- doc/src/sgml/config.sgml | 26 + src/backend/storage/ipc/dsm_impl.c| 12 ++ src/backend/utils/activity/backend_status.c | 108 ++ src/backend/utils/misc/guc_tables.c | 11 ++ src/backend/utils/misc/postgresql.conf.sample | 3 + src/backend/utils/mmgr/aset.c | 17 +++ src/backend/utils/mmgr/generation.c | 9 ++ src/backend/utils/mmgr/slab.c | 9 +- src/include/utils/backend_status.h| 3 + 9 files changed, 197 insertions(+), 1 deletion(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 05b3862d09..0362f26451 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -2079,6 +2079,32 @@ include_dir 'conf.d' + + max_total_backend_memory (integer) + + max_total_backend_memory configuration parameter + + + + +Specifies a limit to the amount of memory (MB) that may be allocated to +backends in total (i.e. this is not a per user or per backend limit). +If unset, or set to 0 it is disabled. A backend request that would +push the total over the limit will be denied with an out of memory +error causing that backend's current query/transaction to fail. Due to +the dynamic nature of memory allocations, this limit is not exact. If +within 1.5MB of the limit and two backends request 1MB each at the same +time both may be allocated, and exceed the limit. Further requests will +not be allocated until dropping below the limit. Keep this in mind when +setting this value. This limit does not affect auxiliary backend +processes . Backend memory +allocations (allocated_bytes) are displayed in the +pg_stat_activity +view. 
+ + + + diff --git a/src/backend/storage/ipc/dsm_impl.c b/src/backend/storage/ipc/dsm_impl.c index 22885c7bd2..f7047107d5 100644 --- a/src/backend/storage/ipc/dsm_impl.c +++ b/src/backend/storage/ipc/dsm_impl.c @@ -254,6 +254,10 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size, return true; } + /* Do not exceed maximum allowed memory allocation */ + if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size)) + return false; + /* * Create new segment or open an existing one for attach. * @@ -525,6 +529,10 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size, int flags = IPCProtection; size_t segsize; + /* Do not exceed maximum allowed memory allocation */ + if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size)) + return false; + /* * Allocate the memory BEFORE acquiring the resource, so that we don't * leak the resource if memory allocation fails. @@ -719,6 +727,10 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size, return true; } + /* Do not exceed maximum allowed memory allocation */ + if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size)) + return false; + /* Create new segment or open an existing one for attach. */ if (op == DSM_OP_CREATE) { diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c index 7baf2db57d..da2b5fb042 100644 ---
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Fri, 9 Dec 2022 at 20:41, Reid Thompson wrote:
>
> On Tue, 2022-12-06 at 10:32 -0800, Andres Freund wrote:
> > Hi,
> >
> > On 2022-11-26 22:22:15 -0500, Reid Thompson wrote:
> > > rebased/patched to current master && current pg-stat-activity-
> > > backend-memory-allocated
> >
> > This version fails to build with msvc, and builds with warnings on other
> > platforms.
> > https://cirrus-ci.com/build/5410696721072128
> > msvc:
> >
> > Andres Freund
>
> updated patches

The patch does not apply on top of HEAD as in [1], please post a rebased patch:

=== Applying patches on top of PostgreSQL commit ID 92957ed98c5c565362ce665266132a7f08f6b0c0 ===
=== applying patch ./0001-Add-tracking-of-backend-memory-allocated-to-pg_stat_.patch
...
patching file src/backend/utils/mmgr/slab.c
Hunk #1 succeeded at 69 (offset 16 lines).
Hunk #2 succeeded at 414 (offset 175 lines).
Hunk #3 succeeded at 436 with fuzz 2 (offset 176 lines).
Hunk #4 FAILED at 286.
Hunk #5 succeeded at 488 (offset 186 lines).
Hunk #6 FAILED at 381.
Hunk #7 FAILED at 554.
3 out of 7 hunks FAILED -- saving rejects to file src/backend/utils/mmgr/slab.c.rej

[1] - http://cfbot.cputube.org/patch_41_3867.log

Regards,
Vignesh
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Tue, 2022-12-06 at 10:32 -0800, Andres Freund wrote: > Hi, > > On 2022-11-26 22:22:15 -0500, Reid Thompson wrote: > > rebased/patched to current master && current pg-stat-activity- > > backend-memory-allocated > > This version fails to build with msvc, and builds with warnings on > other > platforms. > https://cirrus-ci.com/build/5410696721072128 > msvc: > > Andres Freund updated patches -- Reid Thompson Senior Software Engineer Crunchy Data, Inc. reid.thomp...@crunchydata.com www.crunchydata.com From e48292eaf402bfe397f1c2fdc1b3efd8cd0a9137 Mon Sep 17 00:00:00 2001 From: Reid Thompson Date: Sat, 4 Jun 2022 22:23:59 -0400 Subject: [PATCH 2/2] Add the ability to limit the amount of memory that can be allocated to backends. This builds on the work that adds backend memory allocated to pg_stat_activity. Add GUC variable max_total_backend_memory. Specifies a limit to the amount of memory (in MB) that may be allocated to backends in total (i.e. this is not a per user or per backend limit). If unset, or set to 0 it is disabled. It is intended as a resource to help avoid the OOM killer on LINUX and manage resources in general. A backend request that would push the total over the limit will be denied with an out of memory error causing that backend's current query/transaction to fail. Due to the dynamic nature of memory allocations, this limit is not exact. If within 1.5MB of the limit and two backends request 1MB each at the same time both may be allocated, and exceed the limit. Further requests will not be allocated until dropping below the limit. Keep this in mind when setting this value. This limit does not affect auxiliary backend processes. Backend memory allocations are displayed in the pg_stat_activity view. 
--- doc/src/sgml/config.sgml | 26 + src/backend/storage/ipc/dsm_impl.c| 12 ++ src/backend/utils/activity/backend_status.c | 108 ++ src/backend/utils/misc/guc_tables.c | 11 ++ src/backend/utils/misc/postgresql.conf.sample | 3 + src/backend/utils/mmgr/aset.c | 17 +++ src/backend/utils/mmgr/generation.c | 9 ++ src/backend/utils/mmgr/slab.c | 8 ++ src/include/utils/backend_status.h| 3 + 9 files changed, 197 insertions(+) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index ff6fcd902a..2899f109ac 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -2079,6 +2079,32 @@ include_dir 'conf.d' + + max_total_backend_memory (integer) + + max_total_backend_memory configuration parameter + + + + +Specifies a limit to the amount of memory (MB) that may be allocated to +backends in total (i.e. this is not a per user or per backend limit). +If unset, or set to 0 it is disabled. A backend request that would +push the total over the limit will be denied with an out of memory +error causing that backend's current query/transaction to fail. Due to +the dynamic nature of memory allocations, this limit is not exact. If +within 1.5MB of the limit and two backends request 1MB each at the same +time both may be allocated, and exceed the limit. Further requests will +not be allocated until dropping below the limit. Keep this in mind when +setting this value. This limit does not affect auxiliary backend +processes . Backend memory +allocations (allocated_bytes) are displayed in the +pg_stat_activity +view. 
+ + + + diff --git a/src/backend/storage/ipc/dsm_impl.c b/src/backend/storage/ipc/dsm_impl.c index 36ef3e425e..58fb690c69 100644 --- a/src/backend/storage/ipc/dsm_impl.c +++ b/src/backend/storage/ipc/dsm_impl.c @@ -254,6 +254,10 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size, return true; } + /* Do not exceed maximum allowed memory allocation */ + if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size)) + return false; + /* * Create new segment or open an existing one for attach. * @@ -525,6 +529,10 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size, int flags = IPCProtection; size_t segsize; + /* Do not exceed maximum allowed memory allocation */ + if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size)) + return false; + /* * Allocate the memory BEFORE acquiring the resource, so that we don't * leak the resource if memory allocation fails. @@ -719,6 +727,10 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size, return true; } + /* Do not exceed maximum allowed memory allocation */ + if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size)) + return false; + /* Create new segment or open an existing one for attach. */ if (op == DSM_OP_CREATE) { diff --git
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi,

On 2022-11-26 22:22:15 -0500, Reid Thompson wrote:
> rebased/patched to current master && current
> pg-stat-activity-backend-memory-allocated

This version fails to build with msvc, and builds with warnings on other
platforms.
https://cirrus-ci.com/build/5410696721072128

msvc:
[20:26:51.286] c:\cirrus\src\include\utils/backend_status.h(40): error C2059: syntax error: 'constant'

mingw cross:
[20:26:26.358] from /usr/share/mingw-w64/include/winsock2.h:23,
[20:26:26.358] from ../../src/include/port/win32_port.h:60,
[20:26:26.358] from ../../src/include/port.h:24,
[20:26:26.358] from ../../src/include/c.h:1306,
[20:26:26.358] from ../../src/include/postgres.h:47,
[20:26:26.358] from controldata_utils.c:18:
[20:26:26.358] ../../src/include/utils/backend_status.h:40:2: error: expected identifier before numeric constant
[20:26:26.358]    40 | IGNORE,
[20:26:26.358]       | ^~
[20:26:26.358] In file included from ../../src/include/postgres.h:48,
[20:26:26.358] from controldata_utils.c:18:
[20:26:26.358] ../../src/include/utils/backend_status.h: In function ‘pgstat_report_allocated_bytes’:
[20:26:26.358] ../../src/include/utils/backend_status.h:365:12: error: format ‘%ld’ expects argument of type ‘long int’, but argument 3 has type ‘uint64’ {aka ‘long long unsigned int’} [-Werror=format=]
[20:26:26.358] 365 | errmsg("Backend %d deallocated %ld bytes, exceeding the %ld bytes it is currently reporting allocated. Setting reported to 0.",
[20:26:26.358]     | ^~~
[20:26:26.358] 366 | MyProcPid, allocated_bytes, *my_allocated_bytes));
[20:26:26.358]     | ~~~
[20:26:26.358]     | |
[20:26:26.358]     | uint64 {aka long long unsigned int}

Due to Windows having long be 32-bit, you need to use %lld. Our custom to
deal with that is to cast the argument to errmsg as long long unsigned and
use %llu.

Btw, given that the argument is uint64, it doesn't seem correct to use %ld,
that's signed. Not that it's going to matter, but ...

Greetings,

Andres Freund
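The portable pattern described above — cast the `uint64` to `unsigned long long` and print it with `%llu`, never `%ld`/`%lu`, since `long` is only 32-bit on Windows — looks like this small sketch, with `snprintf` standing in for `errmsg` and a hypothetical function name:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a uint64 portably: an explicit cast to unsigned long long plus
 * %llu works on every platform, including Windows where long is 32-bit.
 */
static void
format_allocated(char *buf, size_t buflen, uint64_t allocated_bytes)
{
	snprintf(buf, buflen,
			 "backend reports %llu bytes allocated",
			 (unsigned long long) allocated_bytes);
}
```

With a value above 2^32 this prints correctly everywhere, whereas `%ld` would truncate (and misinterpret the sign) on LLP64 platforms.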
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Thu, 2022-11-03 at 11:48 -0400, Reid Thompson wrote: > On Tue, 2022-10-25 at 11:49 -0400, Reid Thompson wrote: > > Rebased to current. Add a couple changes per conversation with D > Christensen (include units in field name, group field with > backend_xid > and backend_xmin fields in pg_stat_activity view, rather than between > query_id and query) > rebased/patched to current master && current pg-stat-activity-backend-memory-allocated -- Reid Thompson Senior Software Engineer Crunchy Data, Inc. reid.thomp...@crunchydata.com www.crunchydata.com From 1470f45e086bef0757cc262d10e08904e46b9a88 Mon Sep 17 00:00:00 2001 From: Reid Thompson Date: Sat, 4 Jun 2022 22:23:59 -0400 Subject: [PATCH 2/2] Add the ability to limit the amount of memory that can be allocated to backends. This builds on the work that adds backend memory allocated to pg_stat_activity. Add GUC variable max_total_backend_memory. Specifies a limit to the amount of memory (in MB) that may be allocated to backends in total (i.e. this is not a per user or per backend limit). If unset, or set to 0 it is disabled. It is intended as a resource to help avoid the OOM killer on LINUX and manage resources in general. A backend request that would push the total over the limit will be denied with an out of memory error causing that backend's current query/transaction to fail. Due to the dynamic nature of memory allocations, this limit is not exact. If within 1.5MB of the limit and two backends request 1MB each at the same time both may be allocated, and exceed the limit. Further requests will not be allocated until dropping below the limit. Keep this in mind when setting this value. This limit does not affect auxiliary backend processes. Backend memory allocations are displayed in the pg_stat_activity view. 
--- doc/src/sgml/config.sgml | 26 + src/backend/storage/ipc/dsm_impl.c| 12 ++ src/backend/utils/activity/backend_status.c | 108 ++ src/backend/utils/misc/guc_tables.c | 11 ++ src/backend/utils/misc/postgresql.conf.sample | 3 + src/backend/utils/mmgr/aset.c | 17 +++ src/backend/utils/mmgr/generation.c | 9 ++ src/backend/utils/mmgr/slab.c | 8 ++ src/include/utils/backend_status.h| 3 + 9 files changed, 197 insertions(+) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 24b1624bad..c2db3ace7a 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -2079,6 +2079,32 @@ include_dir 'conf.d' + + max_total_backend_memory (integer) + + max_total_backend_memory configuration parameter + + + + +Specifies a limit to the amount of memory (MB) that may be allocated to +backends in total (i.e. this is not a per user or per backend limit). +If unset, or set to 0 it is disabled. A backend request that would +push the total over the limit will be denied with an out of memory +error causing that backend's current query/transaction to fail. Due to +the dynamic nature of memory allocations, this limit is not exact. If +within 1.5MB of the limit and two backends request 1MB each at the same +time both may be allocated, and exceed the limit. Further requests will +not be allocated until dropping below the limit. Keep this in mind when +setting this value. This limit does not affect auxiliary backend +processes . Backend memory +allocations (allocated_bytes) are displayed in the +pg_stat_activity +view. 
+ + + + diff --git a/src/backend/storage/ipc/dsm_impl.c b/src/backend/storage/ipc/dsm_impl.c index 65d59fc43e..8d9df676af 100644 --- a/src/backend/storage/ipc/dsm_impl.c +++ b/src/backend/storage/ipc/dsm_impl.c @@ -254,6 +254,10 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size, return true; } + /* Do not exceed maximum allowed memory allocation */ + if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size)) + return false; + /* * Create new segment or open an existing one for attach. * @@ -525,6 +529,10 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size, int flags = IPCProtection; size_t segsize; + /* Do not exceed maximum allowed memory allocation */ + if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size)) + return false; + /* * Allocate the memory BEFORE acquiring the resource, so that we don't * leak the resource if memory allocation fails. @@ -719,6 +727,10 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size, return true; } + /* Do not exceed maximum allowed memory allocation */ + if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size)) + return false; + /* Create new segment or open an existing one for attach. */ if (op == DSM_OP_CREATE)
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Tue, 2022-10-25 at 11:49 -0400, Reid Thompson wrote: > Hi Arne, > > On Mon, 2022-10-24 at 15:27 +, Arne Roland wrote: > > Hello Reid, > > > > could you rebase the patch again? It doesn't apply currently > > (http://cfbot.cputube.org/patch_40_3867.log). Thanks! > > rebased patches attached. Rebased to current. Add a couple changes per conversation with D Christensen (include units in field name, group field with backend_xid and backend_xmin fields in pg_stat_activity view, rather than between query_id and query) -- Reid Thompson Senior Software Engineer Crunchy Data, Inc. reid.thomp...@crunchydata.com www.crunchydata.com From 9cf35c79be107feedb63f6f674ac9d2347d1875e Mon Sep 17 00:00:00 2001 From: Reid Thompson Date: Sat, 4 Jun 2022 22:23:59 -0400 Subject: [PATCH 2/2] Add the ability to limit the amount of memory that can be allocated to backends. This builds on the work that adds backend memory allocated to pg_stat_activity. Add GUC variable max_total_backend_memory. Specifies a limit to the amount of memory (in MB) that may be allocated to backends in total (i.e. this is not a per user or per backend limit). If unset, or set to 0 it is disabled. It is intended as a resource to help avoid the OOM killer on LINUX and manage resources in general. A backend request that would push the total over the limit will be denied with an out of memory error causing that backend's current query/transaction to fail. Due to the dynamic nature of memory allocations, this limit is not exact. If within 1.5MB of the limit and two backends request 1MB each at the same time both may be allocated, and exceed the limit. Further requests will not be allocated until dropping below the limit. Keep this in mind when setting this value. This limit does not affect auxiliary backend processes. Backend memory allocations are displayed in the pg_stat_activity view. 
--- doc/src/sgml/config.sgml | 26 src/backend/storage/ipc/dsm_impl.c| 12 ++ src/backend/utils/activity/backend_status.c | 111 +- src/backend/utils/misc/guc_tables.c | 11 ++ src/backend/utils/misc/postgresql.conf.sample | 3 + src/backend/utils/mmgr/aset.c | 17 +++ src/backend/utils/mmgr/generation.c | 9 ++ src/backend/utils/mmgr/slab.c | 8 ++ src/include/utils/backend_status.h| 2 + 9 files changed, 198 insertions(+), 1 deletion(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 559eb898a9..5762999fa5 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -2079,6 +2079,32 @@ include_dir 'conf.d' + + max_total_backend_memory (integer) + + max_total_backend_memory configuration parameter + + + + +Specifies a limit to the amount of memory (MB) that may be allocated to +backends in total (i.e. this is not a per user or per backend limit). +If unset, or set to 0 it is disabled. A backend request that would +push the total over the limit will be denied with an out of memory +error causing that backend's current query/transaction to fail. Due to +the dynamic nature of memory allocations, this limit is not exact. If +within 1.5MB of the limit and two backends request 1MB each at the same +time both may be allocated, and exceed the limit. Further requests will +not be allocated until dropping below the limit. Keep this in mind when +setting this value. This limit does not affect auxiliary backend +processes . Backend memory +allocations (backend_allocated_bytes) are displayed in +the pg_stat_activity +view. 
+
+
+
+
diff --git a/src/backend/storage/ipc/dsm_impl.c b/src/backend/storage/ipc/dsm_impl.c
index eae03159c3..aaf74e9486 100644
--- a/src/backend/storage/ipc/dsm_impl.c
+++ b/src/backend/storage/ipc/dsm_impl.c
@@ -254,6 +254,10 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size,
 		return true;
 	}
 
+	/* Do not exceed maximum allowed memory allocation */
+	if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size))
+		return false;
+
 	/*
 	 * Create new segment or open an existing one for attach.
 	 *
@@ -525,6 +529,10 @@ dsm_impl_sysv(dsm_op op, dsm_handle handle, Size request_size,
 	int			flags = IPCProtection;
 	size_t		segsize;
 
+	/* Do not exceed maximum allowed memory allocation */
+	if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size))
+		return false;
+
 	/*
 	 * Allocate the memory BEFORE acquiring the resource, so that we don't
 	 * leak the resource if memory allocation fails.
@@ -719,6 +727,10 @@ dsm_impl_windows(dsm_op op, dsm_handle handle, Size request_size,
 		return true;
 	}
 
+	/* Do not exceed maximum allowed memory allocation */
+	if (op == DSM_OP_CREATE && exceeds_max_total_bkend_mem(request_size))
+
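The hunks above gate only DSM_OP_CREATE on a shared total. Because the check and the subsequent accounting are separate steps, two backends can both pass the check when only one request's worth of headroom remains, which is exactly why the documentation describes the limit as inexact. A minimal single-file sketch of that race window follows; all names and the 10MB figure are invented for illustration and are not the patch's code:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Shared counter of bytes reserved across all backends (illustrative). */
static _Atomic uint64_t total_allocated_bytes;
/* Illustrative 10MB cap, standing in for max_total_backend_memory. */
static uint64_t max_total_bytes = 10 * 1024 * 1024;

static bool
reserve_bytes(uint64_t request)
{
	/* Step 1: check. A second backend may also pass this check... */
	if (atomic_load(&total_allocated_bytes) + request > max_total_bytes)
		return false;
	/* Step 2: ...before either executes this add, so the total can
	 * briefly exceed the cap; later requests then see it and fail. */
	atomic_fetch_add(&total_allocated_bytes, request);
	return true;
}
```

A single atomic compare-exchange loop over check-and-add would close the window, at the cost of more contention on the shared counter.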
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hi Arne, On Mon, 2022-10-24 at 15:27 +, Arne Roland wrote: > Hello Reid, > > could you rebase the patch again? It doesn't apply currently > (http://cfbot.cputube.org/patch_40_3867.log). Thanks! rebased patches attached. > You mention, that you want to prevent the compiler from getting > cute.I don't think this comments are exactly helpful in the current > state. I think probably fine to just omit them. I attempted to follow previous convention when adding code and these comments have been consistently applied throughout backend_status.c where a volatile pointer is being used. > I don't understand the purpose of the result variable in > exceeds_max_total_bkend_mem. What purpose does it serve? > > I really like the simplicity of the suggestion here to prevent oom. If max_total_backend_memory is configured, exceeds_max_total_bkend_mem() will return true if an allocation request will push total backend memory allocated over the configured value. exceeds_max_total_bkend_mem() is implemented in the various allocators along the lines of ...snip... /* Do not exceed maximum allowed memory allocation */ if (exceeds_max_total_bkend_mem('new request size')) return NULL; ...snip... Do not allocate the memory requested, return NULL instead. PG already had code in place to handle NULL returns from allocation requests. The allocation code in aset.c, slab.c, generation.c, dsm_impl.c utilizes exceeds_max_total_bkend_mem() max_total_backend_memory (integer) Specifies a limit to the amount of memory (MB) that may be allocated to backends in total (i.e. this is not a per user or per backend limit). If unset, or set to 0 it is disabled. A backend request that would push the total over the limit will be denied with an out of memory error causing that backend's current query/transaction to fail. Due to the dynamic nature of memory allocations, this limit is not exact. 
If within 1.5MB of the limit and two backends request 1MB each at the same time both may be allocated, and exceed the limit. Further requests will not be allocated until dropping below the limit. Keep this in mind when setting this value. This limit does not affect auxiliary backend processes Auxiliary process . Backend memory allocations (backend_mem_allocated) are displayed in the pg_stat_activity view. > I intent to play around with a lot of backends, once I get a rebased > patch. > > Regards > Arne From ab654a48ec7bfbc3bc377c5757a04f1756e72e79 Mon Sep 17 00:00:00 2001 From: Reid Thompson Date: Thu, 11 Aug 2022 12:01:25 -0400 Subject: [PATCH 1/2] Add tracking of backend memory allocated to pg_stat_activity This new field displays the current bytes of memory allocated to the backend process. It is updated as memory for the process is malloc'd/free'd. Memory allocated to items on the freelist is included in the displayed value. Dynamic shared memory allocations are included only in the value displayed for the backend that created them, they are not included in the value for backends that are attached to them to avoid double counting. On occasion, orphaned memory segments may be cleaned up on postmaster startup. This may result in decreasing the sum without a prior increment. We limit the floor of backend_mem_allocated to zero. Updated pg_stat_activity documentation for the new column. 
--- doc/src/sgml/monitoring.sgml| 12 +++ src/backend/catalog/system_views.sql| 1 + src/backend/storage/ipc/dsm_impl.c | 80 +++ src/backend/utils/activity/backend_status.c | 105 src/backend/utils/adt/pgstatfuncs.c | 4 +- src/backend/utils/mmgr/aset.c | 18 src/backend/utils/mmgr/generation.c | 15 +++ src/backend/utils/mmgr/slab.c | 21 src/include/catalog/pg_proc.dat | 6 +- src/include/utils/backend_status.h | 7 +- src/test/regress/expected/rules.out | 9 +- 11 files changed, 269 insertions(+), 9 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index e5d622d514..4983bbc814 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -947,6 +947,18 @@ postgres 27093 0.0 0.0 30096 2752 ?Ss 11:34 0:00 postgres: ser + + + backend_mem_allocated bigint + + + The byte count of memory allocated to this backend. Dynamic shared memory + allocations are included only in the value displayed for the backend that + created them, they are not included in the value for backends that are + attached to them to avoid double counting. + + + query text diff --git a/src/backend/catalog/system_views.sql
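Reid's description earlier in the thread — the allocators ask exceeds_max_total_bkend_mem() and simply return NULL, letting PostgreSQL's existing out-of-memory handling take over — can be sketched roughly as below. This is not the patch's code: the counter, the fixed byte limit, and limited_alloc() are stand-ins, and frees are not modeled:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

static size_t total_bkend_mem = 0;
static size_t max_total_bkend_mem = 8192;	/* bytes; illustrative only */

/* Would this request push the total over the configured limit? */
static bool
exceeds_max_total_bkend_mem(size_t request_size)
{
	/* 0 means the feature is disabled, mirroring the GUC semantics */
	if (max_total_bkend_mem == 0)
		return false;
	return total_bkend_mem + request_size > max_total_bkend_mem;
}

static void *
limited_alloc(size_t size)
{
	/* Do not exceed maximum allowed memory allocation */
	if (exceeds_max_total_bkend_mem(size))
		return NULL;	/* caller handles this like any malloc failure */
	total_bkend_mem += size;
	return malloc(size);
}
```

The appeal of the pattern is that callers already cope with NULL returns, so the limit piggybacks on existing error paths.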
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hello Reid, could you rebase the patch again? It doesn't apply currently (http://cfbot.cputube.org/patch_40_3867.log). Thanks! You mention that you want to prevent the compiler from getting cute. I don't think these comments are exactly helpful in the current state. I think it's probably fine to just omit them. I don't understand the purpose of the result variable in exceeds_max_total_bkend_mem. What purpose does it serve? I really like the simplicity of the suggestion here to prevent OOM. I intend to play around with a lot of backends, once I get a rebased patch. Regards Arne From: Reid Thompson Sent: Thursday, September 15, 2022 4:58:19 PM To: Ibrar Ahmed; pgsql-hackers@lists.postgresql.org Cc: reid.thomp...@crunchydata.com; Justin Pryzby Subject: Re: Add the ability to limit the amount of memory that can be allocated to backends. On Thu, 2022-09-15 at 12:07 +0400, Ibrar Ahmed wrote: > > The patch does not apply; please rebase the patch. > > patching file src/backend/utils/misc/guc.c > Hunk #1 FAILED at 3664. > 1 out of 1 hunk FAILED -- saving rejects to file > src/backend/utils/misc/guc.c.rej > > patching file src/backend/utils/misc/postgresql.conf.sample > rebased patches attached. Thanks, Reid
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Thu, 2022-09-15 at 12:07 +0400, Ibrar Ahmed wrote: > > The patch does not apply; please rebase the patch. > > patching file src/backend/utils/misc/guc.c > Hunk #1 FAILED at 3664. > 1 out of 1 hunk FAILED -- saving rejects to file > src/backend/utils/misc/guc.c.rej > > patching file src/backend/utils/misc/postgresql.conf.sample > rebased patches attached. Thanks, Reid From 0e7010c53508d5a396edd16fd9166abe431f5dbe Mon Sep 17 00:00:00 2001 From: Reid Thompson Date: Thu, 11 Aug 2022 12:01:25 -0400 Subject: [PATCH 1/2] Add tracking of backend memory allocated to pg_stat_activity This new field displays the current bytes of memory allocated to the backend process. It is updated as memory for the process is malloc'd/free'd. Memory allocated to items on the freelist is included in the displayed value. Dynamic shared memory allocations are included only in the value displayed for the backend that created them, they are not included in the value for backends that are attached to them to avoid double counting. On occasion, orphaned memory segments may be cleaned up on postmaster startup. This may result in decreasing the sum without a prior increment. We limit the floor of backend_mem_allocated to zero. Updated pg_stat_activity documentation for the new column. 
--- doc/src/sgml/monitoring.sgml| 12 +++ src/backend/catalog/system_views.sql| 1 + src/backend/storage/ipc/dsm_impl.c | 80 +++ src/backend/utils/activity/backend_status.c | 105 src/backend/utils/adt/pgstatfuncs.c | 4 +- src/backend/utils/mmgr/aset.c | 18 src/backend/utils/mmgr/generation.c | 15 +++ src/backend/utils/mmgr/slab.c | 21 src/include/catalog/pg_proc.dat | 6 +- src/include/utils/backend_status.h | 7 +- src/test/regress/expected/rules.out | 9 +- 11 files changed, 269 insertions(+), 9 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 1d9509a2f6..40ae638f25 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -947,6 +947,18 @@ postgres 27093 0.0 0.0 30096 2752 ?Ss 11:34 0:00 postgres: ser + + + backend_mem_allocated bigint + + + The byte count of memory allocated to this backend. Dynamic shared memory + allocations are included only in the value displayed for the backend that + created them, they are not included in the value for backends that are + attached to them to avoid double counting. 
+ + + query text diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 55f7ec79e0..a78750ab12 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -862,6 +862,7 @@ CREATE VIEW pg_stat_activity AS S.backend_xid, s.backend_xmin, S.query_id, +S.backend_mem_allocated, S.query, S.backend_type FROM pg_stat_get_activity(NULL) AS S diff --git a/src/backend/storage/ipc/dsm_impl.c b/src/backend/storage/ipc/dsm_impl.c index e1b90c5de4..3356bb65b5 100644 --- a/src/backend/storage/ipc/dsm_impl.c +++ b/src/backend/storage/ipc/dsm_impl.c @@ -66,6 +66,7 @@ #include "postmaster/postmaster.h" #include "storage/dsm_impl.h" #include "storage/fd.h" +#include "utils/backend_status.h" #include "utils/guc.h" #include "utils/memutils.h" @@ -232,6 +233,13 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size, name))); return false; } + /* + * Detach and destroy pass through here, only decrease the memory + * shown allocated in pg_stat_activity when the creator destroys the + * allocation. + */ + if (op == DSM_OP_DESTROY) + pgstat_report_backend_mem_allocated_decrease(*mapped_size); *mapped_address = NULL; *mapped_size = 0; if (op == DSM_OP_DESTROY && shm_unlink(name) != 0) @@ -332,6 +340,36 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size, name))); return false; } + + /* + * Attach and create pass through here, only update backend memory + * allocated in pg_stat_activity for the creator process. + */ + if (op == DSM_OP_CREATE) + { + /* + * Posix creation calls dsm_impl_posix_resize implying that resizing + * occurs or may be added in the future. As implemented + * dsm_impl_posix_resize utilizes fallocate or truncate, passing the + * whole new size as input, growing the allocation as needed (only + * truncate supports shrinking). We update by replacing the old + * allocation with the new. 
+	 */
+#if defined(HAVE_POSIX_FALLOCATE) && defined(__linux__)
+	/*
+	 * posix_fallocate does not shrink allocations, adjust only on
+	 * allocation increase.
+	 */
+	if (request_size > *mapped_size)
+	{
+		pgstat_report_backend_mem_allocated_decrease(*mapped_size);
+
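The accounting rule described in the hunk's comment — replace the old allocation figure with the new one, and only on growth, since posix_fallocate cannot shrink — might look like this in isolation. The tracker variable and function name are invented for the sketch:

```c
#include <assert.h>
#include <stddef.h>

/* Running total of tracked DSM bytes for this backend (illustrative). */
static size_t tracked_bytes = 0;

static void
account_dsm_grow(size_t *mapped_size, size_t request_size)
{
	/* posix_fallocate does not shrink allocations; adjust only on growth. */
	if (request_size > *mapped_size)
	{
		tracked_bytes -= *mapped_size;	/* drop the old figure...        */
		tracked_bytes += request_size;	/* ...and count the new, larger one */
		*mapped_size = request_size;
	}
}
```

Replacing rather than adding a delta keeps the counter correct when the same segment is grown repeatedly through the resize path.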
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Mon, Sep 12, 2022 at 8:30 PM Reid Thompson wrote: > On Fri, 2022-09-09 at 12:14 -0500, Justin Pryzby wrote: > > On Sat, Sep 03, 2022 at 11:40:03PM -0400, Reid Thompson wrote: > > > > > + 0, 0, INT_MAX, > > > > > + NULL, NULL, NULL > > > > I think this needs a maximum like INT_MAX/1024/1024 > > > > > > Is this noting that we'd set a ceiling of 2048MB? > > > > The reason is that you're later multiplying it by 1024*1024, so you > > need > > to limit it to avoid overflowing. Compare with > > min_dynamic_shared_memory, Log_RotationSize, maintenance_work_mem, > > autovacuum_work_mem. > > What I originally attempted to implement is: > GUC "max_total_backend_memory" max value as INT_MAX = 2147483647 MB > (2251799812636672 bytes). And the other variables and comparisons as > bytes represented as uint64 to avoid overflow. > > Is this invalid? > > > typo: Explicitely > > corrected > > > +errmsg("request will exceed postgresql.conf > > defined max_total_backend_memory limit (%lu > %lu)", > > > > I wouldn't mention postgresql.conf - it could be in > > postgresql.auto.conf, or an include file, or a -c parameter. > > Suggest: allocation would exceed max_total_backend_memory limit... > > > > updated > > > > > + ereport(LOG, errmsg("decrease reduces reported > > backend memory allocated below zero; setting reported to 0")); > > > > Suggest: deallocation would decrease backend memory below zero; > > updated > > > + {"max_total_backend_memory", PGC_SIGHUP, > > RESOURCES_MEM, > > > > > > > > Should this be PGC_SU_BACKEND to allow a superuser to set a higher > > limit (or no limit)? > > Sounds good to me. I'll update to that. > Would PGC_SUSET be too open? > > > There's compilation warning under mingw cross compile due to > > sizeof(long). See d914eb347 and other recent commits which I guess > > is > > the current way to handle this. 
> > http://cfbot.cputube.org/reid-thompson.html > > updated %lu to %llu and changed cast from uint64 to > unsigned long long in the ereport call > > > For performance test, you'd want to check what happens with a large > > number of max_connections (and maybe a large number of clients). TPS > > isn't the only thing that matters. For example, a utility command > > might > > sometimes do a lot of allocations (or deallocations), or a > > "parameterized nested loop" may loop over over many outer tuples and > > reset for each. There's also a lot of places that reset to a > > "per-tuple" context. I started looking at its performance, but > > nothing > > to show yet. > > Thanks > > > Would you keep people copied on your replies ("reply all") ? > > Otherwise > > I (at least) may miss them. I think that's what's typical on these > > lists (and the list tool is smart enough not to send duplicates to > > people who are direct recipients). > > Ok - will do, thanks. > > -- > Reid Thompson > Senior Software Engineer > Crunchy Data, Inc. > > reid.thomp...@crunchydata.com > www.crunchydata.com > > > The patch does not apply; please rebase the patch. patching file src/backend/utils/misc/guc.c Hunk #1 FAILED at 3664. 1 out of 1 hunk FAILED -- saving rejects to file src/backend/utils/misc/guc.c.rej patching file src/backend/utils/misc/postgresql.conf.sample -- Ibrar Ahmed
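The %lu-to-%llu change quoted above addresses the mingw warning Justin reported: on 64-bit Windows, "long" is 32 bits, so printing a uint64 with %lu truncates the value. A hedged sketch of the portable pattern, with an invented helper name (PostgreSQL itself has its own conventions for this, e.g. casting to unsigned long long or using 64-bit format macros):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit byte count portably: cast to unsigned long long and
 * use %llu, which is at least 64 bits on every platform, rather than
 * %lu, which truncates where long is 32 bits (e.g. 64-bit Windows).
 */
static int
format_bytes(char *buf, size_t buflen, uint64_t bytes)
{
	return snprintf(buf, buflen, "%llu bytes", (unsigned long long) bytes);
}
```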
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Fri, 2022-09-09 at 12:14 -0500, Justin Pryzby wrote: > On Sat, Sep 03, 2022 at 11:40:03PM -0400, Reid Thompson wrote: > > > > + 0, 0, INT_MAX, > > > > + NULL, NULL, NULL > > > I think this needs a maximum like INT_MAX/1024/1024 > > > > Is this noting that we'd set a ceiling of 2048MB? > > The reason is that you're later multiplying it by 1024*1024, so you > need > to limit it to avoid overflowing. Compare with > min_dynamic_shared_memory, Log_RotationSize, maintenance_work_mem, > autovacuum_work_mem. What I originally attempted to implement is: GUC "max_total_backend_memory" max value as INT_MAX = 2147483647 MB (2251799812636672 bytes). And the other variables and comparisons as bytes represented as uint64 to avoid overflow. Is this invalid? > typo: Explicitely corrected > + errmsg("request will exceed postgresql.conf > defined max_total_backend_memory limit (%lu > %lu)", > > I wouldn't mention postgresql.conf - it could be in > postgresql.auto.conf, or an include file, or a -c parameter. > Suggest: allocation would exceed max_total_backend_memory limit... > updated > > + ereport(LOG, errmsg("decrease reduces reported > backend memory allocated below zero; setting reported to 0")); > > Suggest: deallocation would decrease backend memory below zero; updated > + {"max_total_backend_memory", PGC_SIGHUP, > RESOURCES_MEM, > > > > Should this be PGC_SU_BACKEND to allow a superuser to set a higher > limit (or no limit)? Sounds good to me. I'll update to that. Would PGC_SUSET be too open? > There's compilation warning under mingw cross compile due to > sizeof(long). See d914eb347 and other recent commits which I guess > is > the current way to handle this. > http://cfbot.cputube.org/reid-thompson.html updated %lu to %llu and changed cast from uint64 to unsigned long long in the ereport call > For performance test, you'd want to check what happens with a large > number of max_connections (and maybe a large number of clients). TPS > isn't the only thing that matters. 
For example, a utility command > might > sometimes do a lot of allocations (or deallocations), or a > "parameterized nested loop" may loop over over many outer tuples and > reset for each. There's also a lot of places that reset to a > "per-tuple" context. I started looking at its performance, but > nothing > to show yet. Thanks > Would you keep people copied on your replies ("reply all") ? > Otherwise > I (at least) may miss them. I think that's what's typical on these > lists (and the list tool is smart enough not to send duplicates to > people who are direct recipients). Ok - will do, thanks. -- Reid Thompson Senior Software Engineer Crunchy Data, Inc. reid.thomp...@crunchydata.com www.crunchydata.com From 584a04f1b53948049e73165a4ffdd544c950ab0d Mon Sep 17 00:00:00 2001 From: Reid Thompson Date: Thu, 11 Aug 2022 12:01:25 -0400 Subject: [PATCH 1/2] Add tracking of backend memory allocated to pg_stat_activity This new field displays the current bytes of memory allocated to the backend process. It is updated as memory for the process is malloc'd/free'd. Memory allocated to items on the freelist is included in the displayed value. Dynamic shared memory allocations are included only in the value displayed for the backend that created them, they are not included in the value for backends that are attached to them to avoid double counting. On occasion, orphaned memory segments may be cleaned up on postmaster startup. This may result in decreasing the sum without a prior increment. We limit the floor of backend_mem_allocated to zero. Updated pg_stat_activity documentation for the new column. 
--- doc/src/sgml/monitoring.sgml| 12 +++ src/backend/catalog/system_views.sql| 1 + src/backend/storage/ipc/dsm_impl.c | 80 +++ src/backend/utils/activity/backend_status.c | 105 src/backend/utils/adt/pgstatfuncs.c | 4 +- src/backend/utils/mmgr/aset.c | 18 src/backend/utils/mmgr/generation.c | 15 +++ src/backend/utils/mmgr/slab.c | 21 src/include/catalog/pg_proc.dat | 6 +- src/include/utils/backend_status.h | 7 +- src/test/regress/expected/rules.out | 9 +- 11 files changed, 269 insertions(+), 9 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 1d9509a2f6..40ae638f25 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -947,6 +947,18 @@ postgres 27093 0.0 0.0 30096 2752 ?Ss 11:34 0:00 postgres: ser + + + backend_mem_allocated bigint + + + The byte count of memory allocated to this backend.
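The "floor at zero" behavior in the commit message above can be sketched like this (the names are assumed, not taken from the patch). The point is that a decrease may be reported for memory whose increase was never counted — e.g. orphaned DSM segments cleaned up at postmaster startup — so the decrement clamps instead of wrapping the unsigned counter:

```c
#include <assert.h>
#include <stdint.h>

/* Per-backend byte count surfaced in pg_stat_activity (illustrative). */
static uint64_t backend_mem_allocated = 0;

static void
report_mem_allocated_increase(uint64_t bytes)
{
	backend_mem_allocated += bytes;
}

static void
report_mem_allocated_decrease(uint64_t bytes)
{
	if (bytes > backend_mem_allocated)
		backend_mem_allocated = 0;	/* would go below zero: clamp */
	else
		backend_mem_allocated -= bytes;
}
```

Without the clamp, an unmatched decrease would wrap the counter to a huge value, which is far more misleading in a monitoring view than a transient zero.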
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Sat, Sep 03, 2022 at 11:40:03PM -0400, Reid Thompson wrote: > > > + 0, 0, INT_MAX, > > > + NULL, NULL, NULL > > I think this needs a maximum like INT_MAX/1024/1024 > > Is this noting that we'd set a ceiling of 2048MB? The reason is that you're later multiplying it by 1024*1024, so you need to limit it to avoid overflowing. Compare with min_dynamic_shared_memory, Log_RotationSize, maintenance_work_mem, autovacuum_work_mem. typo: Explicitely +errmsg("request will exceed postgresql.conf defined max_total_backend_memory limit (%lu > %lu)", I wouldn't mention postgresql.conf - it could be in postgresql.auto.conf, or an include file, or a -c parameter. Suggest: allocation would exceed max_total_backend_memory limit... + ereport(LOG, errmsg("decrease reduces reported backend memory allocated below zero; setting reported to 0")); Suggest: deallocation would decrease backend memory below zero; + {"max_total_backend_memory", PGC_SIGHUP, RESOURCES_MEM, Should this be PGC_SU_BACKEND to allow a superuser to set a higher limit (or no limit)? There's a compilation warning under mingw cross compile due to sizeof(long). See d914eb347 and other recent commits, which I guess is the current way to handle this. http://cfbot.cputube.org/reid-thompson.html For a performance test, you'd want to check what happens with a large number of max_connections (and maybe a large number of clients). TPS isn't the only thing that matters. For example, a utility command might sometimes do a lot of allocations (or deallocations), or a "parameterized nested loop" may loop over many outer tuples and reset for each. There's also a lot of places that reset to a "per-tuple" context. I started looking at its performance, but nothing to show yet. Would you keep people copied on your replies ("reply all")? Otherwise I (at least) may miss them. I think that's what's typical on these lists (and the list tool is smart enough not to send duplicates to people who are direct recipients). -- Justin
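Justin's INT_MAX/1024/1024 point can be made concrete: a GUC stored as an int in MB overflows 32-bit arithmetic when multiplied out to bytes, so either cap the GUC at INT_MAX/1024/1024 (2047) or widen before multiplying. A small sketch, with an invented function name:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Convert a megabyte-valued GUC (stored as int) to bytes without
 * overflow: the cast to int64_t happens BEFORE the multiply, so the
 * arithmetic is done in 64 bits.  Writing "guc_mb * 1024 * 1024" with
 * no cast would overflow int for any value above 2047.
 */
static int64_t
guc_mb_to_bytes(int guc_mb)
{
	return (int64_t) guc_mb * 1024 * 1024;
}
```

The alternative Justin compares against (min_dynamic_shared_memory and friends) is to set the GUC's declared maximum to INT_MAX/1024/1024 so the product can never overflow in the first place.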
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Greetings, * David Rowley (dgrowle...@gmail.com) wrote: > On Thu, 1 Sept 2022 at 04:52, Reid Thompson > wrote: > > Add the ability to limit the amount of memory that can be allocated to > > backends. > > Are you aware that relcache entries are stored in backend local memory > and that once we've added a relcache entry for a relation that we have > no current code which attempts to reduce the memory consumption used > by cache entries when there's memory pressure? Short answer to this is yes, and that's an issue, but it isn't this patch's problem to deal with- that's an issue that the relcache system needs to be changed to address. > It seems to me that if we had this feature as you propose that a > backend could hit the limit and stay there just from the memory > requirements of the relation cache after some number of tables have > been accessed from the given backend. It's not hard to imagine a > situation where the palloc() would start to fail during parse, which > might make it quite infuriating for anyone trying to do something > like: Agreed that this could happen but I don't imagine it to be super likely- and even if it does, this is probably a better position to be in as the backend could then be disconnected from and would then go away and its memory free'd, unlike the current OOM-killer situation where we crash and go through recovery. We should note this in the documentation though, sure, so that administrators understand how this can occur and can take action to address it. > I think a better solution to this problem would be to have "memory > grants", where we configure some amount of "pool" memory that backends > are allowed to use for queries. The planner would have to add the > expected number of work_mem that the given query is expected to use > and before that query starts, the executor would have to "checkout" > that amount of memory from the pool and return it when finished. 
If > there is not enough memory in the pool then the query would have to > wait until enough memory is available. This creates a deadlocking > hazard that the deadlock detector would need to be made aware of. Sure, that also sounds great and a query acceptance system would be wonderful. If someone is working on that with an expectation of it landing before v16, great. Otherwise, I don't see it as relevant to the question about if we should include this feature or not, and I'm not even sure that we'd refuse this feature even if we already had an acceptance system as a stop-gap should we guess wrong and not realize it until it's too late. Thanks, Stephen signature.asc Description: PGP signature
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Fri, 2022-09-02 at 09:30 +0200, Drouvot, Bertrand wrote: > Hi, > > I'm not sure we are choosing the right victims here (aka the ones > that are doing the request that will push the total over the limit). > > Imagine an extreme case where a single backend consumes say 99% of > the limit, shouldn't it be the one to be "punished"? (and somehow forced > to give the memory back). > > The problem that I see with the current approach is that a "bad" > backend could impact all the others and continue to do so. > > what about punishing, say, the highest consumer, what do you think? > (just speaking about the general idea here, not about the implementation) Initially, we believe that punishing the detector is reasonable if we can help administrators avoid the OOM killer/resource starvation. But we can and should expand on this idea. Another thought is, rather than just failing the query/transaction, to have the affected backend do a clean exit, freeing all its resources. -- Reid Thompson Senior Software Engineer Crunchy Data, Inc. reid.thomp...@crunchydata.com www.crunchydata.com
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Thu, 2022-09-01 at 11:48 +0900, Kyotaro Horiguchi wrote: > > > > The patch seems to limit both memory-context allocations and DSM > > allocations made by a specific process under the same budget. In the > > first place I don't think it's sensible to cap the amount of DSM > > allocations by a per-process budget. > > > > DSM is used by the pgstats subsystem. There can be cases where pgstat > > complains of denial of a DSM allocation after the budget has been > > exhausted by memory-context allocations, or every command complains > > of denial of a memory-context allocation once the per-process > > budget is exhausted by DSM allocations. That doesn't seem > > reasonable. > > regards. It's intended as a mechanism for administrators to limit total PostgreSQL memory consumption to avoid the OOM killer causing a crash and restart, or to ensure that resources are available for other processes on shared hosts, etc. It limits all types of allocations in order to accomplish this. Our documentation will note this, so that administrators who need to set it are aware that it can affect all non-auxiliary processes and what the effect is.
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Wed, 2022-08-31 at 12:34 -0500, Justin Pryzby wrote: > You should name the patches with different prefixes, like > 001,002,003 Otherwise, cfbot may try to apply them in the wrong > order. > git format-patch is the usual tool for that. Thanks for the pointer. My experience with git in the past has been minimal and basic. > > + Specifies a limit to the amount of memory (MB) that may be > > allocated to > > MB are just the default unit, right ? > The user should be allowed to write max_total_backend_memory='2GB' Correct. Default units are MB. Other unit types are converted to MB. > > + causing that backends current query/transaction to fail. > > backend's > > + allocated exceeding the limit. Further requests will not > > allocated, and exceed the limit > > > + if (MyAuxProcType != NotAnAuxProcess) > The double negative is confusing, so could use a comment. > > + elog(WARNING, > I think it should be ereport() rather than elog(), which is > internal-only, and not-translated. Corrected/added the above items. Attached patches with the corrections. > > + 0, 0, INT_MAX, > > + NULL, NULL, NULL > I think this needs a maximum like INT_MAX/1024/1024 Is this noting that we'd set a ceiling of 2048MB? > > + for (i = 1; i <= NumBackendStatSlots; i++) > > + { > > It's looping over every backend for each allocation. > Do you know if there's any performance impact of that ? I'm not very familiar with how to test performance impact; I'm open to suggestions. I have performed the below pgbench tests and noted the basic tps differences in the table.
Test 1: branch master
CFLAGS="-I/usr/include/python3.8/ " /home/rthompso/src/git/postgres/configure --silent --prefix=/home/rthompso/src/git/postgres/install/master --with-openssl --with-tcl --with-tclconfig=/usr/lib/tcl8.6 --with-perl --with-libxml --with-libxslt --with-python --with-gssapi --with-systemd --with-ldap --enable-nls
make -s -j12 && make -s install
initdb, default postgresql.conf settings
init pgbench: pgbench -U rthompso -p 5433 -h localhost -i -s 50 testpgbench
10 iterations: for ctr in {1..10}; do { time pgbench -p 5433 -h localhost -c 10 -j 10 -t 5 testpgbench; } 2>&1 | tee -a pgstatsResultsNoLimitSet; done

Test 2: branch pg-stat-activity-backend-memory-allocated
CFLAGS="-I/usr/include/python3.8/ " /home/rthompso/src/git/postgres/configure --silent --prefix=/home/rthompso/src/git/postgres/install/pg-stats-memory/ --with-openssl --with-tcl --with-tclconfig=/usr/lib/tcl8.6 --with-perl --with-libxml --with-libxslt --with-python --with-gssapi --with-systemd --with-ldap --enable-nls
make -s -j12 && make -s install
initdb, default postgresql.conf settings
init pgbench: pgbench -U rthompso -p 5433 -h localhost -i -s 50 testpgbench
10 iterations: for ctr in {1..10}; do { time pgbench -p 5433 -h localhost -c 10 -j 10 -t 5 testpgbench; } 2>&1 | tee -a pgstatsResultsPg-stats-memory; done

Test 3: branch dev-max-memory
CFLAGS="-I/usr/include/python3.8/ " /home/rthompso/src/git/postgres/configure --silent --prefix=/home/rthompso/src/git/postgres/install/dev-max-memory/ --with-openssl --with-tcl --with-tclconfig=/usr/lib/tcl8.6 --with-perl --with-libxml --with-libxslt --with-python --with-gssapi --with-systemd --with-ldap --enable-nls
make -s -j12 && make -s install
initdb, default postgresql.conf settings
init pgbench: pgbench -U rthompso -p 5433 -h localhost -i -s 50 testpgbench
10 iterations: for ctr in {1..10}; do { time pgbench -p 5433 -h localhost -c 10 -j 10 -t 5 testpgbench; } 2>&1 | tee -a pgstatsResultsDev-max-memory; done

Test 4: branch dev-max-memory
CFLAGS="-I/usr/include/python3.8/ " /home/rthompso/src/git/postgres/configure --silent --prefix=/home/rthompso/src/git/postgres/install/dev-max-memory/ --with-openssl --with-tcl --with-tclconfig=/usr/lib/tcl8.6 --with-perl --with-libxml --with-libxslt --with-python --with-gssapi --with-systemd --with-ldap --enable-nls
make -s -j12 && make -s install
initdb, non-default postgresql.conf setting for max_total_backend_memory = 100MB
init pgbench: pgbench -U rthompso -p 5433 -h localhost -i -s 50 testpgbench
10 iterations: for ctr in {1..10}; do { time pgbench -p 5433 -h localhost -c 10 -j 10 -t 5 testpgbench; } 2>&1 | tee -a pgstatsResultsDev-max-memory100MB; done

Laptop: 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz, 8 cores / 16 threads, 32GB RAM, SSD drive

Averages from the 10 runs and tps difference over the 10 runs

| Test Run | Master | Track Memory Allocated | Diff from Master | Max Mem off | Diff from Master | Max Mem 100MB | Diff from Master |
|----------|--------|------------------------|------------------|-------------|------------------|---------------|------------------|
| Set 1    | Test 1 | Test 2                 |                  | Test 3      |                  | Test 4        |
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Thu, 1 Sept 2022 at 04:52, Reid Thompson wrote:
> Add the ability to limit the amount of memory that can be allocated to
> backends.

Are you aware that relcache entries are stored in backend local memory, and that once we've added a relcache entry for a relation we have no current code which attempts to reduce the memory consumed by cache entries when there's memory pressure? It seems to me that if we had this feature as you propose, a backend could hit the limit and stay there just from the memory requirements of the relation cache after some number of tables have been accessed from the given backend. It's not hard to imagine a situation where palloc() would start to fail during parse, which might make it quite infuriating for anyone trying to do something like:

SET max_total_backend_memory TO 0;

or

ALTER SYSTEM SET max_total_backend_memory TO 0;

I think a better solution to this problem would be to have "memory grants", where we configure some amount of "pool" memory that backends are allowed to use for queries. The planner would have to add up the number of work_mem allocations that the given query is expected to use, and before that query starts, the executor would have to "check out" that amount of memory from the pool and return it when finished. If there is not enough memory in the pool then the query would have to wait until enough memory is available. This creates a deadlocking hazard that the deadlock detector would need to be made aware of.

I know Thomas Munro has mentioned this "memory grant" or "memory pool" feature to me previously and I think he even has some work-in-progress code for it. It's a very tricky problem, however: aside from the deadlocking issue, it requires working out how much memory a given plan will use concurrently. That's not as simple as counting the nodes that use work_mem and summing those up. There is some discussion about the feature in [1]. I was unable to find what Thomas mentioned on the list about this. I've included him here in case he has any extra information to share.

David

[1] https://www.postgresql.org/message-id/flat/20220713222342.GE18011%40telsasoft.com#b4f526aa8f2c893567c1ecf069f9e6c7
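The grant/checkout scheme described above can be sketched as a compare-and-swap loop over a single pool counter. This is only an illustration of the idea, not code from any patch in this thread: the names (`mem_pool_try_checkout`, `POOL_BYTES`) are invented, C11 atomics stand in for PostgreSQL's shared-memory primitives, and exhaustion fails immediately here, whereas the proposal would have the query wait (with the deadlock detector made aware of that wait).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Total "pool" memory that queries may collectively reserve (illustrative). */
#define POOL_BYTES ((uint64_t) 64 * 1024 * 1024)

/* Bytes currently checked out by running queries; would live in shared memory. */
static _Atomic uint64_t pool_used;

/*
 * Try to reserve `want` bytes from the pool before a query starts.  The CAS
 * loop prevents two backends from both claiming the last bytes of the pool.
 * Returns false when the pool is exhausted; a real implementation would
 * instead sleep until memory is returned to the pool.
 */
static bool
mem_pool_try_checkout(uint64_t want)
{
	uint64_t	used = atomic_load(&pool_used);

	for (;;)
	{
		if (used + want > POOL_BYTES)
			return false;		/* would overcommit the pool */
		if (atomic_compare_exchange_weak(&pool_used, &used, used + want))
			return true;		/* reservation succeeded */
		/* CAS failed: `used` now holds the fresh value; retry */
	}
}

/* Return a reservation when the query finishes. */
static void
mem_pool_return(uint64_t got)
{
	atomic_fetch_sub(&pool_used, got);
}
```

Under this scheme the executor would check out roughly (number of work_mem-using nodes) * work_mem before running the plan and return it at query end, though, as noted above, the concurrent peak of a real plan is harder to estimate than that.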
Re: Add the ability to limit the amount of memory that can be allocated to backends.
At Wed, 31 Aug 2022 12:50:19 -0400, Reid Thompson wrote in
> Hi Hackers,
>
> Add the ability to limit the amount of memory that can be allocated to
> backends.

The patch seems to limit both memory-context allocations and DSM allocations made by a given process under the same budget. In the first place, I don't think it's sensible to cap the amount of DSM allocations with a per-process budget. DSM is used by the pgstats subsystem. There can be cases where pgstat complains about a denied DSM allocation after the budget has been exhausted by memory-context allocations, or where every command complains about a denied memory-context allocation once the per-process budget has been exhausted by DSM allocations. That doesn't seem reasonable.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
Re: Add the ability to limit the amount of memory that can be allocated to backends.
On Wed, Aug 31, 2022 at 12:50:19PM -0400, Reid Thompson wrote:
> Hi Hackers,
>
> Add the ability to limit the amount of memory that can be allocated to
> backends.
>
> This builds on the work that adds backend memory allocated to
> pg_stat_activity
> https://www.postgresql.org/message-id/67bb5c15c0489cb499723b0340f16e10c22485ec.camel%40crunchydata.com
> Both patches are attached.

You should name the patches with different prefixes, like 001,002,003. Otherwise, cfbot may try to apply them in the wrong order. git format-patch is the usual tool for that.

> +Specifies a limit to the amount of memory (MB) that may be allocated to

MB is just the default unit, right? The user should be allowed to write max_total_backend_memory='2GB'.

> +backends in total (i.e. this is not a per user or per backend limit).
> +If unset, or set to 0 it is disabled. A backend request that would push
> +the total over the limit will be denied with an out of memory error
> +causing that backends current query/transaction to fail. Due to the dynamic

backend's

> +nature of memory allocations, this limit is not exact. If within 1.5MB of
> +the limit and two backends request 1MB each at the same time both may be
> +allocated exceeding the limit. Further requests will not be allocated until

allocated, and exceed the limit

> +bool
> +exceeds_max_total_bkend_mem(uint64 allocation_request)
> +{
> +	bool		result = false;
> +
> +	if (MyAuxProcType != NotAnAuxProcess)
> +		return result;

The double negative is confusing, so could use a comment.

> +	/* Convert max_total_bkend_mem to bytes for comparison */
> +	if (max_total_bkend_mem &&
> +		pgstat_get_all_backend_memory_allocated() +
> +		allocation_request > (uint64) max_total_bkend_mem * 1024 * 1024)
> +	{
> +		/*
> +		 * Explicitely identify the OOM being a result of this
> +		 * configuration parameter vs a system failure to allocate OOM.
> +		 */
> +		elog(WARNING,
> +			 "request will exceed postgresql.conf defined max_total_backend_memory limit (%lu > %lu)",
> +			 pgstat_get_all_backend_memory_allocated() +
> +			 allocation_request, (uint64) max_total_bkend_mem * 1024 * 1024);

I think it should be ereport() rather than elog(), which is internal-only and not translated.

> +	{"max_total_backend_memory", PGC_SIGHUP, RESOURCES_MEM,
> +		gettext_noop("Restrict total backend memory allocations to this max."),
> +		gettext_noop("0 turns this feature off."),
> +		GUC_UNIT_MB
> +	},
> +	&max_total_bkend_mem,
> +	0, 0, INT_MAX,
> +	NULL, NULL, NULL

I think this needs a maximum like INT_MAX/1024/1024.

> +uint64
> +pgstat_get_all_backend_memory_allocated(void)
> +{
...
> +	for (i = 1; i <= NumBackendStatSlots; i++)
> +	{

It's looping over every backend for each allocation. Do you know if there's any performance impact of that? I think it may be necessary to track the current allocation size in shared memory (with atomic increments?). Maybe decrements would need to be exactly accounted for, or otherwise Assert() that the value is not negative. I don't know how expensive it'd be to have conditionals for each decrement, but maybe the value would only be decremented at strategic times, like at transaction commit or backend shutdown.

--
Justin
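The shared-counter approach suggested above might look roughly like the following. This is a hypothetical sketch, not code from the patch: the names are invented, C11 atomics stand in for PostgreSQL's pg_atomic_* API, and a plain assert() stands in for Assert().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Running total of bytes allocated across all backends; one shared counter
 * replaces the per-slot scan in pgstat_get_all_backend_memory_allocated(). */
static _Atomic uint64_t total_backend_mem;

/* Stand-in for the GUC, converted to bytes; 0 disables the limit. */
static uint64_t max_total_backend_mem = (uint64_t) 10 * 1024 * 1024;

/*
 * Optimistic accounting: add first, then undo if the limit was exceeded.
 * A single fetch-add is O(1) per allocation, at the cost of a brief
 * transient overshoot visible to concurrent readers.
 */
static bool
reserve_backend_mem(uint64_t nbytes)
{
	uint64_t	new_total = atomic_fetch_add(&total_backend_mem, nbytes) + nbytes;

	if (max_total_backend_mem != 0 && new_total > max_total_backend_mem)
	{
		atomic_fetch_sub(&total_backend_mem, nbytes);	/* undo the add */
		return false;
	}
	return true;
}

static void
release_backend_mem(uint64_t nbytes)
{
	uint64_t	old = atomic_fetch_sub(&total_backend_mem, nbytes);

	/* Decrements must balance increments; catch underflow in assert builds. */
	assert(old >= nbytes);
}
```

Batching the decrements (e.g. only at transaction commit or backend exit, as floated above) would cheapen the common path further, in exchange for the counter overstating the true total between those points.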