On Thu, Aug 31, 2023 at 9:19 AM John Morris <john.mor...@crunchydata.com>
wrote:

> Here is an updated version of the earlier work.
>
> This version:
>    1) Tracks memory as requested by the backend.
>    2) Includes allocations made during program startup.
>    3) Optimizes the "fast path" to only update two local variables.
>    4) Places a cluster wide limit on total memory allocated.
>
> The cluster wide limit is useful for multi-hosting. One greedy cluster
> doesn't starve
> the other clusters of memory.
>
> Note there isn't a good way to track actual memory used by a cluster.
> Ideally, we like to get the working set size of each memory segment along
> with
> the size of the associated kernel data structures.
> Gathering that info in a portable way is a "can of worms".
> Instead, we're managing memory as requested by the application.
> While not identical, the two approaches are strongly correlated.
>
>  The memory model used is
>    1) Each process is assumed to use a certain amount of memory
>        simply by existing.
>    2) All pg memory allocations are counted, including those before
>        the process is fully initialized.
>     3) Each process maintains its own local counters. These are the
> "truth".
>      4) Periodically,
>         -  local counters are added into the global, shared memory
> counters.
>          - pgstats is updated
>          - total memory is checked.
>
> For efficiency, the global total is an approximation, not a precise number.
> It can be off by as much as 1 MB per process. Memory limiting
> doesn't need precision, just a consistent and reasonable approximation.
>
> Repeating the earlier benchmark test, there is no measurable loss of
> performance.
>
> Hi,
In `InitProcGlobal`:

+            elog(WARNING, "proc init: max_total=%d  result=%d\n",
max_total_bkend_mem, result);

Is the above log for debugging purposes ? Maybe the log level should be
changed.

+
errmsg("max_total_backend_memory %dMB - shared_memory_size %dMB is <=
100MB",

The `<=` in the error message is inconsistent with the `max_total_bkend_mem
< result + 100` check.
Please modify one of them.

For update_global_allocation :

+
Assert((int64)pg_atomic_read_u64(&ProcGlobal->total_bkend_mem_bytes) >= 0);
+
Assert((int64)pg_atomic_read_u64(&ProcGlobal->global_dsm_allocation) >= 0);

Should the assertions be done earlier in the function?

For reserve_backend_memory:

+               return true;
+
+       /* CASE: the new allocation is within bounds. Take the fast path. */
+       else if (my_memory.allocated_bytes + size <= allocation_upper_bound)

The `else` can be omitted (the preceding if block returns).

For `atomic_add_within_bounds_i64`

+               newval = oldval + add;
+
+               /* check if we are out of bounds */
+               if (newval < lower_bound || newval > upper_bound ||
addition_overflow(oldval, add))

Since the summation is stored in `newval`, you can pass newval to
`addition_overflow` so that `addition_overflow` doesn't add them again.

There are debug statements, such as:

+       debug("done\n");

you can drop them in the next patch.

Thanks

Reply via email to