Hi,

In a normal environment, the number of huge pages on the instance can be adjusted to the value reported by shared_memory_size_in_huge_pages. Postgres can then be started, and the requested shared memory fits in the available huge pages.

A similar approach is harder to implement in environments like Kubernetes. If I want to modify the huge pages on a pod, I need to:
- Modify the host's huge pages
- Restart the host's kubelet so it detects the new amount of huge pages
- Modify the pod's huge page request

Most of those steps are far from practical. An alternative would be to have a fixed number of huge pages (like 25% of the node's memory) and to adjust the configuration, such as the amount of shared_buffers, to fit. However, adjusting the configuration to fit in a fixed amount of memory is tricky:
- shared_buffers is used to auto-tune multiple parameters, so there's no easy formula to get the correct amount. The only way I've found is to basically increase shared_buffers until shared_memory_size_in_huge_pages matches the desired amount of huge pages
- Changing other parameters like max_connections means shared_buffers has to be adjusted again

To help with that, the attached patch provides a new option, huge_pages_autotune_buffers, to automatically use leftover huge pages as shared_buffers. This requires some changes in the auto-tune logic:
- Subsystems that use shared_buffers for auto-tuning will rely on the configured shared_buffers, not the auto-tuned shared_buffers, and they should save the auto-tuned value in a GUC. This will be done in dedicated auto-tune functions.
- Once the auto-tune functions have been called, modifying NBuffers won't change the requested memory except for the shared buffer pool in BufferManagerShmemSize
- We can get the leftover memory (free huge pages - requested memory), and estimate how much shared_buffers we can add
- Increasing shared_buffers will also increase the freelist hashmap, so the auto-tuned shared_buffers needs to be reduced

The patch is split into the following sub-patches:

0001: Extract the current auto-tune logic into dedicated functions, making the behaviour more consistent across subsystems.
0002: The checkpointer auto-tunes the request size using NBuffers, but doesn't save the result in a GUC. This adds a new checkpoint_request_size GUC with the same auto-tune logic.
0003: Extract the HugePages_Free value when /proc/meminfo is parsed in GetHugePageSize.
0004: Pass NBuffers as a parameter to StrategyShmemSize. This is necessary to compute how much additional memory will be used by the freelist, using 'StrategyShmemSize(candidate_nbuffers) - StrategyShmemSize(NBuffers)'.
0005: Add BufferManagerAutotune to auto-tune the amount of shared_buffers.

Regards,
Anthonin Bonnefoy

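As a rough illustration of the auto-tune steps described above (leftover = free huge pages - requested memory, then shrinking the estimate to account for the hashmap growth), here is a minimal standalone sketch. It is not the code from the patches. strategy_shmem_size is a hypothetical stand-in for StrategyShmemSize with a made-up cost model, BLOCK_SIZE assumes the default 8kB block size, only buffer blocks and hashmap growth are counted, and the binary search is my own choice for the "reduce" step rather than what 0005 necessarily does.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE 8192			/* assumption: default 8kB block size */

/*
 * Hypothetical stand-in for StrategyShmemSize(nbuffers). The real function
 * sizes structures that grow with the buffer count; the constants here are
 * invented for illustration only.
 */
static size_t
strategy_shmem_size(size_t nbuffers)
{
	return 1024 + nbuffers * 40;
}

/*
 * Extra shared memory needed to grow the pool from nbuffers to candidate.
 * Simplification: only buffer blocks and the hashmap growth are counted.
 */
static size_t
extra_bytes(size_t nbuffers, size_t candidate)
{
	assert(candidate >= nbuffers);
	return (candidate - nbuffers) * BLOCK_SIZE
		+ (strategy_shmem_size(candidate) - strategy_shmem_size(nbuffers));
}

/*
 * Return the largest buffer count whose extra memory fits in the huge pages
 * left over once requested_bytes has been accounted for.
 */
static size_t
autotune_nbuffers(size_t nbuffers, size_t requested_bytes,
				  size_t free_huge_pages, size_t huge_page_bytes)
{
	size_t		available = free_huge_pages * huge_page_bytes;
	size_t		leftover;
	size_t		lo;
	size_t		hi;

	/* Nothing left over: keep the configured value. */
	if (requested_bytes >= available)
		return nbuffers;
	leftover = available - requested_bytes;

	lo = nbuffers;				/* adding zero buffers always fits */
	hi = nbuffers + leftover / BLOCK_SIZE;	/* bound ignoring hashmap growth */

	/* Binary search for the largest candidate that still fits. */
	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo + 1) / 2;

		if (extra_bytes(nbuffers, mid) <= leftover)
			lo = mid;
		else
			hi = mid - 1;
	}
	return lo;
}
```

With 10 free huge pages of 2MB, 16855520 bytes already requested and 1000 configured buffers, the optimistic estimate would be 1502 buffers, but the hashmap growth under this made-up cost model brings the result down to 1500.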
Attachments:
v1-0001-Create-dedicated-shmem-Autotune-functions.patch
v1-0002-Add-GUC-for-checkpointer-request-queue-size.patch
v1-0003-Extract-HugePages_Free-value-in-GetHugePageSize.patch
v1-0004-Pass-NBuffers-as-parameter-to-StrategyShmemSize.patch
v1-0005-Auto-tune-shared_buffers-to-use-available-huge-pa.patch
