On Fri, Feb 13, 2026 at 1:42 AM Andres Freund <[email protected]> wrote:
>
> Hi,
>
> On 2026-02-10 11:02:13 +0530, Ashutosh Bapat wrote:
> > > I still don't see what the point of having multiple mappings and
> > > using memfd is. We need to reserve the address space for the
> > > maximum sized allocation in postmaster, otherwise there's
> > > absolutely no guarantee that it's available at those addresses in
> > > all the children - which you do, as you explain here. Therefore,
> > > the maximum size of each "suballocation" needs to be reserved
> > > ahead of time. At which point I don't see the point of having
> > > multiple mmaps. It just makes things more complicated and
> > > expensive (each mmap makes fork & exit slower).
> > >
> > > Even if we decide to use memfd, because we consider MADV_DONTNEED
> > > to not be suitable for some reason, what's the point of having
> > > more than one mapping using memfd?
>
> (this should reference MADV_REMOVE, not MADV_DONTNEED)
>
>
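To make sure we are talking about the same scheme, here is a minimal
sketch of the reserve-then-back approach, assuming Linux;
reserve_shmem_range() and back_with_memfd() are names I made up for
illustration, not anything from the patches:

    #define _GNU_SOURCE
    #include <stddef.h>
    #include <sys/mman.h>

    /*
     * Reserve address space for the maximum possible size in postmaster,
     * before forking, so every child inherits the range at the same
     * address.  PROT_NONE + MAP_NORESERVE means no memory is committed.
     */
    static void *
    reserve_shmem_range(size_t max_size)
    {
        void *base = mmap(NULL, max_size, PROT_NONE,
                          MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE,
                          -1, 0);
        return (base == MAP_FAILED) ? NULL : base;
    }

    /* Back the currently active prefix with a shared memfd, in place. */
    static int
    back_with_memfd(void *base, int fd, size_t active_size)
    {
        void *p = mmap(base, active_size, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_FIXED, fd, 0);
        return (p == MAP_FAILED) ? -1 : 0;
    }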
> > There are just two mappings now compared to 6 earlier. If I am reading
> > Jakub's benchmarking correctly, even 6 segments didn't show much
> > regression in his benchmarks, so having just two should show even less.
> > With multiple mappings we can control the properties of each segment
> > separately - e.g. use huge pages for some (buffer blocks) and not for
> > others. On Windows it seems easier to create multiple segments than to
> > punch holes in an existing segment. When we port the feature to Windows
> > or other platforms, being able to treat all the segments the same way
> > would be an advantage.
>
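For illustration, "different properties per segment" could look roughly
like this on Linux; a sketch under the assumption that huge pages are
wanted only for buffer blocks, not code from any posted patch:

    #define _GNU_SOURCE
    #include <stddef.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Sizes must be multiples of the respective page size. */
    static void
    create_segments(size_t blocks_size, size_t rest_size)
    {
        /* Buffer blocks: their own memfd backed by huge pages. */
        int blocks_fd = memfd_create("pg_buffer_blocks",
                                     MFD_CLOEXEC | MFD_HUGETLB);
        ftruncate(blocks_fd, blocks_size);
        void *blocks = mmap(NULL, blocks_size, PROT_READ | PROT_WRITE,
                            MAP_SHARED, blocks_fd, 0);

        /* Everything else: a second memfd with regular pages. */
        int rest_fd = memfd_create("pg_shmem_rest", MFD_CLOEXEC);
        ftruncate(rest_fd, rest_size);
        void *rest = mmap(NULL, rest_size, PROT_READ | PROT_WRITE,
                          MAP_SHARED, rest_fd, 0);

        (void) blocks;
        (void) rest;            /* error handling omitted for brevity */
    }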
> > That said, I am not discarding the idea of using a single fd and then
> > punching holes with fallocate() altogether; we can use it if multiple
> > mappings do not bring any advantages. Let's also see how the on-demand
> > shared memory segment feature being discussed with Heikki in this
> > thread shapes up.
>
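And the single-fd alternative, for comparison; again just a sketch
assuming Linux fallocate(2), with an invented helper name:

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <sys/types.h>

    /*
     * Give back the physical memory behind an unused sub-range of a
     * single memfd-backed segment, keeping the mapping and the file
     * size intact.  offset/length should be block-aligned to take
     * effect.
     */
    static int
    shmem_punch_hole(int fd, off_t offset, off_t length)
    {
        return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                         offset, length);
    }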
> I think the multiple memory mappings approach is just too restrictive. If we
> e.g. eventually want to make some of the other major allocations that depend
> on NBuffers react to resizing shared buffers, it's very easy to do if all it
> requires is calling
>    madvise(TYPEALIGN(page_size, start),
>            TYPEALIGN_DOWN(page_size, end) - TYPEALIGN(page_size, start),
>            MADV_REMOVE);
>
> There are several cases that are pretty easy to handle that way:
> - Buffer Blocks
> - Buffer Descriptors
> - Sync request queue (part of the "Checkpointer Data" allocation)
> - Checkpoint BufferIds (for sorting the to-be-checkpointed data)
> - Buffer IO Condition Variables
>
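Spelled out, that one-call resize for any of the regions above could be
a small page-aligned wrapper; shmem_release_range() below is an invented
name, a sketch assuming the single-mapping layout:

    #define _GNU_SOURCE
    #include <stdint.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /*
     * Release the physical pages backing [start, end), touching only
     * the pages fully contained in the range.  The address space stays
     * reserved, so the region can grow again without remapping.
     */
    static int
    shmem_release_range(char *start, char *end)
    {
        uintptr_t page_size = (uintptr_t) sysconf(_SC_PAGESIZE);
        uintptr_t from = ((uintptr_t) start + page_size - 1) & ~(page_size - 1);
        uintptr_t to = (uintptr_t) end & ~(page_size - 1);

        if (to <= from)
            return 0;           /* nothing page-aligned to release */
        return madvise((void *) from, to - from, MADV_REMOVE);
    }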
> But if you want to support making these resizable with the separate mappings
> approach, it gets considerably more complicated and the number of mappings
> increases more substantially.
>
> We'd also need a lot less infrastructure in shmem.c that way. We could
> e.g. make ShmemInitStruct() reserve the entire requested size (to avoid
> OOM-killer issues) and have a ShmemInitStructExt() that lets the caller
> choose whether to reserve. No separate segment IDs etc. are needed.
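The extended API could indeed stay that small; the signature below is
purely hypothetical, mirroring the existing ShmemInitStruct() (Size is
PostgreSQL's size_t typedef):

    #include <stdbool.h>
    #include <stddef.h>

    typedef size_t Size;        /* as in PostgreSQL's c.h */

    /* Existing API: always reserves the full requested size. */
    extern void *ShmemInitStruct(const char *name, Size size,
                                 bool *foundPtr);

    /*
     * Hypothetical extension: 'reserve' chooses between committing the
     * pages up front (avoiding OOM-killer surprises) and merely setting
     * aside address space, to be backed on demand when the structure
     * grows.
     */
    extern void *ShmemInitStructExt(const char *name, Size size,
                                    bool *foundPtr, bool reserve);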

I have started a new thread to discuss resizable shared memory
structures [1] and have copied this discussion over there. We will come
back to this thread and discuss the resizable buffer pool specifically
once the discussion there settles.

[1] 
https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52uij3z4vkd3aj72x8fw6k3k...@mail.gmail.com

--
Best Wishes,
Ashutosh Bapat

