On Sat, Mar 14, 2026 at 2:39 AM Heikki Linnakangas <[email protected]> wrote:
>
> On 06/03/2026 16:12, Heikki Linnakangas wrote:
> > Firstly, I'm not sure what to do with ShmemRegisterHash() and the
> > 'HASHCTL *infoP' argument to it. I feel it'd be nicer if the HASHCTL was
> > just part of the ShmemHashDesc struct, but I'm not sure if that fits all
> > the callers. I'll have to try that out I guess.
> I took a stab at that, and it turned out to be straightforward. I'm not
> sure why I hesitated on that earlier.
>

Yeah. I wondered about that too.

> Here's a new version with that change, and a ton of little comment
> cleanups and such.

Here are initial comments on these patches.

0001

@@ -3236,6 +3239,8 @@ PostmasterStateMachine(void)
LocalProcessControlFile(true);
/* re-create shared memory and semaphores */
+ ResetShmemAllocator();

This name is misleading. The function does not touch ShmemAllocator at
all. Instead it resets the ShmemStruct registry. I suggest
ResetShememRegistry() instead.

+ *
+ * There are two kinds of shared memory data structures: fixed-size structures
+ * and hash tables.

In future we will have resizable "fixed" structures and we may also
have resizable hash tables i.e. hash tables whose directory would be
resizable. The later would be help support resizable shared buffers
lookup table. It will be good to write the above sentence so that we
can just add more types of data structures without needing to rewrite
everything. If we could find a good term for "fixed-size structures"
which are really "structures that require contiguous memory", we will
be able to write the above sentence as "There are two kinds of shared
memory structures: contiguous structures and hash tables.". When we
add resizable structures, we can just add a sentence "A contiguous
structure may be fixed size or resizable". When we add resizable hash
tables, we can just replace that with "Both of these kinds can be
fixed-size or resizable". I am not sure whether "contiguous
structures" is a good term though (for one, word contiguous can be
confused with continuous). Whatever term we use should be something
that we can carry further in the remaining paragraphs.

+ Fixed-size structures contain things like global
+ * variables for a module and should never be allocated after the shared
+ * memory initialization phase.

I think the existing comment is not accurate. The term "global
variables" in the sentence can be confused with process global
variables. We should be using the term "shared variables" or better
"shared state". If we adopt "contiguous structures" as the term for
the first kind of data structure, we can write "Contiguous structures
contain shared state, maintained in a contiguous chunk of memory, for
a module. It should never be allocated after the shared memory
initialization phase.".

+ * postmaster calls ShmemInitRegistered(), which calls the init_fn callbacks
+ * of each registered area, in the order that they were registered.

... calls the init_fn, if any, of each registered area ....

- infoP->dsize = infoP->max_dsize = hash_select_dirsize(max_size);
- infoP->alloc = ShmemAllocNoError;
- hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
+ desc->hash_info.dsize = desc->hash_info.max_dsize =
hash_select_dirsize(desc->max_size);
+ desc->hash_info.alloc = ShmemAllocNoError;
+ desc->hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
/* look it up in the shmem index */

The next several lines of code look up shmem index. Should we remove
this comments or modify it to say "Register and initialize the hash
table".

+HTAB *
+ShmemInitHash(const char *name, /* table string name for shmem index */
+ int64 init_size, /* initial table size */
+ int64 max_size, /* max size of the table */
+ HASHCTL *infoP, /* info about key and bucket size */
+ int hash_flags) /* info about infoP */
+{
+ ShmemHashDesc *desc;
+

... snip ...

+
+ ShmemRegisterHash(desc);
+ return *desc->ptr;
+}

I like the way these functions are written using the new API. I think
we should keep these legacy interface at the end of section of shmem
APIs, rather than keeping those at the end of the file where we have
monitoring and arithmetic functions. If you want to get rid of the
legacy APIs in this release itself, I think it's ok to keep them at
the end of the file.

ShmemInitStruct() now calls ShmemRegisterStruct(). Earlier it could be
called from any backend, in any state to fetch a pointer to a shared
memory structure. It didn't add a new structure. Now it can add a new
structure. I am wondering whether that can cause registry in different
backends to get out of sync. Should we limit the window when it can be
called just like how shmem_request_hook call is limited. In that sense
ShmemRegisterStruct() looks something to be called from a
shmem_register_hook which is also called from EXEC_BACKEND. Sorry to
expand it here rather in my previous reply. In case we replace all the
current calls to ShmemInitStruct() with ShmemRegisterStruct(), we may
be able to get rid of the Shmem Index altogether; after all it's used
only for fetching the pointers to the shared memory areas in
EXEC_BACKEND mode. I thought that we could save the registry in the
shared memory. In EXEC_BACKEND mode, we go over the registry calling
attach_fn for each entry. But since the binary is overwritten in
EXEC_BACKEND case, attach/init fns are not guaranteed to have the same
address in all the backends. Maybe we have to resort to
launch_backend() to transfer the registry to the backend through the
file (to keep it in sycn in all the backends): a solution you may not
like.

+ void **ptr;
+} ShmemStructDesc;


I think the comments for each member should highlight which of these
fields are required (non-zero) and which can be optional (zero'ed
out).

+ */
+ ShmemStructDesc base_desc;

Once we have calculated the base_desc in ShmemRegisterHash() and
called ShmemRegisterStruct(), we don't need base_desc anymore. Even
the pointer to the allocated hash table memory is available through
*ptr. Probably we could just remove this member from here.
ShmemRegisterHash() can declare a variable of type ShmemStructDesc,
populate it based on the members in this structure and pass it to
ShmemRegisterStruct(). I am not comfortable with specification
structure being modified by the registration function.

0003
- pages = (size / FPM_PAGE_SIZE) - first_page;
- FreePageManagerPut(fpm, first_page, pages);
- }
+ ShmemRegisterStruct(&dsm_main_space_shmem_desc);

Shouldn't we be setting dsm_main_space_shmem_desc.size here to size
before calling ShmemRegisterStruct()?

@@ -102,15 +102,14 @@ CalculateShmemSize(void)
size = add_size(size, ShmemRegisteredSize());
size = add_size(size, dsm_estimate_size());

We have defined dsm_main_space_shmem_desc, but we still use
dsm_estimate_size() here and initialize the memory in
dsm_shmem_init(), which is explicitily called from
CreateOrAttachShmemStructs(). Why is that? Shouldn't we be registering
the structure in RegisterShmemStructs(), and let ShmemInitRegistered()
initialize it? Am I missing something here?

I will continue to review the patches further.

-- 
Best Wishes,
Ashutosh Bapat


Reply via email to