Hi,

Thanks for your comments to v1 patch.

I made v2 patch. Main changes are
* change to NAMEDATALEN
* change to use dynahash from dshash
* remove worker_spi_init()
* create second hash table to find a event id from a name to
  identify uniquness. It enable extensions which don't use share
  memory for their use to define custom wait events because
  WaitEventExtensionNew() will not allocate duplicate wait events.
* create PoC patch to show that extensions, which don't use shared
  memory for their use, can define custom wait events.
 (v2-0002-poc-custom-wait-event-for-dblink.patch)

I'm worrying about
* Is 512(wee_hash_max_size) the maximum number of the custom wait
  events sufficient?
* Is there any way to not force extensions that don't use shared
  memory for their use like dblink to acquire AddinShmemInitLock?;

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
From ec7d2b0ae47be9bedabb4a4127c914bfb8c361b5 Mon Sep 17 00:00:00 2001
From: Masahiro Ikeda <mshr.ik...@ntt.com>
Date: Wed, 9 Aug 2023 19:19:58 +0900
Subject: [PATCH 1/2] Change to manage custom wait events in shared hash

Currently, names of the custom wait event must be registered
per backends. This patch relaxes the constraints by store the
wait events and their names in hash tables in shared memory.

So, all backends can look up the wait event names on
pg_stat_activity view without additional processing by extensions.

In addition, extensions which do not use shared memory for their
use can define custom wait events because WaitEventExtensionNew()
will never allocate new one if the wait event associated to the
name is already allocated. It use hash table to identify uniquness.
---
 doc/src/sgml/monitoring.sgml                  |   7 +-
 doc/src/sgml/xfunc.sgml                       |  10 +-
 src/backend/storage/lmgr/lwlocknames.txt      |   1 +
 src/backend/utils/activity/wait_event.c       | 198 +++++++++++-------
 src/include/utils/wait_event.h                |  17 +-
 .../modules/worker_spi/t/001_worker_spi.pl    |  18 +-
 .../modules/worker_spi/worker_spi--1.0.sql    |   5 -
 src/test/modules/worker_spi/worker_spi.c      |  22 +-
 8 files changed, 137 insertions(+), 141 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index f4fc5d814f..19181832d7 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1121,10 +1121,9 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
      <literal>LWLock</literal> types
      to the list shown in <xref linkend="wait-event-extension-table"/> and
      <xref linkend="wait-event-lwlock-table"/>. In some cases, the name
-     assigned by an extension will not be available in all server processes;
-     so an <literal>Extension</literal> or <literal>LWLock</literal> wait
-     event might be reported as just
-     <quote><literal>extension</literal></quote> rather than the
+     of <literal>LWLock</literal> assigned by an extension will not be
+     available in all server processes; so an wait event might be reported
+     as just <quote><literal>extension</literal></quote> rather than the
      extension-assigned name.
     </para>
    </note>
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index d6345a775b..7fec034db4 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3470,17 +3470,13 @@ void RequestAddinShmemSpace(int size)
 </programlisting>
     </para>
     <para>
-     <literal>shmem_startup_hook</literal> can allocate in shared memory
+     <literal>shmem_startup_hook</literal> can allocate in dynamic shared memory
      custom wait events by calling while holding the LWLock
      <function>AddinShmemInitLock</function> to avoid any race conditions:
 <programlisting>
-uint32 WaitEventExtensionNew(void)
-</programlisting>
-     Next, each process needs to associate the wait event allocated previously
-     to a user-facing custom string, which is something done by calling:
-<programlisting>
-void WaitEventExtensionRegisterName(uint32 wait_event_info, const char *wait_event_name)
+uint32 WaitEventExtensionNew(const char *wait_event_name)
 </programlisting>
+     The wait event is associated to a user-facing custom string.
      An example can be found in <filename>src/test/modules/worker_spi</filename>
      in the PostgreSQL source tree.
     </para>
diff --git a/src/backend/storage/lmgr/lwlocknames.txt b/src/backend/storage/lmgr/lwlocknames.txt
index b34b6afecd..7af3e0ba1a 100644
--- a/src/backend/storage/lmgr/lwlocknames.txt
+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock					44
 # 45 was XactTruncationLock until removal of BackendRandomLock
 WrapLimitsVacuumLock				46
 NotifyQueueTailLock					47
+WaitEventExtensionLock              48
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index b3596ece80..15b2dd06cc 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -45,6 +45,42 @@ uint32	   *my_wait_event_info = &local_my_wait_event_info;
 #define WAIT_EVENT_CLASS_MASK	0xFF000000
 #define WAIT_EVENT_ID_MASK		0x0000FFFF
 
+/*
+ * Hash tables for storing custom wait event ids and their names in
+ * shared memory.
+ *
+ * WaitEventExtensionNameHash is used to find the name from a event id.
+ * It enables all backends look up them without additional processing
+ * per backend like LWLockRegisterTranche().
+ *
+ * WaitEventExtensionIdHash is used to find the event id from a name.
+ * Since it can identify uniquness by the names, extensions that do not
+ * use shared memory also be able to define custom wait events without
+ * defining duplicate wait events.
+ *
+ * The size of hash table is based on assumption. In most of the case,
+ * 16 seem to be sufficient. It seems unlikely that the number of entries
+ * will reach 512.
+ */
+static HTAB *WaitEventExtensionNameHash;	/* find Ids from names */
+static HTAB *WaitEventExtensionIdHash;		/* find names from Ids */
+static const int wee_hash_init_size = 128;
+static const int wee_hash_max_size = 512;
+
+/* hash table entres */
+typedef struct WaitEventExtensionNameEntry
+{
+	uint16	event_id;	/* hash key */
+	char	wait_event_name[NAMEDATALEN]; /* custom wait event name */
+} WaitEventExtensionNameEntry;
+
+typedef struct WaitEventExtensionIdEntry
+{
+	char	wait_event_name[NAMEDATALEN]; /* hash key */
+	uint16	event_id;	/* wait event id */
+} WaitEventExtensionIdEntry;
+
+
 /* dynamic allocation counter for custom wait events in extensions */
 typedef struct WaitEventExtensionCounterData
 {
@@ -59,36 +95,35 @@ static WaitEventExtensionCounterData *WaitEventExtensionCounter;
 #define NUM_BUILTIN_WAIT_EVENT_EXTENSION	\
 	(WAIT_EVENT_EXTENSION_FIRST_USER_DEFINED - WAIT_EVENT_EXTENSION)
 
-/*
- * This is indexed by event ID minus NUM_BUILTIN_WAIT_EVENT_EXTENSION, and
- * stores the names of all dynamically-created event IDs known to the current
- * process.  Any unused entries in the array will contain NULL.
- */
-static const char **WaitEventExtensionNames = NULL;
-static int	WaitEventExtensionNamesAllocated = 0;
-
 static const char *GetWaitEventExtensionIdentifier(uint16 eventId);
 
 /*
- *  Return the space for dynamic allocation counter.
+ *  Return the space for dynamic shared hash tables and dynamic allocation counter.
  */
 Size
 WaitEventExtensionShmemSize(void)
 {
-	return sizeof(WaitEventExtensionCounterData);
+	Size	sz;
+	sz = MAXALIGN(sizeof(WaitEventExtensionCounterData));
+	sz = add_size(sz, hash_estimate_size(wee_hash_init_size,
+										sizeof(WaitEventExtensionNameEntry)));
+	sz = add_size(sz, hash_estimate_size(wee_hash_init_size,
+										sizeof(WaitEventExtensionIdEntry)));
+	return sz;
 }
 
 /*
- * Allocate shmem space for dynamic allocation counter.
+ * Allocate shmem space for dynamic shared hash and dynamic allocation counter.
  */
 void
 WaitEventExtensionShmemInit(void)
 {
 	bool		found;
+	HASHCTL		info;
 
 	WaitEventExtensionCounter = (WaitEventExtensionCounterData *)
 		ShmemInitStruct("WaitEventExtensionCounterData",
-						WaitEventExtensionShmemSize(), &found);
+						MAXALIGN(sizeof(WaitEventExtensionCounterData)), &found);
 
 	if (!found)
 	{
@@ -96,21 +131,63 @@ WaitEventExtensionShmemInit(void)
 		WaitEventExtensionCounter->nextId = NUM_BUILTIN_WAIT_EVENT_EXTENSION;
 		SpinLockInit(&WaitEventExtensionCounter->mutex);
 	}
+
+	/* initialize or atatch the hash tables to store custom wait events */
+	info.keysize = sizeof(uint16);
+	info.entrysize = sizeof(WaitEventExtensionNameEntry);
+	WaitEventExtensionNameHash = ShmemInitHash("Wait Event Extension name hash",
+					wee_hash_init_size,
+					wee_hash_max_size,
+					&info,
+					HASH_ELEM | HASH_BLOBS);
+
+	info.keysize = sizeof(char[NAMEDATALEN]);
+	info.entrysize = sizeof(WaitEventExtensionIdEntry);
+	WaitEventExtensionIdHash = ShmemInitHash("Wait Event Extension id hash",
+					wee_hash_init_size,
+					wee_hash_max_size,
+					&info,
+					HASH_ELEM | HASH_BLOBS);
 }
 
 /*
- * Allocate a new event ID and return the wait event.
+ * Allocate a new event ID and return the wait event info.
+ *
+ * If the wait event name is already defined, this don't
+ * allocate new one, but it return the wait event info
+ * associated to the name.
  */
 uint32
-WaitEventExtensionNew(void)
+WaitEventExtensionNew(const char *wait_event_name)
 {
 	uint16		eventId;
+	bool		found;
+	WaitEventExtensionIdEntry	*id_entry;
+	WaitEventExtensionNameEntry	*name_entry;
 
 	Assert(LWLockHeldByMeInMode(AddinShmemInitLock, LW_EXCLUSIVE));
 
+	/* Check the limit of the length of the event name */
+	if (strlen(wait_event_name) >= NAMEDATALEN)
+		ereport(ERROR,
+				errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+				errmsg("wait event name is too long"));
+
+	/*
+	 * First, check the wait event info associated to the name is
+	 * already defined. Return it if so.
+	 */
+	LWLockAcquire(WaitEventExtensionLock, LW_SHARED);
+	id_entry = (WaitEventExtensionIdEntry *)
+				hash_search(WaitEventExtensionIdHash, &wait_event_name, HASH_FIND, &found);
+	LWLockRelease(WaitEventExtensionLock);
+	if (found)
+		return PG_WAIT_EXTENSION | id_entry->event_id;
+
+	/* Allocate a new event Id */
 	SpinLockAcquire(&WaitEventExtensionCounter->mutex);
 
-	if (WaitEventExtensionCounter->nextId > PG_UINT16_MAX)
+	if (WaitEventExtensionCounter->nextId >= wee_hash_max_size)
 	{
 		SpinLockRelease(&WaitEventExtensionCounter->mutex);
 		ereport(ERROR,
@@ -122,64 +199,22 @@ WaitEventExtensionNew(void)
 
 	SpinLockRelease(&WaitEventExtensionCounter->mutex);
 
-	return PG_WAIT_EXTENSION | eventId;
-}
+	/* Register the new custom wait event in the shared hash table */
+	LWLockAcquire(WaitEventExtensionLock, LW_EXCLUSIVE);
 
-/*
- * Register a dynamic wait event name for extension in the lookup table
- * of the current process.
- *
- * This routine will save a pointer to the wait event name passed as an argument,
- * so the name should be allocated in a backend-lifetime context
- * (shared memory, TopMemoryContext, static constant, or similar).
- *
- * The "wait_event_name" will be user-visible as a wait event name, so try to
- * use a name that fits the style for those.
- */
-void
-WaitEventExtensionRegisterName(uint32 wait_event_info,
-							   const char *wait_event_name)
-{
-	uint32		classId;
-	uint16		eventId;
+	name_entry = (WaitEventExtensionNameEntry *)
+				hash_search(WaitEventExtensionNameHash, &eventId, HASH_ENTER, &found);
+	Assert(!found);
+	strlcpy(name_entry->wait_event_name, wait_event_name, sizeof(name_entry->wait_event_name));
 
-	classId = wait_event_info & WAIT_EVENT_CLASS_MASK;
-	eventId = wait_event_info & WAIT_EVENT_ID_MASK;
+	id_entry = (WaitEventExtensionIdEntry *)
+				hash_search(WaitEventExtensionIdHash, &wait_event_name, HASH_ENTER, &found);
+	Assert(!found);
+	id_entry->event_id = eventId;
 
-	/* Check the wait event class. */
-	if (classId != PG_WAIT_EXTENSION)
-		ereport(ERROR,
-				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-				errmsg("invalid wait event class %u", classId));
+	LWLockRelease(WaitEventExtensionLock);
 
-	/* This should only be called for user-defined wait event. */
-	if (eventId < NUM_BUILTIN_WAIT_EVENT_EXTENSION)
-		ereport(ERROR,
-				errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-				errmsg("invalid wait event ID %u", eventId));
-
-	/* Convert to array index. */
-	eventId -= NUM_BUILTIN_WAIT_EVENT_EXTENSION;
-
-	/* If necessary, create or enlarge array. */
-	if (eventId >= WaitEventExtensionNamesAllocated)
-	{
-		uint32		newalloc;
-
-		newalloc = pg_nextpower2_32(Max(8, eventId + 1));
-
-		if (WaitEventExtensionNames == NULL)
-			WaitEventExtensionNames = (const char **)
-				MemoryContextAllocZero(TopMemoryContext,
-									   newalloc * sizeof(char *));
-		else
-			WaitEventExtensionNames =
-				repalloc0_array(WaitEventExtensionNames, const char *,
-								WaitEventExtensionNamesAllocated, newalloc);
-		WaitEventExtensionNamesAllocated = newalloc;
-	}
-
-	WaitEventExtensionNames[eventId] = wait_event_name;
+	return PG_WAIT_EXTENSION | eventId;
 }
 
 /*
@@ -188,23 +223,28 @@ WaitEventExtensionRegisterName(uint32 wait_event_info,
 static const char *
 GetWaitEventExtensionIdentifier(uint16 eventId)
 {
+	bool found;
+	WaitEventExtensionNameEntry	*entry;
+
 	/* Built-in event? */
 	if (eventId < NUM_BUILTIN_WAIT_EVENT_EXTENSION)
 		return "Extension";
 
+	/* It is a user-defined wait event, so lookup hash table. */
+	LWLockAcquire(WaitEventExtensionLock, LW_SHARED);
+	entry = (WaitEventExtensionNameEntry *)
+				hash_search(WaitEventExtensionNameHash, &eventId, HASH_FIND, &found);
+	LWLockRelease(WaitEventExtensionLock);
+
 	/*
-	 * It is a user-defined wait event, so look at WaitEventExtensionNames[].
-	 * However, it is possible that the name has never been registered by
-	 * calling WaitEventExtensionRegisterName() in the current process, in
-	 * which case give up and return "extension".
+	 * The entry must be stored because it's registered in
+	 * WaitEventExtensionNew().
 	 */
-	eventId -= NUM_BUILTIN_WAIT_EVENT_EXTENSION;
-
-	if (eventId >= WaitEventExtensionNamesAllocated ||
-		WaitEventExtensionNames[eventId] == NULL)
-		return "extension";
+	if (!entry)
+		ereport(ERROR,
+				errmsg("could not find the name for custom wait event ID %u", eventId));
 
-	return WaitEventExtensionNames[eventId];
+	return entry->wait_event_name;
 }
 
 
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index aad8bc08fa..51a0b83b79 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -44,12 +44,13 @@ extern PGDLLIMPORT uint32 *my_wait_event_info;
  * Use this category when the server process is waiting for some condition
  * defined by an extension module.
  *
- * Extensions can define their own wait events in this category.  First,
- * they should call WaitEventExtensionNew() to get one or more wait event
- * IDs that are allocated from a shared counter.  These can be used directly
- * with pgstat_report_wait_start() or equivalent.  Next, each individual
- * process should call WaitEventExtensionRegisterName() to associate a wait
- * event string to the number allocated previously.
+ * Extensions can define their own wait events in this category. They should
+ * call WaitEventExtensionNew() with a wait event string. If the wait event
+ * associated the string is already allocated, it returns the wait event info.
+ * If not, it will get one wait event ID that is allocated from a shared counter,
+ * associate the string to the number in the shared dynamic hash and return the
+ * wait event info. The string can be used directly with pgstat_report_wait_start()
+ * or equivalent.
  */
 typedef enum
 {
@@ -60,9 +61,7 @@ typedef enum
 extern void WaitEventExtensionShmemInit(void);
 extern Size WaitEventExtensionShmemSize(void);
 
-extern uint32 WaitEventExtensionNew(void);
-extern void WaitEventExtensionRegisterName(uint32 wait_event_info,
-										   const char *wait_event_name);
+extern uint32 WaitEventExtensionNew(const char *wait_event_name);
 
 /* ----------
  * pgstat_report_wait_start() -
diff --git a/src/test/modules/worker_spi/t/001_worker_spi.pl b/src/test/modules/worker_spi/t/001_worker_spi.pl
index c3e7f5fbe6..26b8a49bec 100644
--- a/src/test/modules/worker_spi/t/001_worker_spi.pl
+++ b/src/test/modules/worker_spi/t/001_worker_spi.pl
@@ -39,25 +39,11 @@ $node->poll_query_until('postgres',
 $result = $node->safe_psql('postgres', 'SELECT * FROM schema4.counted;');
 is($result, qq(total|1), 'dynamic bgworker correctly consumed tuple data');
 
-# Check the wait event used by the dynamic bgworker.  For a session without
-# the state in shared memory known, the default of "extension" is the value
-# waited on.
+# Check the wait event used by the dynamic bgworker.
 $result = $node->poll_query_until(
 	'postgres',
 	qq[SELECT wait_event FROM pg_stat_activity WHERE backend_type ~ 'worker_spi';],
-	'extension');
-is($result, 1, 'dynamic bgworker has reported "extension" as wait event');
-
-# If the shared memory state is loaded (here with worker_spi_init within
-# the same connection as the one querying pg_stat_activity), the wait
-# event is the custom one.
-# The expected result is a special pattern here with a newline coming from the
-# first query where the shared memory state is set.
-$result = $node->poll_query_until(
-	'postgres',
-	qq[SELECT worker_spi_init(); SELECT wait_event FROM pg_stat_activity WHERE backend_type ~ 'worker_spi';],
-	qq[
-worker_spi_main]);
+	qq[worker_spi_main]);
 is($result, 1,
 	'dynamic bgworker has reported "worker_spi_main" as wait event');
 
diff --git a/src/test/modules/worker_spi/worker_spi--1.0.sql b/src/test/modules/worker_spi/worker_spi--1.0.sql
index f13f7e0f98..e9d5b07373 100644
--- a/src/test/modules/worker_spi/worker_spi--1.0.sql
+++ b/src/test/modules/worker_spi/worker_spi--1.0.sql
@@ -7,8 +7,3 @@ CREATE FUNCTION worker_spi_launch(pg_catalog.int4)
 RETURNS pg_catalog.int4 STRICT
 AS 'MODULE_PATHNAME'
 LANGUAGE C;
-
-CREATE FUNCTION worker_spi_init()
-RETURNS VOID STRICT
-AS 'MODULE_PATHNAME'
-LANGUAGE C;
diff --git a/src/test/modules/worker_spi/worker_spi.c b/src/test/modules/worker_spi/worker_spi.c
index c4317351ce..ee1ce13d50 100644
--- a/src/test/modules/worker_spi/worker_spi.c
+++ b/src/test/modules/worker_spi/worker_spi.c
@@ -44,7 +44,6 @@
 
 PG_MODULE_MAGIC;
 
-PG_FUNCTION_INFO_V1(worker_spi_init);
 PG_FUNCTION_INFO_V1(worker_spi_launch);
 
 PGDLLEXPORT void worker_spi_main(Datum main_arg) pg_attribute_noreturn();
@@ -124,14 +123,10 @@ worker_spi_shmem_init(void)
 
 	/* Define a new wait event */
 	if (!found)
-		wsstate->wait_event = WaitEventExtensionNew();
+		wsstate->wait_event = WaitEventExtensionNew("worker_spi_main");
 
 	LWLockRelease(AddinShmemInitLock);
 
-	/*
-	 * Register the wait event in the lookup table of the current process.
-	 */
-	WaitEventExtensionRegisterName(wsstate->wait_event, "worker_spi_main");
 	return;
 }
 
@@ -434,21 +429,6 @@ _PG_init(void)
 	}
 }
 
-/*
- * Wrapper to initialize a session with the shared memory state
- * used by this module.  This is a convenience routine to be able to
- * see the custom wait event stored in shared memory without loading
- * through shared_preload_libraries.
- */
-Datum
-worker_spi_init(PG_FUNCTION_ARGS)
-{
-	/* Create (if necessary) and attach to our shared memory area. */
-	worker_spi_shmem_init();
-
-	PG_RETURN_VOID();
-}
-
 /*
  * Dynamically launch an SPI worker.
  */
-- 
2.25.1

From 13bfa30d804d84502fd5de40c9ab735869a731d8 Mon Sep 17 00:00:00 2001
From: Masahiro Ikeda <mshr.ik...@ntt.com>
Date: Wed, 9 Aug 2023 19:40:02 +0900
Subject: [PATCH 2/2] poc: custom wait event for dblink

---
 contrib/dblink/dblink.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 41e1f6c91d..1b3dd3736a 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -129,6 +129,7 @@ static void restoreLocalGucs(int nestlevel);
 /* Global */
 static remoteConn *pconn = NULL;
 static HTAB *remoteConnHash = NULL;
+static uint32 *wait_event_info = NULL;
 
 /*
  *	Following is list that holds multiple remote connections.
@@ -203,7 +204,14 @@ dblink_get_conn(char *conname_or_str,
 		dblink_connstr_check(connstr);
 
 		/* OK to make connection */
-		conn = libpqsrv_connect(connstr, WAIT_EVENT_EXTENSION);
+		if (wait_event_info == NULL)
+		{
+			wait_event_info = (uint32 *) MemoryContextAlloc(TopMemoryContext, sizeof(uint32));
+			LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+			*wait_event_info = WaitEventExtensionNew("dblink_get_con");
+			LWLockRelease(AddinShmemInitLock);
+		}
+		conn = libpqsrv_connect(connstr, *wait_event_info);
 
 		if (PQstatus(conn) == CONNECTION_BAD)
 		{
-- 
2.25.1

Reply via email to