Hi Andres, Hi all,

First, thank you for your feedback!

Please find in attachment a patch implementing accumulated wait event stats
only from the backend point of view. As I wrote when I reviewed and rebased the
existing patch, I was uncomfortable with the global approach. I still volunteer
to work/review the original approach is required.

See bellow for comments and some more explanations about what I think might be
improvements over the previous patch.

On Fri, 11 Jun 2021 12:18:07 -0700
Andres Freund <and...@anarazel.de> wrote:

> On 2021-06-05 00:53:44 +0200, Jehan-Guillaume de Rorthais wrote:
> > From 88c2779679c5c9625ca5348eec0543daab5ccab4 Mon Sep 17 00:00:00 2001
> > From: Jehan-Guillaume de Rorthais <j...@dalibo.com>
> > Date: Tue, 1 Jun 2021 13:25:57 +0200
> > Subject: [PATCH 1/2] Add pg_stat_waitaccum view.
> > 
> > pg_stat_waitaccum shows counts and duration of each wait events.
> > Each backend/backgrounds counts and measures the time of wait event
> > in every pgstat_report_wait_start and pgstat_report_wait_end. They
> > store those info into their local variables and send to Statistics
> > Collector. We can get those info via Statistics Collector.
> > 
> > For reducing overhead, I implemented statistic hash instead of
> > dynamic hash. I also implemented track_wait_timing which
> > determines wait event duration is collected or not.  
> 
> I object to adding this overhead. The whole selling point for wait
> events was that they are low overhead. I since spent time reducing the
> overhead further, because even just the branches for checking if
> track_activity is enabled are measurable (225a22b19ed).

Agree. The previous patch I rebased was to review it and reopen this discussion,
I even added a small FIXME in pgstat_report_wait_end and
pgstat_report_wait_start about your work:

  //FIXME: recent patch to speed up this call.

In the patch in attachment, I tried to fix this by using kind of an internal
hook for pgstat_report_wait_start and pgstat_report_wait_end. This allows to
"instrument" wait events only when required, on the fly, dynamically.

Moreover, I removed the hash structure for a simple static array for faster
access.

> > From ddb1adc5cd9acc9bc9de16d0cf057124b09fe1e3 Mon Sep 17 00:00:00 2001
> > From: Jehan-Guillaume de Rorthais <j...@dalibo.com>
> > Date: Fri, 4 Jun 2021 18:14:51 +0200
> > Subject: [PATCH 2/2] [POC] Change measuring method of wait event time from
> >  INSTR_TIME to rdtsc.
> > 
> > This patch changes measuring method of wait event time from INSTR_TIME
> > (which uses gettimeofday or clock_gettime) to rdtsc. This might reduce the
> > overhead of measuring overhead.
> > 
> > Any supports like changing clock cycle to actual time or error handling are
> > not currently implemented.  
> 
> rdtsc is a serializing (*) instruction - that's the expensive part. On linux
> clock_gettime() doesn't actually need a syscall. While the vdso call
> implies a bit of overhead over a raw rdtsc, it's a relatively small part
> of the overhead.  See
> https://www.postgresql.org/message-id/20200612232810.f46nbqkdhbutzqdg%40alap3.anarazel.de

I choose to remove all this rdtsc part from my patch as this wasn't clear how
much faster it was compare to simpler vdso functions and how to accurately
extract a human time.


About my take on $subject, for the sake of simplicity of this PoC, I added
instrumentation to log_statement_stats. Despite the query context of the
reported log, they are really accumulated stats.

The patch updated pg_stat_get_waitaccum() as well to be able to report the
accumulated wait events from your interactive or batch session.

So using my previous fe-time demo client, you can test it using:

  PGOPTIONS="--log_statement_stats=on" ./fe-time 100

From logs, I now have (notice the last line):

  LOG:  duration: 3837.194 ms  statement: SELECT * FROM pgbench_accounts
  LOG:  QUERY STATISTICS
  DETAIL:  ! system usage stats:
           !       0.087444 s user, 0.002106 s system, 3.837202 s elapsed
           !       [0.087444 s user, 0.003974 s system total]
           !       25860 kB max resident size
           !       0/0 [0/0] filesystem blocks in/out
           !       0/303 [0/697] page faults/reclaims, 0 [0] swaps
           !       0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
           !       4/18 [5/18] voluntary/involuntary context switches
           !       Client/ClientWrite 4 calls, 3747102 us elapsed


Using pgbench scale factor 10, the copy query for pgbench_accounts looks like:

  LOG:  duration: 2388.081 ms  statement: copy pgbench_accounts from stdin
  LOG:  QUERY STATISTICS
  DETAIL:  ! system usage stats:
           !       1.373756 s user, 0.252860 s system, 2.388100 s elapsed
           !       [1.397015 s user, 0.264951 s system total]
           !       37788 kB max resident size
           !       0/641584 [0/642056] filesystem blocks in/out
           !       194/4147 [195/4728] page faults/reclaims, 0 [0] swaps
           !       0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
           !       3263/92 [3390/102] voluntary/involuntary context switches
           !       LWLock/WALBufMapping 3 calls, 654 us elapsed
           !       LWLock/WALWrite 3 calls, 60680 us elapsed
           !       LWLock/CheckpointerComm 3 calls, 318 us elapsed
           !       Client/ClientRead 1 calls, 151 us elapsed
           !       IO/DataFileExtend 16397 calls, 94000 us elapsed
           !       IO/DataFileWrite 14346 calls, 45911 us elapsed
           !       IO/WALInitSync 6 calls, 334840 us elapsed
           !       IO/WALInitWrite 6 calls, 48040 us elapsed
           !       IO/WALSync 17 calls, 353334 us elapsed
           !       IO/WALWrite 8362 calls, 38401 us elapsed

Now, I'm on a fence about the user facing interaction. I'll keep thinking about
this and report this week. In the meantime, any feedback about the current
implementation backbone is welcome.

Thanks!

Regards,
>From e3d1f39138babfebd28a92f4dd8136eeb173d2c2 Mon Sep 17 00:00:00 2001
From: Jehan-Guillaume de Rorthais <j...@dalibo.com>
Date: Fri, 11 Jun 2021 18:17:47 +0200
Subject: [PATCH v1] PoC: Trace wait events to logfile when
 log_statement_stats=on

---
 src/backend/storage/lmgr/lwlock.c       |  19 +++
 src/backend/storage/lmgr/proc.c         |   2 +
 src/backend/tcop/postgres.c             |  48 ++++++-
 src/backend/utils/activity/wait_event.c | 158 ++++++++++++++++++++++++
 src/backend/utils/adt/pgstatfuncs.c     |  74 +++++++++++
 src/include/catalog/pg_proc.dat         |   9 ++
 src/include/storage/lock.h              |   1 +
 src/include/storage/lwlock.h            |   1 +
 src/include/utils/wait_event.h          |  77 ++++++++++++
 9 files changed, 387 insertions(+), 2 deletions(-)

diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 55b9d7970e..e9a120805b 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -634,6 +634,25 @@ LWLockNewTrancheId(void)
 	return result;
 }
 
+/*
+ * Get a last tranche ID.
+ */
+int
+LWLockGetLastTrancheId(void)
+{
+	int			result;
+	int		   *LWLockCounter;
+
+	Assert(!lock_named_request_allowed);
+
+	LWLockCounter = (int *) ((char *) MainLWLockArray - sizeof(int));
+	SpinLockAcquire(ShmemLock);
+	result = *LWLockCounter;
+	SpinLockRelease(ShmemLock);
+
+	return result;
+}
+
 /*
  * Register a dynamic tranche name in the lookup table of the current process.
  *
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 2575ea1ca0..52891c26a9 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -450,6 +450,8 @@ InitProcess(void)
 
 	/* now that we have a proc, report wait events to shared memory */
 	pgstat_set_wait_event_storage(&MyProc->wait_event_info);
+	/* init wait event tracking structure*/
+	pgstat_init_waitaccums();
 
 	/*
 	 * We might be reusing a semaphore that belonged to a failed process. So
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 8cea10c901..25500e8f76 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -207,7 +207,6 @@ static void log_disconnections(int code, Datum arg);
 static void enable_statement_timeout(void);
 static void disable_statement_timeout(void);
 
-
 /* ----------------------------------------------------------------
  *		routines to obtain user input
  * ----------------------------------------------------------------
@@ -978,7 +977,10 @@ exec_simple_query(const char *query_string)
 	 * results because ResetUsage wasn't called.
 	 */
 	if (save_log_statement_stats)
+	{
 		ResetUsage();
+		pgstat_set_report_waits();
+	}
 
 	/*
 	 * Start up a transaction command.  All queries generated by the
@@ -1308,7 +1310,10 @@ exec_simple_query(const char *query_string)
 	}
 
 	if (save_log_statement_stats)
+	{
 		ShowUsage("QUERY STATISTICS");
+		pgstat_reset_report_waits();
+	}
 
 	TRACE_POSTGRESQL_QUERY_DONE(query_string);
 
@@ -1346,7 +1351,10 @@ exec_parse_message(const char *query_string,	/* string to execute */
 	set_ps_display("PARSE");
 
 	if (save_log_statement_stats)
+	{
 		ResetUsage();
+		pgstat_set_report_waits();
+	}
 
 	ereport(DEBUG2,
 			(errmsg_internal("parse %s: %s",
@@ -1567,7 +1575,10 @@ exec_parse_message(const char *query_string,	/* string to execute */
 	}
 
 	if (save_log_statement_stats)
+	{
 		ShowUsage("PARSE MESSAGE STATISTICS");
+		pgstat_reset_report_waits();
+	}
 
 	debug_query_string = NULL;
 }
@@ -1637,7 +1648,10 @@ exec_bind_message(StringInfo input_message)
 	set_ps_display("BIND");
 
 	if (save_log_statement_stats)
+	{
 		ResetUsage();
+		pgstat_set_report_waits();
+	}
 
 	/*
 	 * Start up a transaction command so we can call functions etc. (Note that
@@ -2029,7 +2043,10 @@ exec_bind_message(StringInfo input_message)
 	}
 
 	if (save_log_statement_stats)
+	{
 		ShowUsage("BIND MESSAGE STATISTICS");
+		pgstat_reset_report_waits();
+	}
 
 	debug_query_string = NULL;
 }
@@ -2122,7 +2139,10 @@ exec_execute_message(const char *portal_name, long max_rows)
 	set_ps_display(GetCommandTagName(portal->commandTag));
 
 	if (save_log_statement_stats)
+	{
 		ResetUsage();
+		pgstat_set_report_waits();
+	}
 
 	BeginCommand(portal->commandTag, dest);
 
@@ -2269,7 +2289,10 @@ exec_execute_message(const char *portal_name, long max_rows)
 	}
 
 	if (save_log_statement_stats)
+	{
 		ShowUsage("EXECUTE MESSAGE STATISTICS");
+		pgstat_reset_report_waits();
+	}
 
 	debug_query_string = NULL;
 }
@@ -3914,7 +3937,6 @@ process_postgres_switches(int argc, char *argv[], GucContext ctx,
 #endif
 }
 
-
 /* ----------------------------------------------------------------
  * PostgresMain
  *	   postgres main loop -- all backends, interactive or otherwise start here
@@ -4795,10 +4817,13 @@ void
 ShowUsage(const char *title)
 {
 	StringInfoData str;
+	WaitAccumEntry *entry = wa_events.events;
 	struct timeval user,
 				sys;
 	struct timeval elapse_t;
 	struct rusage r;
+	bool		save_log_statement_stats = log_statement_stats;
+	int		i;
 
 	getrusage(RUSAGE_SELF, &r);
 	gettimeofday(&elapse_t, NULL);
@@ -4879,6 +4904,25 @@ ShowUsage(const char *title)
 					 r.ru_nvcsw, r.ru_nivcsw);
 #endif							/* HAVE_GETRUSAGE */
 
+	if (save_log_statement_stats)
+	{
+		for (i = 0; i < wa_events.num_events; i++, entry++)
+		{
+			const char *wait_event_type = NULL;
+			const char *wait_event = NULL;
+
+			if (entry->calls < 1)
+				continue;
+
+			wait_event_type = pgstat_get_wait_event_type(entry->wait_event_info);
+			wait_event = pgstat_get_wait_event(entry->wait_event_info);
+
+			appendStringInfo(&str, "!\t%s/%s %lu calls, %lu us elapsed\n",
+					 wait_event_type, wait_event, entry->calls,
+					 INSTR_TIME_GET_MICROSEC(entry->times));
+		}
+	}
+
 	/* remove trailing newline */
 	if (str.data[str.len - 1] == '\n')
 		str.data[--str.len] = '\0';
diff --git a/src/backend/utils/activity/wait_event.c b/src/backend/utils/activity/wait_event.c
index 6baf67740c..61dd125341 100644
--- a/src/backend/utils/activity/wait_event.c
+++ b/src/backend/utils/activity/wait_event.c
@@ -32,11 +32,127 @@ static const char *pgstat_get_wait_client(WaitEventClient w);
 static const char *pgstat_get_wait_ipc(WaitEventIPC w);
 static const char *pgstat_get_wait_timeout(WaitEventTimeout w);
 static const char *pgstat_get_wait_io(WaitEventIO w);
+static inline void accum_pgstat_report_wait_start();
+static inline void accum_pgstat_report_wait_end();
 
 
 static uint32 local_my_wait_event_info;
 uint32	   *my_wait_event_info = &local_my_wait_event_info;
 
+WaitEvents wa_events;
+instr_time waitEventStart;
+report_wait_start_function	my_pgstat_report_wait_start = standard_pgstat_report_wait_start;
+report_wait_end_function	my_pgstat_report_wait_end = standard_pgstat_report_wait_end;
+
+static WaitAccumEntry *
+pgstat_get_wait_entry(uint32 wait_event_info)
+{
+	uint32		classId;
+	uint16		eventId;
+	WaitAccumEntry *entry = NULL;
+
+	/* report process as not waiting. */
+	if (wait_event_info == 0)
+		return NULL;
+
+	classId = wait_event_info & 0xFF000000;
+	eventId = wait_event_info & 0x0000FFFF;
+
+	switch (classId)
+	{
+		case PG_WAIT_LWLOCK:
+			entry = wa_events.wa_lwlock + eventId;
+			break;
+		case PG_WAIT_LOCK:
+			entry = wa_events.wa_lock + eventId;
+			break;
+		case PG_WAIT_BUFFER_PIN:
+			entry = wa_events.wa_buffer_pin + eventId;
+			break;
+		case PG_WAIT_ACTIVITY:
+			entry = wa_events.wa_activity + eventId;
+			break;
+		case PG_WAIT_CLIENT:
+			entry = wa_events.wa_client + eventId;
+			break;
+		// FIXME
+		//  case PG_WAIT_EXTENSION:
+		//  	break;
+		case PG_WAIT_IPC:
+			entry = wa_events.wa_ipc + eventId;
+			break;
+		case PG_WAIT_TIMEOUT:
+			entry = wa_events.wa_timeout + eventId;
+			break;
+		case PG_WAIT_IO:
+			entry = wa_events.wa_io + eventId;
+			break;
+		default:
+			break;
+	}
+
+	return entry;
+}
+
+void
+pgstat_init_waitaccums(void)
+{
+	WaitAccumEntry *event;
+	int num_lwlocks;
+	int i;
+
+	num_lwlocks = LWLockGetLastTrancheId() +1;
+	wa_events.num_events = num_lwlocks
+		 + PG_WAIT_ACTIVITY_NUM
+		 + PG_WAIT_CLIENT_NUM
+		 + PG_WAIT_IPC_NUM
+		 + PG_WAIT_TIMEOUT_NUM
+		 + PG_WAIT_IO_NUM
+		 + LOCKTAG_NUM
+		 + PG_WAIT_BUFFER_PIN_NUM;
+
+	wa_events.events = MemoryContextAllocZero(TopMemoryContext,
+						  wa_events.num_events
+						  * sizeof(WaitAccumEntry));
+
+	event = wa_events.events;
+
+	// FIXME: support extensions
+
+	wa_events.wa_lwlock = event;
+	for (i = 0; i < num_lwlocks; i++, event++)
+		event->wait_event_info = PG_WAIT_LWLOCK|i;
+
+	wa_events.wa_lock = event;
+	for (i = 0; i < LOCKTAG_NUM; i++, event++)
+		event->wait_event_info = PG_WAIT_LOCK|i;
+
+	wa_events.wa_buffer_pin = event;
+	for (i = 0; i < PG_WAIT_BUFFER_PIN_NUM; i++, event++)
+		event->wait_event_info = PG_WAIT_BUFFER_PIN|i;
+
+	wa_events.wa_activity = event;
+	for (i = 0; i < PG_WAIT_ACTIVITY_NUM; i++, event++)
+		event->wait_event_info = PG_WAIT_ACTIVITY|i;
+
+	wa_events.wa_client = event;
+	for (i = 0; i < PG_WAIT_CLIENT_NUM; i++, event++)
+		event->wait_event_info = PG_WAIT_CLIENT|i;
+
+	wa_events.wa_ipc = event;
+	for (i = 0; i < PG_WAIT_IPC_NUM; i++, event++)
+		event->wait_event_info = PG_WAIT_IPC|i;
+
+	wa_events.wa_timeout = event;
+	for (i = 0; i < PG_WAIT_TIMEOUT_NUM; i++, event++)
+		event->wait_event_info = PG_WAIT_TIMEOUT|i;
+
+	wa_events.wa_io = event;
+	for (i = 0; i < PG_WAIT_IO_NUM; i++, event++)
+		event->wait_event_info = PG_WAIT_IO|i;
+
+	pgstat_reset_report_waits();
+}
 
 /*
  * Configure wait event reporting to report wait events to *wait_event_info.
@@ -732,3 +848,45 @@ pgstat_get_wait_io(WaitEventIO w)
 
 	return event_name;
 }
+
+static inline void
+accum_pgstat_report_wait_start(uint32 wait_event_info)
+{
+	standard_pgstat_report_wait_start(wait_event_info);
+
+	INSTR_TIME_SET_CURRENT(waitEventStart);
+}
+
+static inline void
+accum_pgstat_report_wait_end(void)
+{
+	WaitAccumEntry *entry;
+	instr_time  diff;
+
+	INSTR_TIME_SET_CURRENT(diff);
+	INSTR_TIME_SUBTRACT(diff, waitEventStart);
+
+	entry = pgstat_get_wait_entry(*my_wait_event_info);
+
+	if (entry)
+	{
+		entry->calls++;
+		INSTR_TIME_ADD(entry->times, diff);
+	}
+
+	standard_pgstat_report_wait_end();
+}
+
+void
+pgstat_set_report_waits(void)
+{
+	my_pgstat_report_wait_start = accum_pgstat_report_wait_start;
+	my_pgstat_report_wait_end = accum_pgstat_report_wait_end;
+}
+
+void
+pgstat_reset_report_waits(void)
+{
+	my_pgstat_report_wait_start = standard_pgstat_report_wait_start;
+	my_pgstat_report_wait_end = standard_pgstat_report_wait_end;
+}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 14056f5347..cf4db57bff 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -2380,3 +2380,77 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
 	/* Returns the record as Datum */
 	PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
 }
+
+Datum
+pg_stat_get_waitaccum(PG_FUNCTION_ARGS)
+{
+#define PG_STAT_GET_WAITACCUM_COLS	4
+	ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+	TupleDesc	tupdesc;
+	Tuplestorestate *tupstore;
+	MemoryContext per_query_ctx;
+	MemoryContext oldcontext;
+	WaitAccumEntry *entry = wa_events.events;
+	int i;
+
+	/* check to see if caller supports us returning a tuplestore */
+	if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("set-valued function called in context that cannot accept a set")));
+	if (!(rsinfo->allowedModes & SFRM_Materialize))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("materialize mode required, but it is not " \
+						"allowed in this context")));
+
+	/* Build a tuple descriptor for our result type */
+	if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+		elog(ERROR, "return type must be a row type");
+
+	per_query_ctx = rsinfo->econtext->ecxt_per_query_memory;
+	oldcontext = MemoryContextSwitchTo(per_query_ctx);
+
+	tupstore = tuplestore_begin_heap(true, false, work_mem);
+	rsinfo->returnMode = SFRM_Materialize;
+	rsinfo->setResult = tupstore;
+	rsinfo->setDesc = tupdesc;
+
+	MemoryContextSwitchTo(oldcontext);
+
+	for (i = 0; i < wa_events.num_events; i++, entry++)
+	{
+		Datum		values[PG_STAT_GET_WAITACCUM_COLS];
+		bool		nulls[PG_STAT_GET_WAITACCUM_COLS];
+		const char *wait_event_type = NULL;
+		const char *wait_event = NULL;
+
+		/* Initialise values and NULL flags arrays */
+		MemSet(values, 0, sizeof(values));
+		MemSet(nulls, 0, sizeof(nulls));
+
+		/* Fill values and NULLs */
+		{
+			uint32		raw_wait_event;
+
+			raw_wait_event = UINT32_ACCESS_ONCE(entry->wait_event_info);
+			wait_event_type = pgstat_get_wait_event_type(raw_wait_event);
+			wait_event = pgstat_get_wait_event(raw_wait_event);
+		}
+
+		values[0] = CStringGetTextDatum(wait_event_type);
+
+		values[1] = CStringGetTextDatum(wait_event);
+
+		values[2] = Int64GetDatum(entry->calls);
+
+		values[3] = UInt64GetDatum(INSTR_TIME_GET_MICROSEC(entry->times));
+
+		tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+	}
+
+	/* clean up and return the tuplestore */
+	tuplestore_donestoring(tupstore);
+
+	return (Datum) 0;
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index acbcae4607..b066863a97 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5284,6 +5284,15 @@
   proargmodes => '{i,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
   proargnames => '{pid,datid,pid,usesysid,application_name,state,query,wait_event_type,wait_event,xact_start,query_start,backend_start,state_change,client_addr,client_hostname,client_port,backend_xid,backend_xmin,backend_type,ssl,sslversion,sslcipher,sslbits,ssl_client_dn,ssl_client_serial,ssl_issuer_dn,gss_auth,gss_princ,gss_enc,leader_pid,query_id}',
   prosrc => 'pg_stat_get_activity' },
+{ oid => '8316',
+  descr => 'statistics: accumulated wait event stats',
+  proname => 'pg_stat_get_waitaccum', prorows => '200', proisstrict => 'f',
+  proretset => 't', provolatile => 'v', proparallel => 'r',
+  prorettype => 'record', proargtypes => '',
+  proallargtypes => '{text,text,int8,int8}',
+  proargmodes => '{o,o,o,o}',
+  proargnames => '{wait_event_type,wait_event,calls,times}',
+  prosrc => 'pg_stat_get_waitaccum' },
 { oid => '3318',
   descr => 'statistics: information about progress of backends running maintenance command',
   proname => 'pg_stat_get_progress_info', prorows => '100', proretset => 't',
diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index 9b2a421c32..885664fa54 100644
--- a/src/include/storage/lock.h
+++ b/src/include/storage/lock.h
@@ -152,6 +152,7 @@ typedef enum LockTagType
 } LockTagType;
 
 #define LOCKTAG_LAST_TYPE	LOCKTAG_ADVISORY
+#define	LOCKTAG_NUM 		LOCKTAG_LAST_TYPE
 
 extern const char *const LockTagTypeNames[];
 
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index a8f052e484..e6638b2c45 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -154,6 +154,7 @@ extern LWLockPadded *GetNamedLWLockTranche(const char *tranche_name);
  * registration in the main shared memory segment wouldn't work for that case.
  */
 extern int	LWLockNewTrancheId(void);
+extern int	LWLockGetLastTrancheId(void);
 extern void LWLockRegisterTranche(int tranche_id, const char *tranche_name);
 extern void LWLockInitialize(LWLock *lock, int tranche_id);
 
diff --git a/src/include/utils/wait_event.h b/src/include/utils/wait_event.h
index 6c6ec2e711..9da4b714e2 100644
--- a/src/include/utils/wait_event.h
+++ b/src/include/utils/wait_event.h
@@ -10,6 +10,8 @@
 #ifndef WAIT_EVENT_H
 #define WAIT_EVENT_H
 
+#include "utils/memutils.h"
+#include "portability/instr_time.h"
 
 /* ----------
  * Wait Classes
@@ -25,6 +27,40 @@
 #define PG_WAIT_TIMEOUT				0x09000000U
 #define PG_WAIT_IO					0x0A000000U
 
+/* ----------
+ * WaitAccumEntry	Entry in backend/background's per-wait_event_info hash table
+ * ----------
+ */
+typedef struct WaitAccumEntry
+{
+	uint32		wait_event_info;
+	uint64		calls;
+	instr_time	times;
+} WaitAccumEntry;
+
+typedef struct WaitEvents
+{
+	uint32		num_events;
+	WaitAccumEntry *events;
+	WaitAccumEntry *wa_lwlock;
+	WaitAccumEntry *wa_lock;
+	WaitAccumEntry *wa_buffer_pin;
+	WaitAccumEntry *wa_activity;
+	WaitAccumEntry *wa_client;
+	// FIXME WaitAccumEntry *wa_extension;
+	WaitAccumEntry *wa_ipc;
+	WaitAccumEntry *wa_timeout;
+	WaitAccumEntry *wa_io;
+} WaitEvents;
+
+extern WaitEvents wa_events;
+
+/*
+ * There's only one entry in PG_WAIT_BUFFER_PIN class:
+ * PG_WAIT_BUFFER_PIN itself
+ */
+#define PG_WAIT_BUFFER_PIN_NUM 1
+
 /* ----------
  * Wait Events - Activity
  *
@@ -50,6 +86,9 @@ typedef enum
 	WAIT_EVENT_WAL_WRITER_MAIN
 } WaitEventActivity;
 
+#define	PG_WAIT_ACTIVITY_LAST_TYPE	WAIT_EVENT_WAL_WRITER_MAIN
+#define	PG_WAIT_ACTIVITY_NUM ( WAIT_EVENT_WAL_WRITER_MAIN - PG_WAIT_ACTIVITY )
+
 /* ----------
  * Wait Events - Client
  *
@@ -70,6 +109,9 @@ typedef enum
 	WAIT_EVENT_WAL_SENDER_WRITE_DATA,
 } WaitEventClient;
 
+#define	PG_WAIT_CLIENT_LAST_TYPE	WAIT_EVENT_WAL_SENDER_WRITE_DATA
+#define	PG_WAIT_CLIENT_NUM ( PG_WAIT_CLIENT_LAST_TYPE - PG_WAIT_CLIENT )
+
 /* ----------
  * Wait Events - IPC
  *
@@ -128,6 +170,9 @@ typedef enum
 	WAIT_EVENT_XACT_GROUP_UPDATE
 } WaitEventIPC;
 
+#define	PG_WAIT_IPC_LAST_TYPE	WAIT_EVENT_XACT_GROUP_UPDATE
+#define	PG_WAIT_IPC_NUM ( PG_WAIT_IPC_LAST_TYPE - PG_WAIT_IPC )
+
 /* ----------
  * Wait Events - Timeout
  *
@@ -143,6 +188,9 @@ typedef enum
 	WAIT_EVENT_VACUUM_DELAY
 } WaitEventTimeout;
 
+#define	PG_WAIT_TIMEOUT_LAST_TYPE	WAIT_EVENT_VACUUM_DELAY
+#define	PG_WAIT_TIMEOUT_NUM ( PG_WAIT_TIMEOUT_LAST_TYPE - PG_WAIT_TIMEOUT )
+
 /* ----------
  * Wait Events - IO
  *
@@ -227,14 +275,30 @@ typedef enum
 	WAIT_EVENT_LOGICAL_SUBXACT_WRITE
 } WaitEventIO;
 
+#define	PG_WAIT_IO_LAST_TYPE	WAIT_EVENT_LOGICAL_SUBXACT_WRITE
+#define	PG_WAIT_IO_NUM ( PG_WAIT_IO_LAST_TYPE - PG_WAIT_IO )
+
+/* track start time of current wait event */
+extern instr_time waitEventStart;
+
+void pgstat_init_waitaccums(void);
 
 extern const char *pgstat_get_wait_event(uint32 wait_event_info);
 extern const char *pgstat_get_wait_event_type(uint32 wait_event_info);
 static inline void pgstat_report_wait_start(uint32 wait_event_info);
+static inline void standard_pgstat_report_wait_start(uint32 wait_event_info);
 static inline void pgstat_report_wait_end(void);
+static inline void standard_pgstat_report_wait_end(void);
+extern void pgstat_set_report_waits(void);
+extern void pgstat_reset_report_waits(void);
 extern void pgstat_set_wait_event_storage(uint32 *wait_event_info);
 extern void pgstat_reset_wait_event_storage(void);
 
+typedef void (*report_wait_start_function) (uint32 wait_event_info);
+extern PGDLLIMPORT report_wait_start_function my_pgstat_report_wait_start;
+typedef void (*report_wait_end_function) ();
+extern PGDLLIMPORT report_wait_end_function my_pgstat_report_wait_end;
+
 extern PGDLLIMPORT uint32 *my_wait_event_info;
 
 
@@ -259,6 +323,12 @@ extern PGDLLIMPORT uint32 *my_wait_event_info;
  */
 static inline void
 pgstat_report_wait_start(uint32 wait_event_info)
+{
+	(*my_pgstat_report_wait_start)(wait_event_info);
+}
+
+static inline void
+standard_pgstat_report_wait_start(uint32 wait_event_info)
 {
 	/*
 	 * Since this is a four-byte field which is always read and written as
@@ -275,6 +345,13 @@ pgstat_report_wait_start(uint32 wait_event_info)
  */
 static inline void
 pgstat_report_wait_end(void)
+{
+	(*my_pgstat_report_wait_end)();
+}
+
+
+static inline void
+standard_pgstat_report_wait_end(void)
 {
 	/* see pgstat_report_wait_start() */
 	*(volatile uint32 *) my_wait_event_info = 0;
-- 
2.20.1

Reply via email to