Horiguchi-san,

Thanks a lot for taking a look!

On 2016/03/07 13:02, Kyotaro HORIGUCHI wrote:
> At Sat, 5 Mar 2016 16:41:29 +0900, Amit Langote wrote:
>> On Sat, Mar 5, 2016 at 4:24 PM, Amit Langote <amitlangot...@gmail.com> wrote:
>>> So, I took the Vinayak's latest patch and rewrote it a little
>> ...
>>> I broke it into two:
>>>
>>> 0001-Provide-a-way-for-utility-commands-to-report-progres.patch
>>> 0002-Implement-progress-reporting-for-VACUUM-command.patch
>>
>> Oops, unamended commit messages in those patches are misleading.  So,
>> please find attached corrected versions.
> 
> The 0001-P.. adds the following interface functions.
> 
> +extern void pgstat_progress_set_command(BackendCommandType cmdtype);
> +extern void pgstat_progress_set_command_target(Oid objid);
> +extern void pgstat_progress_update_param(int index, uint32 val);
> +extern void pgstat_reset_local_progress(void);
> +extern int   pgstat_progress_get_num_param(BackendCommandType cmdtype);
> 
> I don't like to treat the target object id differently from other
> parameters. It could not be needed at all, or could be needed two
> or more in contrast. Although oids are not guaranteed to fit
> uint32, we have already stored BlockNumber there.

I thought giving cmdtype and objid each its own slot would make things a
little bit clearer than stuffing them into st_progress_param[0] and
st_progress_param[1], respectively.  Is that what you are suggesting?
Although as I've don, a separate field st_command_objid may be a bit too much.

If they are not special fields, I think we don't need special interface
functions *set_command() and *set_command_target().  But I am still
inclined toward keeping the former.

> # I think that integer arrays might be needed to be passed as a
> # parameter, but it would be the another issue.

Didn't really think about it.  Maybe we should consider a scenario that
would require it.

> pg_stat_get_progress_info returns a tuple with 10 integer columns
> (plus an object id). The reason why I suggested use of an integer
> array is that it allows the API to serve arbitrary number of
> parmeters without a modification of API, and array indexes are
> coloreless than any concrete names. Howerver I don't stick to
> that if we agree that it is ok to have fixed number of paremters.

I think the fixed number of parameters in the form of a fixed-size array
is because st_progress_param[] is part of a shared memory structure as
discussed before.  Although such interface has been roughly modeled on how
pg_statistic catalog and pg_stats view or get_attstatsslot() function
work, shared memory structures take the place of the catalog, so there are
some restrictions (fixed size array being one).

Regarding index into st_progress_param[], pgstat.c/pgstatfuncs.c should
not bother what it is.  As exemplified in patch 0002, individual index
numbers can be defined as macros by individual command modules (suggested
by Robert recently) with certain convention for readability such as the
following in lazyvacuum.c:

#define PROG_PAR_VAC_RELID                     0
#define PROG_PAR_VAC_PHASE_ID                  1
#define PROG_PAR_VAC_HEAP_BLKS                 2
#define PROG_PAR_VAC_CUR_HEAP_BLK              3
... so on.

Then, to report a changed parameter:

pgstat_progress_update_param(PROG_PAR_VAC_PHASE_ID, LV_PHASE_SCAN_HEAP);
...
pgstat_progress_update_param(PROG_PAR_VAC_CUR_HEAP_BLK, blkno);

by the way, following is proargnames[] for pg_stat_get_progress_info():

cmdtype integer,
OUT pid integer,
OUT param1 integer,
OUT param2 integer,
...
OUT param10 integer

So, it is a responsibility of a command specific progress view definition
that it interprets values of param1..param10 appropriately.  In fact, the
implementer of the progress reporting for a command determines what goes
into which slot of st_progress_param[], to begin with.

> pgstat_progress_get_num_param looks not good in the aspect of
> genericity. I'd like to define it as an integer array by idexed
> by the command type if it is needed. However it seems to me to be
> enough that pg_stat_get_progress_info always returns 10 integers
> regardless of what the numbers are for. The user sql function,
> pg_stat_vacuum_progress as the first user, knows how many numbers
> should be read for its work. It reads zeroes safely even if it
> reads more than what the producer side offered (unless it tries
> to divide something with it).

Thinking a bit, perhaps we don't need num_param(cmdtpye) function or array
at all as you seem to suggest.  It serves no useful purpose now that I see
it. pg_stat_get_progress_info() should simply copy
st_progress_param[0...PG_STAT_GET_PROGRESS_COLS-1] to the result and view
definer knows what's what.

Attached updated patches which incorporate above mentioned changes.  If
Vinayak has something else in mind about anything, he can weigh in.

Thanks,
Amit
>From 52a398f5104cd50f8bfcc4fd1fbb5bb102eddbf5 Mon Sep 17 00:00:00 2001
From: Amit <amitlangot...@gmail.com>
Date: Sun, 28 Feb 2016 01:50:07 +0900
Subject: [PATCH 1/2] Provide a way for utility commands to report progress

Commands can update values in shared memory using:

  pgstat_progress_update_param(param_index, param_value)

Up to 10 independent unsigned integer values can be published by commands.
In addition to those, a command should always report its BackendCommandType
at the beginning of the processing using:

  pgstat_progress_set_command(cmdtype)

A view can be defined in system_views.sql that outputs the values returned
by pg_stat_get_progress_info(cmdtype), where 'cmdtype' is numeric value as
mentioned above.  Each such view has columns corresponding to the counters
published by respective commands.

There is a SQL-callable function pg_stat_reset_local_progress() which
when called, resets the progress information of the backend of the session
in which its called.  It is useful to erase progress info of commands
previously run in the session.
---
 doc/src/sgml/monitoring.sgml        |    8 +++
 src/backend/postmaster/pgstat.c     |   60 ++++++++++++++++++++++
 src/backend/utils/adt/pgstatfuncs.c |   93 +++++++++++++++++++++++++++++++++++
 src/include/catalog/pg_proc.h       |    4 ++
 src/include/pgstat.h                |   23 +++++++++
 5 files changed, 188 insertions(+), 0 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 85459d0..45d9ed7 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1935,6 +1935,14 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
        zero (requires superuser privileges)
       </entry>
      </row>
+
+     <row>
+      <entry><literal><function>pg_stat_reset_local_progress</function>()</literal><indexterm><primary>pg_stat_reset_local_progress</primary></indexterm></entry>
+      <entry><type>void</type></entry>
+      <entry>
+       Reset command progress parameters of local backend
+      </entry>
+     </row>
     </tbody>
    </tgroup>
   </table>
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index da768c6..5e238cc 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -2731,6 +2731,8 @@ pgstat_bestart(void)
 	beentry->st_clienthostname[NAMEDATALEN - 1] = '\0';
 	beentry->st_appname[NAMEDATALEN - 1] = '\0';
 	beentry->st_activity[pgstat_track_activity_query_size - 1] = '\0';
+	beentry->st_command = COMMAND_INVALID;
+	MemSet(&beentry->st_progress_param, 0, sizeof(beentry->st_progress_param));
 
 	pgstat_increment_changecount_after(beentry);
 
@@ -2851,6 +2853,64 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
 	pgstat_increment_changecount_after(beentry);
 }
 
+/*-----------
+ * pgstat_progress_update_param() -
+ *
+ * Update index'th member in st_progress_param[] of own backend entry.
+ *-----------
+ */
+void
+pgstat_progress_update_param(int index, uint32 val)
+{
+	volatile PgBackendStatus *beentry = MyBEEntry;
+
+	if(!beentry)
+		return;
+
+	if (!pgstat_track_activities)
+		return;
+
+	pgstat_increment_changecount_before(beentry);
+	beentry->st_progress_param[index] = val;
+	pgstat_increment_changecount_after(beentry);
+}
+
+/*-----------
+ * pgstat_progress_start_command()-
+ *
+ * Set st_command in own backend entry.  Also, zero-initialize
+ * st_progress_param array.
+ *-----------
+ */
+void
+pgstat_progress_start_command(BackendCommandType cmdtype)
+{
+	volatile PgBackendStatus *beentry = MyBEEntry;
+
+	pgstat_increment_changecount_before(beentry);
+	beentry->st_command = cmdtype;
+	MemSet(&beentry->st_progress_param, 0,
+		   sizeof(beentry->st_progress_param));
+	pgstat_increment_changecount_after(beentry);
+}
+
+/*--------
+ * pgstat_reset_local_progress()-
+ *
+ * Reset local backend's progress parameters. Setting st_command to
+ * COMMAND_INVALID will do.
+ *--------
+ */
+void
+pgstat_reset_local_progress()
+{
+	PgBackendStatus *beentry = MyBEEntry;
+
+	pgstat_increment_changecount_before(beentry);
+	beentry->st_command = COMMAND_INVALID;
+	pgstat_increment_changecount_after(beentry);
+}
+
 /* ----------
  * pgstat_report_appname() -
  *
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 1b22fcc..a73ae8d 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -64,6 +64,8 @@ extern Datum pg_stat_get_backend_xact_start(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_backend_start(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_backend_client_addr(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_backend_client_port(PG_FUNCTION_ARGS);
+extern Datum pg_stat_get_progress_info(PG_FUNCTION_ARGS);
+extern Datum pg_stat_reset_local_progress(PG_FUNCTION_ARGS);
 
 extern Datum pg_stat_get_db_numbackends(PG_FUNCTION_ARGS);
 extern Datum pg_stat_get_db_xact_commit(PG_FUNCTION_ARGS);
@@ -524,6 +526,97 @@ pg_stat_get_backend_idset(PG_FUNCTION_ARGS)
 	}
 }
 
+#define PG_STAT_GET_PROGRESS_COLS	N_PROGRESS_PARAM + 1
+
+/*
+ * Returns progress parameter values of backends running a given command
+ */
+Datum
+pg_stat_get_progress_info(PG_FUNCTION_ARGS)
+{
+	int			num_backends = pgstat_fetch_stat_numbackends();
+	int			curr_backend;
+	int32		cmdtype = PG_GETARG_INT32(0);
+	TupleDesc	tupdesc;
+	Tuplestorestate *tupstore;
+	ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+	MemoryContext per_query_ctx;
+	MemoryContext oldcontext;
+
+	/* check to see if caller supports us returning a tuplestore */
+	if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("set-valued function called in context that cannot accept a set")));
+	if (!(rsinfo->allowedModes & SFRM_Materialize))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("materialize mode required, but it is not " \
+						"allowed in this context")));
+
+	/* Build a tuple descriptor for our result type */
+	if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+		elog(ERROR, "return type must be a row type");
+
+	per_query_ctx = rsinfo->econtext->ecxt_per_query_memory;
+	oldcontext = MemoryContextSwitchTo(per_query_ctx);
+
+	tupstore = tuplestore_begin_heap(true, false, work_mem);
+	rsinfo->returnMode = SFRM_Materialize;
+	rsinfo->setResult = tupstore;
+	rsinfo->setDesc = tupdesc;
+	MemoryContextSwitchTo(oldcontext);
+
+	for (curr_backend = 1; curr_backend <= num_backends; curr_backend++)
+	{
+		LocalPgBackendStatus   *local_beentry;
+		PgBackendStatus		   *beentry;
+		Datum		values[PG_STAT_GET_PROGRESS_COLS];
+		bool		nulls[PG_STAT_GET_PROGRESS_COLS];
+		int			i;
+
+		MemSet(values, 0, sizeof(values));
+		MemSet(nulls, 0, sizeof(nulls));
+
+		local_beentry = pgstat_fetch_stat_local_beentry(curr_backend);
+
+		if (!local_beentry)
+			continue;
+
+		beentry = &local_beentry->backendStatus;
+
+		/*
+		 * Report values for only those backends which are running the given
+		 * command.  XXX - privilege check is maybe dubious.
+		 */
+		if (!beentry ||
+			beentry->st_command != cmdtype ||
+			!has_privs_of_role(GetUserId(), beentry->st_userid))
+			continue;
+
+		/* Values available to all callers */
+		values[0] = Int32GetDatum(beentry->st_procpid);
+		for(i = 0; i < N_PROGRESS_PARAM; i++)
+			values[i+1] = UInt32GetDatum(beentry->st_progress_param[i]);
+
+		tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+	}
+
+	/* clean up and return the tuplestore */
+	tuplestore_donestoring(tupstore);
+
+	return (Datum) 0;
+}
+
+/* Reset local backend's command progress info */
+Datum
+pg_stat_reset_local_progress(PG_FUNCTION_ARGS)
+{
+	pgstat_reset_local_progress();
+
+	PG_RETURN_VOID();
+}
+
 /*
  * Returns activity of PG backends.
  */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index cbbb883..67dc31d 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2710,6 +2710,8 @@ DATA(insert OID = 1936 (  pg_stat_get_backend_idset		PGNSP PGUID 12 1 100 0 0 f
 DESCR("statistics: currently active backend IDs");
 DATA(insert OID = 2022 (  pg_stat_get_activity			PGNSP PGUID 12 1 100 0 0 f f f f f t s r 1 0 2249 "23" "{23,26,23,26,25,25,25,16,1184,1184,1184,1184,869,25,23,28,28,16,25,25,23,16,25}" "{i,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}" "{pid,datid,pid,usesysid,application_name,state,query,waiting,xact_start,query_start,backend_start,state_change,client_addr,client_hostname,client_port,backend_xid,backend_xmin,ssl,sslversion,sslcipher,sslbits,sslcompression,sslclientdn}" _null_ _null_ pg_stat_get_activity _null_ _null_ _null_ ));
 DESCR("statistics: information about currently active backends");
+DATA(insert OID = 3318 (  pg_stat_get_progress_info           PGNSP PGUID 12 1 100 0 0 f f f f f t s r 1 0 2249 "23" "{23,23,23,23,23,23,23,23,23,23,23,23}" "{i,o,o,o,o,o,o,o,o,o,o,o}" "{cmdtype,pid,param1,param2,param3,param4,param5,param6,param7,param8,param9,param10}" _null_ _null_ pg_stat_get_progress_info _null_ _null_ _null_ ));
+DESCR("statistics: information about progress of backends running maintenance command");
 DATA(insert OID = 3099 (  pg_stat_get_wal_senders	PGNSP PGUID 12 1 10 0 0 f f f f f t s r 0 0 2249 "" "{23,25,3220,3220,3220,3220,23,25}" "{o,o,o,o,o,o,o,o}" "{pid,state,sent_location,write_location,flush_location,replay_location,sync_priority,sync_state}" _null_ _null_ pg_stat_get_wal_senders _null_ _null_ _null_ ));
 DESCR("statistics: information about currently active replication");
 DATA(insert OID = 3317 (  pg_stat_get_wal_receiver	PGNSP PGUID 12 1 0 0 0 f f f f f f s r 0 0 2249 "" "{23,25,3220,23,3220,23,1184,1184,3220,1184,25}" "{o,o,o,o,o,o,o,o,o,o,o}" "{pid,status,receive_start_lsn,receive_start_tli,received_lsn,received_tli,last_msg_send_time,last_msg_receipt_time,latest_end_lsn,latest_end_time,slot_name}" _null_ _null_ pg_stat_get_wal_receiver _null_ _null_ _null_ ));
@@ -2849,6 +2851,8 @@ DATA(insert OID = 3776 (  pg_stat_reset_single_table_counters	PGNSP PGUID 12 1 0
 DESCR("statistics: reset collected statistics for a single table or index in the current database");
 DATA(insert OID = 3777 (  pg_stat_reset_single_function_counters	PGNSP PGUID 12 1 0 0 0 f f f f t f v s 1 0 2278 "26" _null_ _null_ _null_ _null_ _null_	pg_stat_reset_single_function_counters _null_ _null_ _null_ ));
 DESCR("statistics: reset collected statistics for a single function in the current database");
+DATA(insert OID = 3319 (  pg_stat_reset_local_progress			PGNSP PGUID 12 1 0 0 0 f f f f f f v s 0 0 2278 "" _null_ _null_ _null_ _null_ _null_ pg_stat_reset_local_progress _null_ _null_ _null_ ));
+DESCR("statistics: reset progress information of the local backend");
 
 DATA(insert OID = 3163 (  pg_trigger_depth				PGNSP PGUID 12 1 0 0 0 f f f f t f s s 0 0 23 "" _null_ _null_ _null_ _null_ _null_ pg_trigger_depth _null_ _null_ _null_ ));
 DESCR("current trigger depth");
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 65e968e..17fae7d 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -696,6 +696,17 @@ typedef enum BackendState
 } BackendState;
 
 /* ----------
+ * Command type for progress reporting purposes
+ * ----------
+ */
+typedef enum BackendCommandType
+{
+	COMMAND_INVALID = 0,
+} BackendCommandType;
+
+#define N_PROGRESS_PARAM	10
+
+/* ----------
  * Shared-memory data structures
  * ----------
  */
@@ -776,6 +787,15 @@ typedef struct PgBackendStatus
 
 	/* current command string; MUST be null-terminated */
 	char	   *st_activity;
+
+	/*
+	 * Progress parameters of currently running utility command in the
+	 * backend.  Different commands store different number of up to
+	 * N_PROGRESS_PARAM values in st_progress_param.  However, each command
+	 * must set st_command at the beginning of command processing.
+	 */
+	BackendCommandType	st_command;
+	uint32				st_progress_param[N_PROGRESS_PARAM];
 } PgBackendStatus;
 
 /*
@@ -935,6 +955,9 @@ extern void pgstat_report_waiting(bool waiting);
 extern const char *pgstat_get_backend_current_activity(int pid, bool checkUser);
 extern const char *pgstat_get_crashed_backend_activity(int pid, char *buffer,
 									int buflen);
+extern void pgstat_progress_start_command(BackendCommandType cmdtype);
+extern void pgstat_progress_update_param(int index, uint32 val);
+extern void pgstat_reset_local_progress(void);
 
 extern PgStat_TableStatus *find_tabstat_entry(Oid rel_id);
 extern PgStat_BackendFunctionEntry *find_funcstat_entry(Oid func_id);
-- 
1.7.1

>From a3d80139d89019820455aea196ade7c814317ece Mon Sep 17 00:00:00 2001
From: amit <amitlangot...@gmail.com>
Date: Mon, 7 Mar 2016 14:38:34 +0900
Subject: [PATCH 2/2] Implement progress reporting for VACUUM command.

This basically utilizes the pgstat_progress* API to report a handful of
paramters to indicate its progress.  lazy_vacuum_rel() and lazy_scan_heap()
have been altered to report command start, command target table, and the
following parameters: processing phase, number of heap blocks, number of
index blocks (all indexes), current block number in the main scan loop
(whenever changes), index blocks vacuumed (once per finished index vacuum),
and number of index vacuum passes (every time when all indexes are vacuumed).
Following processing phases are identified and reported whenever it changes:
'scanning heap', 'vacuuming indexes', 'vacuuming heap', 'cleanup', 'done'.
The last value is really a misnomer but maybe clearer when someone is staring
at the progress view being polled.

A view named pg_stat_vacuum_progress has been added that shows these values.
---
 doc/src/sgml/monitoring.sgml         |  114 ++++++++++++++++++++++++++++++++++
 src/backend/catalog/system_views.sql |   23 +++++++
 src/backend/commands/vacuumlazy.c    |   82 ++++++++++++++++++++++++-
 src/include/pgstat.h                 |    1 +
 src/test/regress/expected/rules.out  |   20 ++++++
 5 files changed, 239 insertions(+), 1 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 45d9ed7..e4361ad 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -507,6 +507,13 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
       yet included in <structname>pg_stat_user_functions</>).</entry>
      </row>
 
+     <row>
+      <entry><structname>pg_stat_vacuum_progress</><indexterm><primary>pg_stat_vacuum_progress</primary></indexterm></entry>
+      <entry>One row for each backend (including autovacuum worker processes) running
+      <command>VACUUM</>, showing current progress in terms of heap pages it
+      has finished processing.  Note that the backends running
+      <command>VACUUM FULL</> are not included.</entry>
+     </row>
     </tbody>
    </tgroup>
   </table>
@@ -1822,6 +1829,113 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
    controls exactly which functions are tracked.
   </para>
 
+  <table id="pg-stat-vacuum-progress" xreflabel="pg_stat_vacuum_progress">
+   <title><structname>pg_stat_vacuum_progress</structname> View</title>
+   <tgroup cols="3">
+    <thead>
+    <row>
+      <entry>Column</entry>
+      <entry>Type</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+   <tbody>
+    <row>
+     <entry><structfield>pid</></entry>
+     <entry><type>integer</></entry>
+     <entry>Process ID of backend</entry>
+    </row>
+    <row>
+     <entry><structfield>relid</></entry>
+     <entry><type>oid</></entry>
+     <entry>OID of a table</entry>
+    </row>
+    <row>
+     <entry><structfield>processing_phase</></entry>
+     <entry><type>integer</></entry>
+     <entry>Current phase of vacuum.
+       Possible values are:
+       <itemizedlist>
+        <listitem>
+         <para>
+          <literal>scanning heap</>
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <literal>vacuuming indexes</>
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <literal>vacuuming heap</>
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <literal>cleanup</>
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <literal>done</>
+         </para>
+        </listitem>
+       </itemizedlist>
+     </entry>
+    </row>
+    <row>
+     <entry><structfield>total_heap_blks</></entry>
+     <entry><type>integer</></entry>
+     <entry>Total number of heap blocks in the table</entry>
+    </row>
+    <row>
+     <entry><structfield>current_heap_blkno</></entry>
+     <entry><type>integer</></entry>
+     <entry>Current heap block being processed</entry>
+    </row>
+    <row>
+     <entry><structfield>total_index_blks</></entry>
+     <entry><type>integer</></entry>
+     <entry>Total number of index blocks to be processed</entry>
+    </row>
+    <row>
+     <entry><structfield>index_blks_done</></entry>
+     <entry><type>integer</></entry>
+     <entry>Number of index blocks processed</entry>
+    </row>
+    <row>
+     <entry><structfield>index_scan_count</></entry>
+     <entry><type>integer</></entry>
+     <entry>Number of times index scans has been performed so far</entry>
+    </row>
+    <row>
+     <entry><structfield>percent_done</></entry>
+     <entry><type>numeric</></entry>
+     <entry>
+      Amount of work finished in percent in terms of table blocks processed
+     </entry>
+    </row>
+   </tbody>
+   </tgroup>
+  </table>
+
+  <para>
+   The <structname>pg_stat_vacuum_progress</structname> view will contain
+   one row for each backend (including autovacuum worker processes), showing
+   progress of <command>VACUUM</> running in it. Note that the backends
+   running <command>VACUUM FULL</> are not shown.
+  </para>
+
+  <para>
+   When interpreting the value of the <structfield>percent_done</> column, also
+   note the value of <structfield>processing_phase</>.  It's possible for the
+   former to be <literal>100.00</literal>, while the <command>VACUUM</> still
+   has not returned.  In that case, wait for the latter to turn to the value
+   <literal>done</literal>.
+  </para>
+
  </sect2>
 
  <sect2 id="monitoring-stats-functions">
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index abf9a70..156d379 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -971,3 +971,26 @@ RETURNS jsonb
 LANGUAGE INTERNAL
 STRICT IMMUTABLE
 AS 'jsonb_set';
+
+CREATE VIEW pg_stat_vacuum_progress AS
+    SELECT
+           S.pid AS pid,
+           S.param1 AS relid,
+           CASE S.param2
+               WHEN 1 THEN 'Scanning Heap'
+               WHEN 2 THEN 'Vacuuming Index'
+               WHEN 3 THEN 'Vacuuming Heap'
+               WHEN 4 THEN 'Cleanup'
+               WHEN 5 THEN 'Done'
+               ELSE 'Unknown'
+           END AS processing_phase,
+           S.param3 AS total_heap_blks,
+           S.param4 AS current_heap_blkno,
+           S.param5 AS total_index_blks,
+           S.param6 AS index_blks_done,
+           S.param7 AS index_scan_count,
+           CASE S.param3
+			  WHEN 0 THEN 100::numeric(5, 2)
+			  ELSE ((S.param4 + 1)::numeric / S.param3 * 100)::numeric(5, 2)
+		   END AS percent_done
+   FROM pg_stat_get_progress_info(1) AS S;
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 8f7b248..73ccd53 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -97,6 +97,29 @@
  */
 #define SKIP_PAGES_THRESHOLD	((BlockNumber) 32)
 
+/*
+ * Follwing progress parameters for lazy vacuum are reported to pgstat
+ */
+#define PROG_PAR_VAC_RELID				0
+#define PROG_PAR_VAC_PHASE_ID			1
+#define PROG_PAR_VAC_HEAP_BLKS			2
+#define PROG_PAR_VAC_CUR_HEAP_BLK		3
+#define PROG_PAR_VAC_IDX_BLKS			4
+#define PROG_PAR_VAC_IDX_BLKS_DONE		5
+#define PROG_PAR_VAC_N_IDX_SCAN			6
+
+/*
+ * Following distinct phases of lazy vacuum are identified.  Although #1, #2
+ * and #3 run in a cyclical manner due to possibly limited memory to work
+ * with, wherein #1 is periodically interrupted to run #2 followed by #3
+ * and back, until all the blocks of the relations have been covered by #1.
+ */
+#define LV_PHASE_SCAN_HEAP			1
+#define LV_PHASE_VACUUM_INDEX		2
+#define LV_PHASE_VACUUM_HEAP		3
+#define LV_PHASE_CLEANUP			4
+#define LV_PHASE_DONE				5
+
 typedef struct LVRelStats
 {
 	/* hasindex = true means two-pass strategy; false means one-pass */
@@ -195,6 +218,10 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 
 	Assert(params != NULL);
 
+	/* initialize pgstat progress info */
+	pgstat_progress_start_command(COMMAND_LAZY_VACUUM);
+	pgstat_progress_update_param(PROG_PAR_VAC_RELID, RelationGetRelid(onerel));
+
 	/* measure elapsed time iff autovacuum logging requires it */
 	if (IsAutoVacuumWorkerProcess() && params->log_min_duration >= 0)
 	{
@@ -270,6 +297,9 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 	/* Vacuum the Free Space Map */
 	FreeSpaceMapVacuum(onerel);
 
+	/* We're done doing any heavy handling, so report */
+	pgstat_progress_update_param(PROG_PAR_VAC_PHASE_ID, LV_PHASE_DONE);
+
 	/*
 	 * Update statistics in pg_class.
 	 *
@@ -433,7 +463,9 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool scan_all)
 {
 	BlockNumber nblocks,
-				blkno;
+				blkno,
+				total_index_blks,
+			   *current_index_blks;
 	HeapTupleData tuple;
 	char	   *relname;
 	BlockNumber empty_pages,
@@ -474,6 +506,24 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
 	lazy_space_alloc(vacrelstats, nblocks);
 	frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
 
+	/* about to begin heap scan */
+	pgstat_progress_update_param(PROG_PAR_VAC_PHASE_ID, LV_PHASE_SCAN_HEAP);
+
+	/* total_heap_blks */
+	pgstat_progress_update_param(PROG_PAR_VAC_HEAP_BLKS, nblocks);
+
+	/* total_index_blks */
+	current_index_blks = (BlockNumber *) palloc(nindexes * sizeof(BlockNumber));
+	total_index_blks = 0;
+	for (i = 0; i < nindexes; i++)
+	{
+		BlockNumber		nblocks = RelationGetNumberOfBlocks(Irel[i]);
+
+		current_index_blks[i] = nblocks;
+		total_index_blks += nblocks;
+	}
+	pgstat_progress_update_param(PROG_PAR_VAC_IDX_BLKS, total_index_blks);
+
 	/*
 	 * We want to skip pages that don't require vacuuming according to the
 	 * visibility map, but only when we can skip at least SKIP_PAGES_THRESHOLD
@@ -581,6 +631,9 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
 
 		vacuum_delay_point();
 
+		/* current_heap_blkno: 0..nblocks-1 */
+		pgstat_progress_update_param(PROG_PAR_VAC_CUR_HEAP_BLK, blkno);
+
 		/*
 		 * If we are close to overrunning the available space for dead-tuple
 		 * TIDs, pause and do a cycle of vacuuming before we tackle this page.
@@ -604,11 +657,22 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
 			vacuum_log_cleanup_info(onerel, vacrelstats);
 
 			/* Remove index entries */
+			pgstat_progress_update_param(PROG_PAR_VAC_PHASE_ID, LV_PHASE_VACUUM_INDEX);
 			for (i = 0; i < nindexes; i++)
+			{
 				lazy_vacuum_index(Irel[i],
 								  &indstats[i],
 								  vacrelstats);
+
+				pgstat_progress_update_param(PROG_PAR_VAC_IDX_BLKS_DONE,
+											 current_index_blks[i]);
+			}
+
+			pgstat_progress_update_param(PROG_PAR_VAC_N_IDX_SCAN,
+										 vacrelstats->num_index_scans+1);
+
 			/* Remove tuples from heap */
+			pgstat_progress_update_param(PROG_PAR_VAC_PHASE_ID, LV_PHASE_VACUUM_HEAP);
 			lazy_vacuum_heap(onerel, vacrelstats);
 
 			/*
@@ -618,6 +682,10 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
 			 */
 			vacrelstats->num_dead_tuples = 0;
 			vacrelstats->num_index_scans++;
+
+			/* go back to scanning the heap */
+			pgstat_progress_update_param(PROG_PAR_VAC_PHASE_ID,
+										 LV_PHASE_SCAN_HEAP);
 		}
 
 		/*
@@ -1153,17 +1221,29 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
 		/* Log cleanup info before we touch indexes */
 		vacuum_log_cleanup_info(onerel, vacrelstats);
 
+		pgstat_progress_update_param(PROG_PAR_VAC_PHASE_ID, LV_PHASE_VACUUM_INDEX);
+
 		/* Remove index entries */
 		for (i = 0; i < nindexes; i++)
+		{
 			lazy_vacuum_index(Irel[i],
 							  &indstats[i],
 							  vacrelstats);
+
+			pgstat_progress_update_param(PROG_PAR_VAC_IDX_BLKS_DONE,
+										 current_index_blks[i]);
+		}
+		pgstat_progress_update_param(PROG_PAR_VAC_N_IDX_SCAN,
+									 vacrelstats->num_index_scans + 1);
+
 		/* Remove tuples from heap */
+		pgstat_progress_update_param(PROG_PAR_VAC_PHASE_ID, LV_PHASE_VACUUM_HEAP);
 		lazy_vacuum_heap(onerel, vacrelstats);
 		vacrelstats->num_index_scans++;
 	}
 
 	/* Do post-vacuum cleanup and statistics update for each index */
+	pgstat_progress_update_param(PROG_PAR_VAC_PHASE_ID, LV_PHASE_CLEANUP);
 	for (i = 0; i < nindexes; i++)
 		lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);
 
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 17fae7d..eca451c 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -702,6 +702,7 @@ typedef enum BackendState
 typedef enum BackendCommandType
 {
 	COMMAND_INVALID = 0,
+	COMMAND_LAZY_VACUUM
 } BackendCommandType;
 
 #define N_PROGRESS_PARAM	10
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 81bc5c9..1d14272 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1851,6 +1851,26 @@ pg_stat_user_tables| SELECT pg_stat_all_tables.relid,
     pg_stat_all_tables.autoanalyze_count
    FROM pg_stat_all_tables
   WHERE ((pg_stat_all_tables.schemaname <> ALL (ARRAY['pg_catalog'::name, 'information_schema'::name])) AND (pg_stat_all_tables.schemaname !~ '^pg_toast'::text));
+pg_stat_vacuum_progress| SELECT s.pid,
+    s.param1 AS relid,
+        CASE s.param2
+            WHEN 1 THEN 'Scanning Heap'::text
+            WHEN 2 THEN 'Vacuuming Index'::text
+            WHEN 3 THEN 'Vacuuming Heap'::text
+            WHEN 4 THEN 'Cleanup'::text
+            WHEN 5 THEN 'Done'::text
+            ELSE 'Unknown'::text
+        END AS processing_phase,
+    s.param3 AS total_heap_blks,
+    s.param4 AS current_heap_blkno,
+    s.param5 AS total_index_blks,
+    s.param6 AS index_blks_done,
+    s.param7 AS index_scan_count,
+        CASE s.param2
+            WHEN 0 THEN 0.0
+            ELSE (((((s.param3 + 1))::numeric / (s.param2)::numeric) * (100)::numeric))::numeric(5,2)
+        END AS percent_done
+   FROM pg_stat_get_progress_info(1) s(pid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10);
 pg_stat_wal_receiver| SELECT s.pid,
     s.status,
     s.receive_start_lsn,
-- 
1.7.1

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to