[ Changing subject line in the hopes of keeping separate issues in
separate threads. ]

On Mon, Jun 6, 2016 at 11:00 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Robert Haas <robertmh...@gmail.com> writes:
>> I'm intuitively sympathetic to the idea that we should have an option
>> for this, but I can't figure out in what case we'd actually tell
>> anyone to use it.  It would be useful for the kinds of bugs listed
>> above to have VACUUM (rebuild_vm) to blow away the VM fork and rebuild
>> it, but that's different semantics than what we proposed for VACUUM
>> (even_frozen_pages).  And I'd be sort of inclined to handle that case
>> by providing some other way to remove VM forks (like a new function in
>> the pg_visibilitymap contrib module, maybe?) and then just tell people
>> to run regular VACUUM afterwards, rather than putting the actual VM
>> fork removal into VACUUM.
>
> There's a lot to be said for that approach.  If we do it, I'd be a bit
> inclined to offer an option to blow away the FSM as well.

After having been scared out of my mind by the discovery of
longstanding breakage in heap_update[1], it occurred to me that this
is an excellent example of a case in which the option for which Andres
was agitating - specifically forcing VACUUM to visit ever single page
in the heap - would be useful.  Even if there's functionality
available to blow away the visibility map forks, you might not want to
just go remove them all and re-vacuum everything, because while you
were rebuilding the VM forks, index-only scans would stop being
index-only, and that might cause an operational problem.  Being able
to simply force every page to be visited is better.  Patch for that
attached.  I decided to call the option disable_page_skipping rather
than even_frozen_pages because it should also force visiting pages
that are all-visible but not all-frozen.  (I was sorely tempted to go
with the competing proposal of calling it VACUUM (THEWHOLEDAMNTHING)
but I refrained.)

However, I also think that the approach described above - providing a
way to excise VM forks specifically - is useful.  Patch for that
attached, too.  It turns out that can't be done without either adding
a new WAL record type or extending an existing one; I chose to add a
"flags" argument to XLOG_SMGR_TRUNCATE, specifying the forks to be
truncated.  Since this will require bumping the XLOG version, if we're
going to do it, and I think we should, it would be good to try to get
it done before beta2 rather than closer to release.  However, I don't
want to commit this without some review, so please review this.
Actually, please review both patches.[2]  The same core support could
be used to add an option to truncate the FSM, but I didn't write a
patch for that because I'm incredibly { busy | lazy }.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] 
https://www.postgresql.org/message-id/ca+tgmoz-1txcsxwcp3i-kmo5dnb-6p-odqw7c-n_q3pzzy1...@mail.gmail.com
[2] This request is directed at the list generally, not at any
specific individual ... well, OK, I mean you.  Yeah, you!
From 010e99b403ec733d50c71a7d4ef646b1b446ef07 Mon Sep 17 00:00:00 2001
From: Robert Haas <rh...@postgresql.org>
Date: Wed, 15 Jun 2016 22:52:58 -0400
Subject: [PATCH 2/2] Add VACUUM (DISABLE_PAGE_SKIPPING) for emergencies.

---
 doc/src/sgml/ref/vacuum.sgml         | 21 ++++++++-
 src/backend/commands/vacuum.c        |  9 ++++
 src/backend/commands/vacuumlazy.c    | 87 ++++++++++++++++++++----------------
 src/backend/parser/gram.y            | 10 +++++
 src/include/nodes/parsenodes.h       |  3 +-
 src/test/regress/expected/vacuum.out |  1 +
 src/test/regress/sql/vacuum.sql      |  2 +
 7 files changed, 92 insertions(+), 41 deletions(-)

diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 19fd748..dee1c5b 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  <refsynopsisdiv>
 <synopsis>
-VACUUM [ ( { FULL | FREEZE | VERBOSE | ANALYZE } [, ...] ) ] [ <replaceable class="PARAMETER">table_name</replaceable> [ (<replaceable class="PARAMETER">column_name</replaceable> [, ...] ) ] ]
+VACUUM [ ( { FULL | FREEZE | VERBOSE | ANALYZE | DISABLE_PAGE_SKIPPING } [, ...] ) ] [ <replaceable class="PARAMETER">table_name</replaceable> [ (<replaceable class="PARAMETER">column_name</replaceable> [, ...] ) ] ]
 VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ <replaceable class="PARAMETER">table_name</replaceable> ]
 VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] ANALYZE [ <replaceable class="PARAMETER">table_name</replaceable> [ (<replaceable class="PARAMETER">column_name</replaceable> [, ...] ) ] ]
 </synopsis>
@@ -130,6 +130,25 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] ANALYZE [ <replaceable class="PARAMETER">
    </varlistentry>
 
    <varlistentry>
+    <term><literal>DISABLE_PAGE_SKIPPING</literal></term>
+    <listitem>
+     <para>
+      Normally, <command>VACUUM</> will skip pages based on the <link
+      linkend="vacuum-for-visibility-map">visibility map</>.  Pages where
+      all tuples are known to be frozen can always be skipped, and those
+      where all tuples are known to be visible to all transactions may be
+      skipped except when performing an aggressive vacuum.  Furthermore,
+      except when performing an aggressive vacuum, some pages may be skipped
+      in order to avoid waiting for other sessions to finish using them.
+      This option disables all page-skipping behavior, and is intended to
+      be used only the contents of the visibility map are thought to
+      be suspect, which should happen only if there is a hardware or software
+      issue causing database corruption.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="PARAMETER">table_name</replaceable></term>
     <listitem>
      <para>
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 25a55ab..0563e63 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -186,6 +186,15 @@ vacuum(int options, RangeVar *relation, Oid relid, VacuumParams *params,
 						stmttype)));
 
 	/*
+	 * Sanity check DISABLE_PAGE_SKIPPING option.
+	 */
+	if ((options & VACOPT_FULL) != 0 &&
+		(options & VACOPT_DISABLE_PAGE_SKIPPING) != 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("VACUUM option DISABLE_PAGE_SKIPPING cannot be used with FULL")));
+
+	/*
 	 * Send info about dead objects to the statistics collector, unless we are
 	 * in autovacuum --- autovacuum.c does this for itself.
 	 */
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index cb5777f..7a67fa5 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -137,8 +137,9 @@ static BufferAccessStrategy vac_strategy;
 
 
 /* non-export function prototypes */
-static void lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
-			   Relation *Irel, int nindexes, bool aggressive);
+static void lazy_scan_heap(Relation onerel, int options,
+			   LVRelStats *vacrelstats, Relation *Irel, int nindexes,
+			   bool aggressive);
 static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
 static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
 static void lazy_vacuum_index(Relation indrel,
@@ -223,15 +224,17 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 						  &MultiXactCutoff, &mxactFullScanLimit);
 
 	/*
-	 * We request an aggressive scan if either the table's frozen Xid is now
+	 * We request an aggressive scan if the table's frozen Xid is now
 	 * older than or equal to the requested Xid full-table scan limit; or if
 	 * the table's minimum MultiXactId is older than or equal to the requested
-	 * mxid full-table scan limit.
+	 * mxid full-table scan limit; or if DISABLE_PAGE_SKIPPING was specified.
 	 */
 	aggressive = TransactionIdPrecedesOrEquals(onerel->rd_rel->relfrozenxid,
 											   xidFullScanLimit);
 	aggressive |= MultiXactIdPrecedesOrEquals(onerel->rd_rel->relminmxid,
 											  mxactFullScanLimit);
+	if (options & VACOPT_DISABLE_PAGE_SKIPPING)
+		aggressive = true;
 
 	vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats));
 
@@ -246,7 +249,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 	vacrelstats->hasindex = (nindexes > 0);
 
 	/* Do the vacuuming */
-	lazy_scan_heap(onerel, vacrelstats, Irel, nindexes, aggressive);
+	lazy_scan_heap(onerel, options, vacrelstats, Irel, nindexes, aggressive);
 
 	/* Done with indexes */
 	vac_close_indexes(nindexes, Irel, NoLock);
@@ -441,7 +444,7 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
  *		reference them have been killed.
  */
 static void
-lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
+lazy_scan_heap(Relation onerel, int options, LVRelStats *vacrelstats,
 			   Relation *Irel, int nindexes, bool aggressive)
 {
 	BlockNumber nblocks,
@@ -542,25 +545,28 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
 	 * the last page.  This is worth avoiding mainly because such a lock must
 	 * be replayed on any hot standby, where it can be disruptive.
 	 */
-	for (next_unskippable_block = 0;
-		 next_unskippable_block < nblocks;
-		 next_unskippable_block++)
+	next_unskippable_block = 0;
+	if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 	{
-		uint8		vmstatus;
-
-		vmstatus = visibilitymap_get_status(onerel, next_unskippable_block,
-											&vmbuffer);
-		if (aggressive)
+		while (next_unskippable_block < nblocks)
 		{
-			if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0)
-				break;
-		}
-		else
-		{
-			if ((vmstatus & VISIBILITYMAP_ALL_VISIBLE) == 0)
-				break;
+			uint8		vmstatus;
+
+			vmstatus = visibilitymap_get_status(onerel, next_unskippable_block,
+												&vmbuffer);
+			if (aggressive)
+			{
+				if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0)
+					break;
+			}
+			else
+			{
+				if ((vmstatus & VISIBILITYMAP_ALL_VISIBLE) == 0)
+					break;
+			}
+			vacuum_delay_point();
+			next_unskippable_block++;
 		}
-		vacuum_delay_point();
 	}
 
 	if (next_unskippable_block >= SKIP_PAGES_THRESHOLD)
@@ -594,26 +600,29 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
 		if (blkno == next_unskippable_block)
 		{
 			/* Time to advance next_unskippable_block */
-			for (next_unskippable_block++;
-				 next_unskippable_block < nblocks;
-				 next_unskippable_block++)
+			next_unskippable_block++;
+			if ((options & VACOPT_DISABLE_PAGE_SKIPPING) == 0)
 			{
-				uint8		vmskipflags;
-
-				vmskipflags = visibilitymap_get_status(onerel,
-													   next_unskippable_block,
-													   &vmbuffer);
-				if (aggressive)
+				while (next_unskippable_block < nblocks)
 				{
-					if ((vmskipflags & VISIBILITYMAP_ALL_FROZEN) == 0)
-						break;
-				}
-				else
-				{
-					if ((vmskipflags & VISIBILITYMAP_ALL_VISIBLE) == 0)
-						break;
+					uint8		vmskipflags;
+
+					vmskipflags = visibilitymap_get_status(onerel,
+													  next_unskippable_block,
+														   &vmbuffer);
+					if (aggressive)
+					{
+						if ((vmskipflags & VISIBILITYMAP_ALL_FROZEN) == 0)
+							break;
+					}
+					else
+					{
+						if ((vmskipflags & VISIBILITYMAP_ALL_VISIBLE) == 0)
+							break;
+					}
+					vacuum_delay_point();
+					next_unskippable_block++;
 				}
-				vacuum_delay_point();
 			}
 
 			/*
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 2c950f9..edf4516 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9370,6 +9370,16 @@ vacuum_option_elem:
 			| VERBOSE			{ $$ = VACOPT_VERBOSE; }
 			| FREEZE			{ $$ = VACOPT_FREEZE; }
 			| FULL				{ $$ = VACOPT_FULL; }
+			| IDENT
+				{
+					if (strcmp($1, "disable_page_skipping") == 0)
+						$$ = VACOPT_DISABLE_PAGE_SKIPPING;
+					else
+						ereport(ERROR,
+								(errcode(ERRCODE_SYNTAX_ERROR),
+							 errmsg("unrecognized VACUUM option \"%s\"", $1),
+									 parser_errposition(@1)));
+				}
 		;
 
 AnalyzeStmt:
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 714cf15..d36d9c6 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -2822,7 +2822,8 @@ typedef enum VacuumOption
 	VACOPT_FREEZE = 1 << 3,		/* FREEZE option */
 	VACOPT_FULL = 1 << 4,		/* FULL (non-concurrent) vacuum */
 	VACOPT_NOWAIT = 1 << 5,		/* don't wait to get lock (autovacuum only) */
-	VACOPT_SKIPTOAST = 1 << 6	/* don't process the TOAST table, if any */
+	VACOPT_SKIPTOAST = 1 << 6,	/* don't process the TOAST table, if any */
+	VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7	/* don't skip any pages */
 } VacuumOption;
 
 typedef struct VacuumStmt
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index d2d7503..9b604be 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -79,5 +79,6 @@ ERROR:  ANALYZE cannot be executed from VACUUM or ANALYZE
 CONTEXT:  SQL function "do_analyze" statement 1
 SQL function "wrap_do_analyze" statement 1
 VACUUM FULL vactst;
+VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
 DROP TABLE vaccluster;
 DROP TABLE vactst;
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index f841201..7b819f6 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -60,5 +60,7 @@ VACUUM FULL pg_database;
 VACUUM FULL vaccluster;
 VACUUM FULL vactst;
 
+VACUUM (DISABLE_PAGE_SKIPPING) vaccluster;
+
 DROP TABLE vaccluster;
 DROP TABLE vactst;
-- 
2.5.4 (Apple Git-61)

From e2b6404c9a9ecb22eec41d010ce786cc27d73fc7 Mon Sep 17 00:00:00 2001
From: Robert Haas <rh...@postgresql.org>
Date: Wed, 15 Jun 2016 11:33:27 -0400
Subject: [PATCH] pg_visibility: Add pg_truncate_visibility_map function.

This requires some core changes as well so that we can properly
WAL-log the truncation.
---
 contrib/pg_visibility/pg_visibility--1.0--1.1.sql |  6 +++
 contrib/pg_visibility/pg_visibility--1.1.sql      |  8 ++++
 contrib/pg_visibility/pg_visibility.c             | 53 +++++++++++++++++++++++
 doc/src/sgml/pgvisibility.sgml                    | 30 ++++++++++---
 src/backend/access/rmgrdesc/smgrdesc.c            |  3 +-
 src/backend/catalog/storage.c                     | 16 ++++---
 src/include/catalog/storage_xlog.h                |  8 ++++
 7 files changed, 112 insertions(+), 12 deletions(-)

diff --git a/contrib/pg_visibility/pg_visibility--1.0--1.1.sql b/contrib/pg_visibility/pg_visibility--1.0--1.1.sql
index 2c97dfd..9336b48 100644
--- a/contrib/pg_visibility/pg_visibility--1.0--1.1.sql
+++ b/contrib/pg_visibility/pg_visibility--1.0--1.1.sql
@@ -13,5 +13,11 @@ RETURNS SETOF tid
 AS 'MODULE_PATHNAME', 'pg_check_visible'
 LANGUAGE C STRICT;
 
+CREATE FUNCTION pg_truncate_visibility_map(regclass)
+RETURNS void
+AS 'MODULE_PATHNAME', 'pg_truncate_visibility_map'
+LANGUAGE C STRICT
+PARALLEL UNSAFE;  -- let's not make this any more dangerous
+
 REVOKE ALL ON FUNCTION pg_check_frozen(regclass) FROM PUBLIC;
 REVOKE ALL ON FUNCTION pg_check_visible(regclass) FROM PUBLIC;
diff --git a/contrib/pg_visibility/pg_visibility--1.1.sql b/contrib/pg_visibility/pg_visibility--1.1.sql
index b49b644..0a29967 100644
--- a/contrib/pg_visibility/pg_visibility--1.1.sql
+++ b/contrib/pg_visibility/pg_visibility--1.1.sql
@@ -57,6 +57,13 @@ RETURNS SETOF tid
 AS 'MODULE_PATHNAME', 'pg_check_visible'
 LANGUAGE C STRICT;
 
+-- Truncate the visibility map fork.
+CREATE FUNCTION pg_truncate_visibility_map(regclass)
+RETURNS void
+AS 'MODULE_PATHNAME', 'pg_truncate_visibility_map'
+LANGUAGE C STRICT
+PARALLEL UNSAFE;  -- let's not make this any more dangerous
+
 -- Don't want these to be available to public.
 REVOKE ALL ON FUNCTION pg_visibility_map(regclass, bigint) FROM PUBLIC;
 REVOKE ALL ON FUNCTION pg_visibility(regclass, bigint) FROM PUBLIC;
@@ -65,3 +72,4 @@ REVOKE ALL ON FUNCTION pg_visibility(regclass) FROM PUBLIC;
 REVOKE ALL ON FUNCTION pg_visibility_map_summary(regclass) FROM PUBLIC;
 REVOKE ALL ON FUNCTION pg_check_frozen(regclass) FROM PUBLIC;
 REVOKE ALL ON FUNCTION pg_check_visible(regclass) FROM PUBLIC;
+REVOKE ALL ON FUNCTION pg_truncate_visibility_map(regclass) FROM PUBLIC;
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index abb92f3..eb3b0ca 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -11,10 +11,12 @@
 #include "access/htup_details.h"
 #include "access/visibilitymap.h"
 #include "catalog/pg_type.h"
+#include "catalog/storage_xlog.h"
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/procarray.h"
+#include "storage/smgr.h"
 #include "utils/rel.h"
 
 PG_MODULE_MAGIC;
@@ -40,6 +42,7 @@ PG_FUNCTION_INFO_V1(pg_visibility_rel);
 PG_FUNCTION_INFO_V1(pg_visibility_map_summary);
 PG_FUNCTION_INFO_V1(pg_check_frozen);
 PG_FUNCTION_INFO_V1(pg_check_visible);
+PG_FUNCTION_INFO_V1(pg_truncate_visibility_map);
 
 static TupleDesc pg_visibility_tupdesc(bool include_blkno, bool include_pd);
 static vbits *collect_visibility_data(Oid relid, bool include_pd);
@@ -336,6 +339,56 @@ pg_check_visible(PG_FUNCTION_ARGS)
 }
 
 /*
+ * Remove the visibility map fork for a relation.  If there turn out to be
+ * any bugs in the visibility map code that require rebuilding the VM, this
+ * provides users with a way to do it that is cleaner than shutting down the
+ * server and removing files by hand.
+ *
+ * This is a cut-down version of RelationTruncate.
+ */
+Datum
+pg_truncate_visibility_map(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	Relation	rel;
+
+	rel = relation_open(relid, AccessExclusiveLock);
+
+	if (rel->rd_rel->relkind != RELKIND_RELATION &&
+		rel->rd_rel->relkind != RELKIND_MATVIEW &&
+		rel->rd_rel->relkind != RELKIND_TOASTVALUE)
+		ereport(ERROR,
+				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+		   errmsg("\"%s\" is not a table, materialized view, or TOAST table",
+				  RelationGetRelationName(rel))));
+
+	RelationOpenSmgr(rel);
+	rel->rd_smgr->smgr_vm_nblocks = InvalidBlockNumber;
+
+	visibilitymap_truncate(rel, 0);
+
+	if (RelationNeedsWAL(rel))
+	{
+		xl_smgr_truncate	xlrec;
+
+		xlrec.blkno = 0;
+		xlrec.rnode = rel->rd_node;
+		xlrec.flags = SMGR_TRUNCATE_VM;
+
+		XLogBeginInsert();
+		XLogRegisterData((char *) &xlrec, sizeof(xlrec));
+
+		XLogInsert(RM_SMGR_ID, XLOG_SMGR_TRUNCATE | XLR_SPECIAL_REL_UPDATE);
+	}
+
+	/* Don't keep the relation locked any longer than necessary! */
+	relation_close(rel, AccessExclusiveLock);
+
+	/* Nothing to return. */
+	PG_RETURN_VOID();
+}
+
+/*
  * Helper function to construct whichever TupleDesc we need for a particular
  * call.
  */
diff --git a/doc/src/sgml/pgvisibility.sgml b/doc/src/sgml/pgvisibility.sgml
index 4cdca7d..44e83de 100644
--- a/doc/src/sgml/pgvisibility.sgml
+++ b/doc/src/sgml/pgvisibility.sgml
@@ -9,14 +9,16 @@
 
  <para>
   The <filename>pg_visibility</> module provides a means for examining the
-  visibility map (VM) and page-level visibility information.
+  visibility map (VM) and page-level visibility information.  It also
+  provides functions to check the integrity of the visibility map and to
+  force it to be rebuilt.
  </para>
 
  <para>
-  These routines return information about three different bits.  The
-  all-visible bit in the visibility map indicates that every tuple on
-  a given page of a relation is visible to every current transaction.  The
-  all-frozen bit in the visibility map indicates that every tuple on the
+  Three different bits are used to store information about page-level
+  visibility.  The all-visible bit in the visibility map indicates that every
+  tuple on a given page of a relation is visible to every current transaction.
+  The all-frozen bit in the visibility map indicates that every tuple on the
   page is frozen; that is, no future vacuum will need to modify the page
   until such time as a tuple is inserted, updated, deleted, or locked on
   that page.  The page-level <literal>PD_ALL_VISIBLE</literal> bit has the
@@ -25,7 +27,8 @@
   will normally agree, but the page-level bit can sometimes be set while the
   visibility map bit is clear after a crash recovery; or they can disagree
   because of a change which occurs after <literal>pg_visibility</> examines
-  the visibility map and before it examines the data page.
+  the visibility map and before it examines the data page.  Any event which
+  causes data corruption can also cause these bits to disagree.
  </para>
 
  <para>
@@ -118,6 +121,21 @@
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term><function>pg_truncate_visibility_map(regclass) returns void</function></term>
+
+    <listitem>
+     <para>
+      Truncates the visibility map for the given relation.  This function
+      is only expected to be useful if you suspect that the visibility map
+      for the indicated relation is corrupt and wish to rebuild it.  The first
+      <command>VACUUM</> executed on the given relation after this function
+      is executed will scan every page in the relation and rebuild the
+      visibility map.
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
 
   <para>
diff --git a/src/backend/access/rmgrdesc/smgrdesc.c b/src/backend/access/rmgrdesc/smgrdesc.c
index 0c6e583..242d79a 100644
--- a/src/backend/access/rmgrdesc/smgrdesc.c
+++ b/src/backend/access/rmgrdesc/smgrdesc.c
@@ -37,7 +37,8 @@ smgr_desc(StringInfo buf, XLogReaderState *record)
 		xl_smgr_truncate *xlrec = (xl_smgr_truncate *) rec;
 		char	   *path = relpathperm(xlrec->rnode, MAIN_FORKNUM);
 
-		appendStringInfo(buf, "%s to %u blocks", path, xlrec->blkno);
+		appendStringInfo(buf, "%s to %u blocks flags %d", path,
+						 xlrec->blkno, xlrec->flags);
 		pfree(path);
 	}
 }
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 67f1906..0d8311c 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -268,6 +268,7 @@ RelationTruncate(Relation rel, BlockNumber nblocks)
 
 		xlrec.blkno = nblocks;
 		xlrec.rnode = rel->rd_node;
+		xlrec.flags = SMGR_TRUNCATE_ALL;
 
 		XLogBeginInsert();
 		XLogRegisterData((char *) &xlrec, sizeof(xlrec));
@@ -522,17 +523,22 @@ smgr_redo(XLogReaderState *record)
 		 */
 		XLogFlush(lsn);
 
-		smgrtruncate(reln, MAIN_FORKNUM, xlrec->blkno);
+		if ((xlrec->flags & SMGR_TRUNCATE_HEAP) != 0)
+		{
+			smgrtruncate(reln, MAIN_FORKNUM, xlrec->blkno);
 
-		/* Also tell xlogutils.c about it */
-		XLogTruncateRelation(xlrec->rnode, MAIN_FORKNUM, xlrec->blkno);
+			/* Also tell xlogutils.c about it */
+			XLogTruncateRelation(xlrec->rnode, MAIN_FORKNUM, xlrec->blkno);
+		}
 
 		/* Truncate FSM and VM too */
 		rel = CreateFakeRelcacheEntry(xlrec->rnode);
 
-		if (smgrexists(reln, FSM_FORKNUM))
+		if ((xlrec->flags & SMGR_TRUNCATE_FSM) != 0 &&
+			smgrexists(reln, FSM_FORKNUM))
 			FreeSpaceMapTruncateRel(rel, xlrec->blkno);
-		if (smgrexists(reln, VISIBILITYMAP_FORKNUM))
+		if ((xlrec->flags & SMGR_TRUNCATE_VM) != 0 &&
+			smgrexists(reln, VISIBILITYMAP_FORKNUM))
 			visibilitymap_truncate(rel, xlrec->blkno);
 
 		FreeFakeRelcacheEntry(rel);
diff --git a/src/include/catalog/storage_xlog.h b/src/include/catalog/storage_xlog.h
index 7207e8b..500e663 100644
--- a/src/include/catalog/storage_xlog.h
+++ b/src/include/catalog/storage_xlog.h
@@ -36,10 +36,18 @@ typedef struct xl_smgr_create
 	ForkNumber	forkNum;
 } xl_smgr_create;
 
+/* flags for xl_smgr_truncate */
+#define SMGR_TRUNCATE_HEAP		0x0001
+#define SMGR_TRUNCATE_VM		0x0002
+#define SMGR_TRUNCATE_FSM		0x0004
+#define SMGR_TRUNCATE_ALL		\
+	(SMGR_TRUNCATE_HEAP|SMGR_TRUNCATE_VM|SMGR_TRUNCATE_FSM)
+
 typedef struct xl_smgr_truncate
 {
 	BlockNumber blkno;
 	RelFileNode rnode;
+	int			flags;
 } xl_smgr_truncate;
 
 extern void log_smgrcreate(RelFileNode *rnode, ForkNumber forkNum);
-- 
2.5.4 (Apple Git-61)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to