On Mon, Mar 7, 2016 at 12:41 PM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> Attached is the latest version of the optimisation patch.
> I'm still considering the pg_upgrade regression test code, so I
> will submit that patch later.

I just spent some time looking at this and I'm a bit worried about the
following (existing) comment in vacuumlazy.c:

     * Note: The value returned by visibilitymap_get_status could be slightly
     * out-of-date, since we make this test before reading the corresponding
     * heap page or locking the buffer.  This is OK.  If we mistakenly think
     * that the page is all-visible when in fact the flag's just been cleared,
     * we might fail to vacuum the page.  But it's OK to skip pages when
     * scan_all is not set, so no great harm done; the next vacuum will find
     * them.  If we make the reverse mistake and vacuum a page unnecessarily,
     * it'll just be a no-op.

The patch makes some attempt to update the comment mechanically, but
that's not nearly enough.  That comment is explaining that you *can't*
rely on the visibility map to tell you *for sure* that a page does not
require vacuuming.  For current uses, that's OK, because if we miss a
page we'll pick it up later.  But now that we want to skip vacuuming pages
for relfrozenxid/relminmxid advancement, that rationale doesn't apply.
Missing pages that need to be frozen and advancing relfrozenxid anyway
would be _bad_.

However, after some further thought, I think we might actually be OK.
If a page goes from all-frozen to not-all-frozen while VACUUM is
running, any new XID added to the page must be newer than the
oldestXmin value computed by vacuum_set_xid_limits(), so it won't
affect the value to which we can safely set relfrozenxid.  Similarly,
any MXID added to the page will be newer than GetOldestMultiXactId(),
so setting relminmxid is still safe for similar reasons.
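
To make that argument concrete, here is a toy model in Python. It is a
sketch of the reasoning only, not PostgreSQL code: the Page class, the
aggressive_vacuum() function and the concurrent_writes argument are all
hypothetical, XIDs are plain integers with no wraparound, and
relfrozenxid is assumed to be set to oldestXmin minus
vacuum_freeze_min_age.  The stale visibility map check is modelled by
reading the all-frozen bit before the concurrent writer touches the
page.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    # Unfrozen XIDs currently stored on the page (toy model, no wraparound).
    xids: list = field(default_factory=list)
    # The all-frozen bit as recorded in the visibility map.
    all_frozen: bool = False


def aggressive_vacuum(pages, oldest_xmin, freeze_min_age, concurrent_writes):
    """Freeze old XIDs and return the value relfrozenxid would be set to.

    concurrent_writes maps a page index to XIDs that a concurrent writer
    adds after VACUUM has already looked at that page's all-frozen bit.
    """
    freeze_limit = oldest_xmin - freeze_min_age
    for index, page in enumerate(pages):
        # Possibly stale test, made before reading or locking the page.
        skip = page.all_frozen

        # A concurrent writer clears the bit and adds new XIDs. Assumption
        # under review: such XIDs can never precede oldest_xmin.
        for xid in concurrent_writes.get(index, []):
            assert xid >= oldest_xmin
            page.xids.append(xid)
            page.all_frozen = False

        if skip:
            continue  # VACUUM never sees the new XIDs on this page

        # Freezing removes every XID older than the freeze limit.
        page.xids = [xid for xid in page.xids if xid >= freeze_limit]
    return freeze_limit


pages = [
    Page(xids=[], all_frozen=True),         # skipped, then modified
    Page(xids=[90, 950], all_frozen=False)  # scanned and partly frozen
]
new_relfrozenxid = aggressive_vacuum(
    pages, oldest_xmin=1000, freeze_min_age=50, concurrent_writes={0: [1003]}
)
# Invariant we care about: no unfrozen XID precedes the new relfrozenxid.
invariant_holds = all(
    xid >= new_relfrozenxid for page in pages for xid in page.xids
)
```

In this model the skipped page ends up holding an XID that VACUUM never
looked at, yet the invariant still holds because that XID cannot be
older than oldest_xmin.  Whether the real system always satisfies the
assertion inside the loop is exactly the part that needs review.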

I'd appreciate it if any other senior hackers could review that chain
of reasoning.  It would be really bad to get this wrong.

On another note, I didn't really like the way you updated the
documentation.  "eager freezing" doesn't seem like a great term to me,
and I think your changes were a little too localized.  Here's a draft
alternative where I used the term "aggressive vacuum" to describe
freezing all of the pages except for those already known to be
all-frozen.  Thoughts?
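
As a rough illustration of the terminology in the draft, the two
decisions it describes might be sketched like this in Python.  The
function names are hypothetical and this is not the vacuumlazy.c logic;
it ignores details such as MXID ages and any heuristics about runs of
skippable pages.  The 95% clamp and the meaning of a zero setting are
taken from the documentation text in the attached diff.

```python
def is_aggressive(relfrozenxid_age, vacuum_freeze_table_age,
                  autovacuum_freeze_max_age):
    """Return True if this VACUUM should be an aggressive one."""
    # The effective setting is silently limited to 95% of
    # autovacuum_freeze_max_age.
    effective = min(vacuum_freeze_table_age,
                    autovacuum_freeze_max_age * 95 // 100)
    # Aggressive once relfrozenxid has reached the (effective) age.
    return relfrozenxid_age >= effective


def can_skip_page(all_visible, all_frozen, aggressive):
    """Decide from the visibility map bits whether a page may be skipped."""
    if aggressive:
        # Only pages already known to be all-frozen may be skipped.
        return all_frozen
    # A regular vacuum may skip any page with no dead row versions.
    return all_visible
```

With a setting of 0, is_aggressive() is true for every table, which
matches the statement that setting vacuum_freeze_table_age to 0 forces
the aggressive strategy for all scans.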

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a09ceb2..2f72633 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5984,12 +5984,15 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
       </term>
       <listitem>
        <para>
-        <command>VACUUM</> performs a whole-table scan if the table's
+        <command>VACUUM</> performs an aggressive scan if the table's
         <structname>pg_class</>.<structfield>relfrozenxid</> field has reached
-        the age specified by this setting.  The default is 150 million
-        transactions.  Although users can set this value anywhere from zero to
-        two billions, <command>VACUUM</> will silently limit the effective value
-        to 95% of <xref linkend="guc-autovacuum-freeze-max-age">, so that a
+        the age specified by this setting.  An aggressive scan differs from
+        a regular <command>VACUUM</> in that it visits every page that might
+        contain unfrozen XIDs or MXIDs, not just those that might contain dead
+        tuples.  The default is 150 million transactions.  Although users can
+        set this value anywhere from zero to two billions, <command>VACUUM</>
+        will silently limit the effective value to 95% of
+        <xref linkend="guc-autovacuum-freeze-max-age">, so that a
         periodical manual <command>VACUUM</> has a chance to run before an
         anti-wraparound autovacuum is launched for the table. For more
         information see
@@ -6028,9 +6031,12 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
       </term>
       <listitem>
        <para>
-        <command>VACUUM</> performs a whole-table scan if the table's
+        <command>VACUUM</> performs an aggressive scan if the table's
         <structname>pg_class</>.<structfield>relminmxid</> field has reached
-        the age specified by this setting.  The default is 150 million multixacts.
+        the age specified by this setting.  An aggressive scan differs from
+        a regular <command>VACUUM</> in that it visits every page that might
+        contain unfrozen XIDs or MXIDs, not just those that might contain dead
+        tuples.  The default is 150 million multixacts.
         Although users can set this value anywhere from zero to two billions,
         <command>VACUUM</> will silently limit the effective value to 95% of
         <xref linkend="guc-autovacuum-multixact-freeze-max-age">, so that a
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 5204b34..d742ec9 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -438,22 +438,27 @@
    </para>
 
    <para>
-    <command>VACUUM</> normally skips pages that don't have any dead row
-    versions, but those pages might still have row versions with old XID
-    values.  To ensure all old row versions have been frozen, a
-    scan of the whole table is needed.
-    <xref linkend="guc-vacuum-freeze-table-age"> controls when
-    <command>VACUUM</> does that: a whole table sweep is forced if
-    the table hasn't been fully scanned for <varname>vacuum_freeze_table_age</>
-    minus <varname>vacuum_freeze_min_age</> transactions. Setting it to 0
-    forces <command>VACUUM</> to always scan all pages, effectively ignoring
-    the visibility map.
+    <command>VACUUM</> uses the <link linkend="storage-vm">visibility map</>
+    to determine which pages of a relation must be scanned.  Normally, it
+    will skip pages that don't have any dead row versions even if those pages
+    might still have row versions with old XID values.  Therefore, normal
+    scans won't succeed in freezing every row version in the table.
+    Periodically, <command>VACUUM</> will perform an <firstterm>aggressive
+    vacuum</>, skipping only those pages which contain neither dead rows nor
+    any unfrozen XID or MXID values.
+    <xref linkend="guc-vacuum-freeze-table-age">
+    controls when <command>VACUUM</> does that: all-visible but not all-frozen
+    pages are scanned if the number of transactions that have passed since the
+    last such scan is greater than <varname>vacuum_freeze_table_age</> minus
+    <varname>vacuum_freeze_min_age</>. Setting
+    <varname>vacuum_freeze_table_age</> to 0 forces <command>VACUUM</> to
+    use this more aggressive strategy for all scans.
    </para>
 
    <para>
     The maximum time that a table can go unvacuumed is two billion
     transactions minus the <varname>vacuum_freeze_min_age</> value at
-    the time <command>VACUUM</> last scanned the whole table.  If it were to go
+    the time of the last aggressive vacuum. If it were to go
     unvacuumed for longer than
     that, data loss could result.  To ensure that this does not happen,
     autovacuum is invoked on any table that might contain unfrozen rows with
@@ -491,7 +496,7 @@
     normal delete and update activity is run in that window.  Setting it too
     close could lead to anti-wraparound autovacuums, even though the table
     was recently vacuumed to reclaim space, whereas lower values lead to more
-    frequent whole-table scans.
+    frequent aggressive vacuuming.
    </para>
 
    <para>
@@ -527,7 +532,7 @@
     <structname>pg_database</>.  In particular,
     the <structfield>relfrozenxid</> column of a table's
     <structname>pg_class</> row contains the freeze cutoff XID that was used
-    by the last whole-table <command>VACUUM</> for that table.  All rows
+    by the last aggressive <command>VACUUM</> for that table.  All rows
     inserted by transactions with XIDs older than this cutoff XID are
     guaranteed to have been frozen.  Similarly,
     the <structfield>datfrozenxid</> column of a database's
@@ -554,18 +559,21 @@ SELECT datname, age(datfrozenxid) FROM pg_database;
    <para>
     <command>VACUUM</> normally
     only scans pages that have been modified since the last vacuum, but
-    <structfield>relfrozenxid</> can only be advanced when the whole table is
-    scanned. The whole table is scanned when <structfield>relfrozenxid</> is
-    more than <varname>vacuum_freeze_table_age</> transactions old, when
-    <command>VACUUM</>'s <literal>FREEZE</> option is used, or when all pages
-    happen to
+    <structfield>relfrozenxid</> can only be advanced when every page of the table
+    that might contain unfrozen XIDs is scanned.  This happens when
+    <structfield>relfrozenxid</> is more than
+    <varname>vacuum_freeze_table_age</> transactions old, when
+    <command>VACUUM</>'s <literal>FREEZE</> option is used, or when all
+    pages that are not already all-frozen happen to
     require vacuuming to remove dead row versions. When <command>VACUUM</>
-    scans the whole table, after it's finished <literal>age(relfrozenxid)</>
-    should be a little more than the <varname>vacuum_freeze_min_age</> setting
+    scans every page in the table that is not already all-frozen, it should
+    set <literal>age(relfrozenxid)</> to a value just a little more than the
+    <varname>vacuum_freeze_min_age</> setting
     that was used (more by the number of transactions started since the
-    <command>VACUUM</> started).  If no whole-table-scanning <command>VACUUM</>
-    is issued on the table until <varname>autovacuum_freeze_max_age</> is
-    reached, an autovacuum will soon be forced for the table.
+    <command>VACUUM</> started).  If no <structfield>relfrozenxid</>-advancing
+    <command>VACUUM</> is issued on the table until
+    <varname>autovacuum_freeze_max_age</> is reached, an autovacuum will soon
+    be forced for the table.
    </para>
 
    <para>
@@ -634,21 +642,23 @@ HINT:  Stop the postmaster and vacuum that database in single-user mode.
     </para>
 
     <para>
-     During a <command>VACUUM</> table scan, either partial or of the whole
-     table, any multixact ID older than
+     Whenever <command>VACUUM</> scans any part of a table, it will replace
+     any multixact ID it encounters which is older than
      <xref linkend="guc-vacuum-multixact-freeze-min-age">
-     is replaced by a different value, which can be the zero value, a single
+     by a different value, which can be the zero value, a single
      transaction ID, or a newer multixact ID.  For each table,
      <structname>pg_class</>.<structfield>relminmxid</> stores the oldest
      possible multixact ID still appearing in any tuple of that table.
      If this value is older than
-     <xref linkend="guc-vacuum-multixact-freeze-table-age">, a whole-table
-     scan is forced.  <function>mxid_age()</> can be used on
+     <xref linkend="guc-vacuum-multixact-freeze-table-age">, an aggressive
+     vacuum is forced.  As discussed in the previous section, an aggressive
+     vacuum means that only those pages which are known to be all-frozen will
+     be skipped.  <function>mxid_age()</> can be used on
      <structname>pg_class</>.<structfield>relminmxid</> to find its age.
     </para>
 
     <para>
-     Whole-table <command>VACUUM</> scans, regardless of
+     Aggressive <command>VACUUM</> scans, regardless of
      what causes them, enable advancing the value for that table.
      Eventually, as all tables in all databases are scanned and their
      oldest multixact values are advanced, on-disk storage for older
@@ -656,13 +666,13 @@ HINT:  Stop the postmaster and vacuum that database in single-user mode.
     </para>
 
     <para>
-     As a safety device, a whole-table vacuum scan will occur for any table
+     As a safety device, an aggressive vacuum scan will occur for any table
      whose multixact-age is greater than
-     <xref linkend="guc-autovacuum-multixact-freeze-max-age">.  Whole-table
+     <xref linkend="guc-autovacuum-multixact-freeze-max-age">.  Aggressive
      vacuum scans will also occur progressively for all tables, starting with
      those that have the oldest multixact-age, if the amount of used member
      storage space exceeds the amount 50% of the addressable storage space.
-     Both of these kinds of whole-table scans will occur even if autovacuum is
+     Both of these kinds of aggressive scans will occur even if autovacuum is
      nominally disabled.
     </para>
    </sect3>
@@ -743,9 +753,9 @@ vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuple
     <command>UPDATE</command> and <command>DELETE</command> operation.  (It
     is only semi-accurate because some information might be lost under heavy
     load.)  If the <structfield>relfrozenxid</> value of the table is more
-    than <varname>vacuum_freeze_table_age</> transactions old, the whole
-    table is scanned to freeze old tuples and advance
-    <structfield>relfrozenxid</>, otherwise only pages that have been modified
+    than <varname>vacuum_freeze_table_age</> transactions old, an aggressive
+    vacuum is performed to freeze old tuples and advance
+    <structfield>relfrozenxid</>; otherwise, only pages that have been modified
     since the last vacuum are scanned.
    </para>
 