[HACKERS] Vacuum rate limit in KBps

Greg Smith Sun, 15 Jan 2012 00:25:04 -0800

So far the reaction I've gotten from my recent submission to make autovacuum log its read/write in MB/s has been rather positive. I've been surprised at the unprecedented (to me at least) amount of backporting onto big production systems it's gotten. There is a whole lot of pent up frustration among larger installs over not having good visibility into how changing cost-based vacuum parameters turns into real-world units.

That got me thinking: if MB/s is what everyone wants to monitor, can we provide a UI to set these parameters that way too? The attached patch is a bit rough still, but it does that. The key was recognizing that the cost delay plus cost limit can be converted into an upper limit on cost units per second, presuming the writes themselves are free. If you then also assume the worst case--that everything will end up dirty--by throwing in the block size, too, you compute a maximum rate in MB/s. That represents the fastest you can possibly write.
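To make that conversion concrete, here is a minimal sketch of the arithmetic. It is not code from the patch: the function name and its parameters are made up for illustration, and it assumes the writes take no time and that every page touched ends up dirty.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: the highest write rate, in
 * KiB/s, that a given cost limit allows when every page touched is dirtied.
 */
int64_t
max_write_rate_kbps(int cost_limit, int cost_delay_ms,
                    int page_dirty_cost, int block_size)
{
    assert(cost_delay_ms > 0 && page_dirty_cost > 0 && block_size > 0);

    /*
     * Pages written per sleep cycle = cost_limit / page_dirty_cost.
     * Sleep cycles per second       = 1000 / cost_delay_ms.
     * Multiply by the block size and divide by 1024 to get KiB/s.
     */
    return ((int64_t) cost_limit * 1000 * block_size) /
           ((int64_t) page_dirty_cost * cost_delay_ms * 1024);
}
```

With a cost limit of 200, a 20ms delay, a dirty page cost of 20 and 8192-byte blocks, this works out to 4000 KiB/s.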

If you then turn that equation around, making the maximum write rate the input, for any given cost delay and dirty page cost you can solve for the cost limit--the parameter in fictitious units everyone hates. It works like this, with the computation internals logged every time they run for now:


#vacuum_cost_rate_limit = 4000      # maximum write rate in kilobytes/second
LOG:  cost limit=200 based on rate limit=4000 KB/s delay=20 dirty cost=20

That's the same cost limit that was there before, except now it's derived from that maximum write rate figure. vacuum_cost_limit is gone as a GUC, replaced with this new vacuum_cost_rate_limit. Internally, the cost limit hasn't gone anywhere though. All of the entry points into vacuum and autovacuum derive an internal-only VacuumCostLimit as part of any setup or rebalance operation. But there's no change to underlying cost management code; the cost limit is budgeted and accounted for in exactly the same way as it always was.
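A standalone sketch of that derivation follows. It mirrors the formula in the patch's cost_limit() function, but the name and the explicit parameters are hypothetical, and the 64-bit intermediate is my own choice for the sketch so the multiplication cannot overflow at large rate limits. The floor of 1 follows what the comment in the patch describes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical standalone version of the cost limit derivation: take a
 * rate limit in KiB/s and return the cost limit that produces that much
 * write I/O if all pages are dirty.
 */
int
cost_limit_from_rate(int rate_limit_kbps, int cost_delay_ms,
                     int page_dirty_cost, int block_size)
{
    int64_t limit;

    assert(block_size > 0);

    /* The 1000 adjusts for the cost delay being in milliseconds */
    limit = ((int64_t) rate_limit_kbps * 1024 * page_dirty_cost * cost_delay_ms) /
            ((int64_t) 1000 * block_size);

    /* Keep the floor of 1 that the old vacuum_cost_limit GUC enforced */
    if (limit < 1)
        limit = 1;

    return (int) limit;
}
```

Calling it with 4000 KB/s, a 20ms delay, a dirty cost of 20 and 8192-byte blocks returns 200, matching the log line above.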

Why is this set in kilobytes/second rather than using something based on a memory unit? That decision was made after noting these values can also be set in relation options. Making relation options aware of memory unit math seemed ambitious relative to its usefulness, and it's not like KB/s is hard to work with in this context.

OK, I lied; technically this is set in kibibytes per second right now. Ran out of energy before I got to confirming that was consistent with all similar GUC settings, will put on my pedantic hat later to check that.

One nice thing that falls out of this is that the *vacuum_cost_delay settings essentially turn into a boolean. If the delay is 0, cost limits are off; set it to any other value, and the rate you get is driven almost entirely by vacuum_cost_rate_limit (the "almost" is mainly because issues like sleep time accuracy are possible). You can see that at work in these examples:


LOG:  cost limit=200 based on rate limit=4000 KB/s delay=20 dirty cost=20
LOG:  cost limit=100 based on rate limit=4000 KB/s delay=10 dirty cost=20

LOG:  cost limit=200 based on rate limit=4000 KB/s delay=20 dirty cost=20
LOG:  cost limit=100 based on rate limit=2000 KB/s delay=20 dirty cost=20

Halve the delay to 10, and the cost limit drops in half too to keep the same I/O rate. Halve the rate limit instead, and the cost limit halves with it. Most sites will never need to change the delay figure from 20ms; they can just focus on tuning the more human-readable rate limit figure instead. The main reason I thought of to keep the delay around as an integer still is sites trying to minimize power use; they might increase it from the normally used 20ms. I'm not as worried about postgresql.conf settings bloat to support a valid edge use case, so long as most sites find a setting unnecessary to tune. And the autovacuum side of cost delay should fall into that category with this change.

Here's a full autovacuum log example. This shows how close to the KBps rate the server actually got, along with the autovacuum cost balancing working the same old way (this is after running the boring autovac-big.sql test case attached here too):

2012-01-15 02:10:51.905 EST: LOG: cost limit=200 based on rate limit=4000 KB/s delay=20 dirty cost=20
2012-01-15 02:10:51.906 EST: DEBUG: autovac_balance_cost(pid=13054 db=16384, rel=16444, cost_rate_limit=4000, cost_limit=200, cost_limit_base=200, cost_delay=20)

2012-01-15 02:11:05.127 EST: DEBUG: "t": removed 4999999 row versions in 22124 pages
2012-01-15 02:11:05.127 EST: DEBUG: "t": found 4999999 removable, 5000001 nonremovable row versions in 44248 out of 44248 pages
2012-01-15 02:11:05.127 EST: DETAIL: 0 dead row versions cannot be removed yet.

    There were 0 unused item pointers.
    0 pages are entirely empty.
    CPU 0.27s/0.97u sec elapsed 131.73 sec.

2012-01-15 02:11:05.127 EST: LOG: automatic vacuum of table "gsmith.public.t": index scans: 0

    pages: 0 removed, 44248 remain
    tuples: 4999999 removed, 5000001 remain
    buffer usage: 48253 hits, 40296 misses, 43869 dirtied
    avg read rate: 2.390 MiB/s, avg write rate: 2.602 MiB/s
    system usage: CPU 0.27s/0.97u sec elapsed 131.73 sec

I think this new setting suggests the recently added logging is missing a combined I/O figure, something that measures reads + writes over the time period. This is good enough to demonstrate the sort of UI I was aiming for in action though. Administrator says "don't write more than 4 MiB/s", and when autovacuum kicks in it averages 2.4 read + 2.6 write.
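For anyone wanting to check those figures by hand, here is a small sketch of the rate arithmetic. The helper name is hypothetical, and it assumes 8192-byte blocks, with the read rate taken from buffer misses and the write rate from dirtied buffers.

```c
#include <assert.h>

/*
 * Hypothetical helper: average I/O rate in MiB/s for a number of pages
 * moved over an elapsed time in seconds.
 */
double
avg_rate_mib_per_sec(long pages, int block_size, double elapsed_sec)
{
    assert(block_size > 0 && elapsed_sec > 0.0);

    return (double) pages * block_size / (1024.0 * 1024.0) / elapsed_sec;
}
```

Feeding in the numbers from the log above (40296 misses, 43869 dirtied, 131.73 seconds elapsed) gives roughly 2.39 MiB/s read and 2.60 MiB/s write. Passing the sum of both page counts gives the kind of combined figure the logging doesn't report yet, a bit under 5 MiB/s here.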

I see this change as taking something that feels like black arts tuning magic now and turning it into a simple interface that's for the most part intuitive. None of the flexibility is lost here: you can still retune the relative dirty vs. miss vs. hit costs, you have the option of reducing the delay to a small value on a busy server where small sleep values are possible. But you don't have to do any of that just to tweak autovacuum up or down at a gross level; you can just turn the simple "at most this much write I/O" knob instead.


All implementation notes and concerns from here down.

The original cost limit here defaulted to 200 and allowed a range of 1 to 10000. The new default of 4000 shows these values need to be 20X as large. The maximum was adjusted to 200000 KBps. Look at that, the maximum rate you can run cost delay vacuum at is 200MB/s; there's another good example of something that used to be mysterious to compute that is obvious now.

I didn't adjust the lower limit downward, so it's actually possible to set the new code to only operate at 1/200 the minimum speed you could set before. On balance this is surely a reduction in foot gun aiming though, and I could make the minimum 200 to eliminate it. Seems a needless detail to worry about.

This code is new and just complicated enough that there are surely some edge cases I broke here. In particular I haven't put together a good concurrent autovacuum test yet to really prove all the balancing logic still works correctly. Need to test that with a settings change in the middle of a long vacuum too.

There's one serious concern I don't have a quick answer to. What do we do with in-place upgrade of relations that specified a custom vacuum_cost_limit? I can easily chew on getting the right logic to convert those to equivalents in the new setting style, but I am not prepared to go solely on the hook for all in-place upgrade work one might do here. Would this be easiest to handle as one of those dump/restore transformations? My guess is that's more sensible than the alternative of making an on-read converter that only writes in the new format, then worrying about upgrading all old pages before moving forward. While this could be an interesting small test case for that sort of thing, I'd rather not be patient #1 for that part of the long-term in-place upgrade path right now.


--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0cc3296..98358e2 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1165,8 +1165,8 @@ SET ENABLE_SEQSCAN TO OFF;
       commands, the system maintains an
       internal counter that keeps track of the estimated cost of the
       various I/O operations that are performed.  When the accumulated
-      cost reaches a limit (specified by
-      <varname>vacuum_cost_limit</varname>), the process performing
+      cost reaches a limit (bounded by
+      <varname>vacuum_cost_rate_limit</varname>), the process performing
       the operation will sleep for a short period of time, as specified by
       <varname>vacuum_cost_delay</varname>. Then it will reset the
       counter and continue execution.
@@ -1200,7 +1200,7 @@ SET ENABLE_SEQSCAN TO OFF;
        <listitem>
         <para>
          The length of time, in milliseconds, that the process will sleep
-         when the cost limit has been exceeded.
+         when the cost rate limit has been exceeded.
          The default value is zero, which disables the cost-based vacuum
          delay feature.  Positive values enable cost-based vacuuming.
          Note that on many systems, the effective resolution
@@ -1212,9 +1212,20 @@ SET ENABLE_SEQSCAN TO OFF;
 
         <para>
          When using cost-based vacuuming, appropriate values for
-         <varname>vacuum_cost_delay</> are usually quite small, perhaps
-         10 or 20 milliseconds.  Adjusting vacuum's resource consumption
-         is best done by changing the other vacuum cost parameters.
+         <varname>vacuum_cost_delay</> are usually quite small, with
+         20 milliseconds (the default setting for 
+         <varname>autovacuum_vacuum_cost_delay</varname>) being appropriate
+         for most systems.  Adjusting vacuum's resource consumption is best
+         done by changing the other vacuum cost parameters, normally starting
+         with <varname>vacuum_cost_rate_limit</varname>.  Since that rate
+         limit takes into account this delay sleep time, changes to
+         <varname>vacuum_cost_delay</> beyond whether or not it is
+         zero are not expected to usefully change vacuum behavior.
+         The main reason to use a sleep time value higher than 20
+         milliseconds is to lower server power consumption on a lightly
+         loaded system.  The maximum amount of work that can be done will not
+         change by doing that, but lower process wakeup frequency can allow
+         more efficient processor sleeping cycles.
         </para>
        </listitem>
       </varlistentry>
@@ -1264,15 +1275,32 @@ SET ENABLE_SEQSCAN TO OFF;
        </listitem>
       </varlistentry>
 
-      <varlistentry id="guc-vacuum-cost-limit" xreflabel="vacuum_cost_limit">
-       <term><varname>vacuum_cost_limit</varname> (<type>integer</type>)</term>
+      <varlistentry id="guc-vacuum-cost-rate-limit" xreflabel="vacuum_cost_rate_limit">
+       <term><varname>vacuum_cost_rate_limit</varname> (<type>integer</type>)</term>
        <indexterm>
-        <primary><varname>vacuum_cost_limit</> configuration parameter</primary>
+        <primary><varname>vacuum_cost_rate_limit</> configuration parameter</primary>
        </indexterm>
        <listitem>
         <para>
-         The accumulated cost that will cause the vacuuming process to sleep.
-         The default value is 200.
+         The maximum possible speed cost rate limited vacuum can write dirty
+         blocks to disk, in kilobytes per second. Each time vacuum wakes,
+         it uses this rate limit to set an internal cost limit (referred
+         to as internal_cost_limit below), assuming a worst-case scenario
+         where every block it accesses is dirty. The default value is
+         4000 kilobytes per second.
+        </para>
+        <para>
+         The all dirty block cost limit computation takes into account only
+         <varname>vacuum_cost_delay</> and <varname>vacuum_cost_page_dirty</>.
+         How much work a cost-based vacuum could do in other situations
+         depends on the ratio of <varname>vacuum_cost_page_dirty</> to
+         the other page cost parameters. For example, assume the default
+         parameters where <varname>vacuum_cost_page_dirty</> is 20 and
+         <varname>vacuum_cost_page_miss</> is 10. A vacuum that was
+         encountering only page miss reads in that configuration could
+         do twice as many of them per second as the dirty writes, effectively
+         making for a read rate of 8000 kilobytes per second, while still
+         staying under the rate limit.
         </para>
        </listitem>
       </varlistentry>
@@ -1287,7 +1315,7 @@ SET ENABLE_SEQSCAN TO OFF;
        limit.  To avoid uselessly long delays in such cases, the actual
        delay is calculated as <varname>vacuum_cost_delay</varname> *
        <varname>accumulated_balance</varname> /
-       <varname>vacuum_cost_limit</varname> with a maximum of
+       internal_cost_limit with a maximum of
        <varname>vacuum_cost_delay</varname> * 4.
       </para>
      </note>
@@ -4488,17 +4516,17 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
       </listitem>
      </varlistentry>
 
-     <varlistentry id="guc-autovacuum-vacuum-cost-limit" xreflabel="autovacuum_vacuum_cost_limit">
-      <term><varname>autovacuum_vacuum_cost_limit</varname> (<type>integer</type>)</term>
+     <varlistentry id="guc-autovacuum-vacuum-cost-rate-limit" xreflabel="autovacuum_vacuum_cost_rate_limit">
+      <term><varname>autovacuum_vacuum_cost_rate_limit</varname> (<type>integer</type>)</term>
       <indexterm>
-       <primary><varname>autovacuum_vacuum_cost_limit</> configuration parameter</primary>
+       <primary><varname>autovacuum_vacuum_cost_rate_limit</> configuration parameter</primary>
       </indexterm>
       <listitem>
        <para>
-        Specifies the cost limit value that will be used in automatic
-        <command>VACUUM</> operations.  If -1 is specified (which is the
-        default), the regular
-        <xref linkend="guc-vacuum-cost-limit"> value will be used.  Note that
+        Specifies the maximum possible speed that vacuum can write dirty
+         blocks to disk during automatic <command>VACUUM</> operations.
+         If -1 is specified (which is the default), the regular
+        <xref linkend="guc-vacuum-cost-rate-limit"> value will be used.  Note that
         the value is distributed proportionally among the running autovacuum
         workers, if there is more than one, so that the sum of the limits of
         each worker never exceeds the limit on this variable.
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 03cc6c9..debb844 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -671,7 +671,7 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
     to prevent transaction ID wraparound.
     Another two parameters,
     <varname>autovacuum_vacuum_cost_delay</> and
-    <varname>autovacuum_vacuum_cost_limit</>, are used to set
+    <varname>autovacuum_vacuum_cost_rate_limit</>, are used to set
     table-specific values for the cost-based vacuum delay feature
     (see <xref linkend="runtime-config-resource-vacuum-cost">).
     <varname>autovacuum_freeze_min_age</>,
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index f55a001..a32a251 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -958,10 +958,10 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
    </varlistentry>
 
    <varlistentry>
-    <term><literal>autovacuum_vacuum_cost_limit</>, <literal>toast.autovacuum_vacuum_cost_limit</literal> (<type>integer</>)</term>
+    <term><literal>autovacuum_vacuum_cost_rate_limit</>, <literal>toast.autovacuum_vacuum_cost_rate_limit</literal> (<type>integer</>)</term>
     <listitem>
      <para>
-     Custom <xref linkend="guc-autovacuum-vacuum-cost-limit"> parameter.
+     Custom <xref linkend="guc-autovacuum-vacuum-cost-rate-limit"> parameter.
      </para>
     </listitem>
    </varlistentry>
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 09a7b6f..f97e3ae 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -147,8 +147,8 @@ static relopt_int intRelOpts[] =
 	},
 	{
 		{
-			"autovacuum_vacuum_cost_limit",
-			"Vacuum cost amount available before napping, for autovacuum",
+			"autovacuum_vacuum_cost_rate_limit",
+			"Vacuum maximum write rate per second, for autovacuum",
 			RELOPT_KIND_HEAP | RELOPT_KIND_TOAST
 		},
 		-1, 1, 10000
@@ -1137,8 +1137,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 		offsetof(StdRdOptions, autovacuum) +offsetof(AutoVacOpts, analyze_threshold)},
 		{"autovacuum_vacuum_cost_delay", RELOPT_TYPE_INT,
 		offsetof(StdRdOptions, autovacuum) +offsetof(AutoVacOpts, vacuum_cost_delay)},
-		{"autovacuum_vacuum_cost_limit", RELOPT_TYPE_INT,
-		offsetof(StdRdOptions, autovacuum) +offsetof(AutoVacOpts, vacuum_cost_limit)},
+		{"autovacuum_vacuum_cost_rate_limit", RELOPT_TYPE_INT,
+		offsetof(StdRdOptions, autovacuum) +offsetof(AutoVacOpts, vacuum_cost_rate_limit)},
 		{"autovacuum_freeze_min_age", RELOPT_TYPE_INT,
 		offsetof(StdRdOptions, autovacuum) +offsetof(AutoVacOpts, freeze_min_age)},
 		{"autovacuum_freeze_max_age", RELOPT_TYPE_INT,
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 353af50..697710b 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -54,7 +54,6 @@
 int			vacuum_freeze_min_age;
 int			vacuum_freeze_table_age;
 
-
 /* A few variables that don't seem worth passing around as parameters */
 static MemoryContext vac_context = NULL;
 static BufferAccessStrategy vac_strategy;
@@ -211,12 +210,12 @@ vacuum(VacuumStmt *vacstmt, Oid relid, bool do_toast,
 	PG_TRY();
 	{
 		ListCell   *cur;
-
 		VacuumCostActive = (VacuumCostDelay > 0);
 		VacuumCostBalance = 0;
 		VacuumPageHit = 0;
 		VacuumPageMiss = 0;
 		VacuumPageDirty = 0;
+		VacuumCostLimit = cost_limit(VacuumCostRateLimit,VacuumCostDelay);
 
 		/*
 		 * Loop to process each selected relation.
@@ -297,6 +296,7 @@ vacuum(VacuumStmt *vacstmt, Oid relid, bool do_toast,
 	vac_context = NULL;
 }
 
+
 /*
  * Build a list of Oids for each relation to be processed
  *
@@ -1150,6 +1150,54 @@ vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
 }
 
 /*
+ * Compute an internal vacuum cost limit from the passed inputs,
+ * system constants, and GUC values that can only be set there.
+ * 
+ * Some inputs to this computation can be set in up to three ways:
+ * the regular vacuum settings, the autovacuum settings, and as
+ * a relation option. Anything that can have a different value in those
+ * contexts should be passed into here as a parameter. We can't be expected
+ * to figure that out here, and all of the expected callers will have
+ * checked all of that out anyway.
+ */
+int cost_limit(int rate_limit, int cost_delay)
+{
+	int limit;
+	
+	/*
+	 * Take in a rate in kibibytes (multiples of 1024 bytes) per second,
+	 * output a cost limit that will produce that amount of write I/O
+	 * if all pages are dirty.  The 1000 here adjusts for cost
+	 * delay times being in milliseconds.
+	 */	
+	Assert(BLCKSZ > 0);
+	limit = rate_limit * 1024 * VacuumCostPageDirty *
+				cost_delay / (1000 * BLCKSZ);
+
+	/*
+	 * In situations where the cost delay is 0, cost accounting shouldn't
+	 * act on the value returned here.  But in that case, as well as when
+	 * dirty pages are considered 0 cost, this computation will compute a
+	 * limit of 0.  What page cost accounting will do in that dysfunctional
+	 * case isn't well explored, since the original VacuumCostLimit GUC value
+	 * was clamped with a minimum of 1.  Let's not find out if division by
+	 * zero types of badness are possible if we push an expected <=0
+	 * downstream by keeping the original floor here too.
+	 */
+	if (limit < 1)
+		limit = 1;
+
+	/*
+	 * XXX Log obtrusively for patch review.  Similarly chatty
+	 * messages from autovac_balance_cost are at DEBUG2
+	 */	
+	ereport(LOG,
+			(errmsg("cost limit=%d based on rate limit=%d KB/s delay=%d dirty cost=%d",
+					limit, rate_limit, cost_delay, VacuumCostPageDirty)));
+	return limit;
+}
+
+/*
  * vacuum_delay_point --- check for interrupts and cost-based delay.
  *
  * This should be called in each major loop of VACUUM processing,
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f858a6d..bc741a7 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -114,10 +114,13 @@ double		autovacuum_anl_scale;
 int			autovacuum_freeze_max_age;
 
 int			autovacuum_vac_cost_delay;
-int			autovacuum_vac_cost_limit;
+int			autovacuum_vac_cost_rate_limit;
 
 int			Log_autovacuum_min_duration = -1;
 
+/* Derived from GUC parameters */
+int			autovacuum_vac_cost_limit;
+
 /* how long to keep pgstat data in the launcher, in milliseconds */
 #define STATS_READ_DELAY 1000
 
@@ -1727,17 +1730,24 @@ autovac_balance_cost(void)
 	 * that a worker can consume is determined by cost_limit/cost_delay, so we
 	 * try to equalize those ratios rather than the raw limit settings.
 	 *
-	 * note: in cost_limit, zero also means use value from elsewhere, because
+	 * note: in cost_rate_limit, zero also means use value from elsewhere, because
 	 * zero is not a valid value.
 	 */
-	int			vac_cost_limit = (autovacuum_vac_cost_limit > 0 ?
-								autovacuum_vac_cost_limit : VacuumCostLimit);
+	int			vac_cost_rate_limit = (autovacuum_vac_cost_rate_limit > 0 ?
+								autovacuum_vac_cost_rate_limit : VacuumCostRateLimit);
 	int			vac_cost_delay = (autovacuum_vac_cost_delay >= 0 ?
 								autovacuum_vac_cost_delay : VacuumCostDelay);
+	int			vac_cost_limit;
 	double		cost_total;
 	double		cost_avail;
 	WorkerInfo	worker;
 
+	/*
+	 * Starting with the kilobytes/second setting for rate limit,
+	 * recompute the limit used internally here in cost units
+	 */
+	vac_cost_limit = cost_limit(vac_cost_rate_limit, vac_cost_delay);
+
 	/* not set? nothing to do */
 	if (vac_cost_limit <= 0 || vac_cost_delay <= 0)
 		return;
@@ -1763,8 +1773,8 @@ autovac_balance_cost(void)
 		return;
 
 	/*
-	 * Adjust cost limit of each active worker to balance the total of cost
-	 * limit to autovacuum_vacuum_cost_limit.
+	 * Adjust cost limit of each active worker, balance the total to the
+	 * cost unit limit derived from autovacuum_vacuum_cost_rate_limit
 	 */
 	cost_avail = (double) vac_cost_limit / vac_cost_delay;
 	worker = (WorkerInfo) SHMQueueNext(&AutoVacuumShmem->av_runningWorkers,
@@ -1788,9 +1798,9 @@ autovac_balance_cost(void)
 											worker->wi_cost_limit_base),
 										1);
 
-			elog(DEBUG2, "autovac_balance_cost(pid=%u db=%u, rel=%u, cost_limit=%d, cost_limit_base=%d, cost_delay=%d)",
+			elog(DEBUG2, "autovac_balance_cost(pid=%u db=%u, rel=%u, cost_rate_limit=%d, cost_limit=%d, cost_limit_base=%d, cost_delay=%d)",
 				 worker->wi_proc->pid, worker->wi_dboid, worker->wi_tableoid,
-				 worker->wi_cost_limit, worker->wi_cost_limit_base,
+				 vac_cost_rate_limit, worker->wi_cost_limit, worker->wi_cost_limit_base,
 				 worker->wi_cost_delay);
 		}
 
@@ -2251,7 +2261,7 @@ do_autovacuum(void)
 		LWLockRelease(AutovacuumScheduleLock);
 
 		/*
-		 * Remember the prevailing values of the vacuum cost GUCs.	We have to
+		 * Remember the prevailing values of the vacuum cost values. We have to
 		 * restore these at the bottom of the loop, else we'll compute wrong
 		 * values in the next iteration of autovac_balance_cost().
 		 */
@@ -2360,7 +2370,7 @@ deleted:
 		MyWorkerInfo->wi_tableoid = InvalidOid;
 		LWLockRelease(AutovacuumLock);
 
-		/* restore vacuum cost GUCs for the next iteration */
+		/* restore vacuum cost parameters for the next iteration */
 		VacuumCostDelay = stdVacuumCostDelay;
 		VacuumCostLimit = stdVacuumCostLimit;
 	}
@@ -2497,7 +2507,9 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 	{
 		int			freeze_min_age;
 		int			freeze_table_age;
+		int			vac_cost_rate_limit;
 		int			vac_cost_limit;
+
 		int			vac_cost_delay;
 
 		/*
@@ -2514,12 +2526,14 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 			? autovacuum_vac_cost_delay
 			: VacuumCostDelay;
 
-		/* 0 or -1 in autovac setting means use plain vacuum_cost_limit */
-		vac_cost_limit = (avopts && avopts->vacuum_cost_limit > 0)
-			? avopts->vacuum_cost_limit
-			: (autovacuum_vac_cost_limit > 0)
-			? autovacuum_vac_cost_limit
-			: VacuumCostLimit;
+		/* 0 or -1 in autovac setting means compute from vacuum_cost_rate_limit */
+		vac_cost_rate_limit = (avopts && avopts->vacuum_cost_rate_limit > 0)
+			? avopts->vacuum_cost_rate_limit
+			: (autovacuum_vac_cost_rate_limit > 0)
+			? autovacuum_vac_cost_rate_limit
+			: VacuumCostRateLimit;
+
+		vac_cost_limit = cost_limit(vac_cost_rate_limit, vac_cost_delay);
 
 		/* these do not have autovacuum-specific settings */
 		freeze_min_age = (avopts && avopts->freeze_min_age >= 0)
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 4b66bd3..d93f4bf 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -113,7 +113,7 @@ int			MaxConnections = 90;
 int			VacuumCostPageHit = 1;		/* GUC parameters for vacuum */
 int			VacuumCostPageMiss = 10;
 int			VacuumCostPageDirty = 20;
-int			VacuumCostLimit = 200;
+int			VacuumCostRateLimit = 4000;
 int			VacuumCostDelay = 0;
 
 int			VacuumPageHit = 0;
@@ -122,6 +122,7 @@ int			VacuumPageDirty = 0;
 
 int			VacuumCostBalance = 0;		/* working state for vacuum */
 bool		VacuumCostActive = false;
+int			VacuumCostLimit = -1;	/* XXX Invalid default, make sure it was updated */
 
 int			GinFuzzySearchLimit = 0;
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 5c910dd..ce33f1e 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1751,12 +1751,12 @@ static struct config_int ConfigureNamesInt[] =
 	},
 
 	{
-		{"vacuum_cost_limit", PGC_USERSET, RESOURCES_VACUUM_DELAY,
-			gettext_noop("Vacuum cost amount available before napping."),
+		{"vacuum_cost_rate_limit", PGC_USERSET, RESOURCES_VACUUM_DELAY,
+			gettext_noop("Vacuum write rate limit to schedule napping frequency."),
 			NULL
 		},
-		&VacuumCostLimit,
-		200, 1, 10000,
+		&VacuumCostRateLimit,
+		4000, 1, 200000,
 		NULL, NULL, NULL
 	},
 
@@ -1783,12 +1783,12 @@ static struct config_int ConfigureNamesInt[] =
 	},
 
 	{
-		{"autovacuum_vacuum_cost_limit", PGC_SIGHUP, AUTOVACUUM,
-			gettext_noop("Vacuum cost amount available before napping, for autovacuum."),
+		{"autovacuum_vacuum_cost_rate_limit", PGC_SIGHUP, AUTOVACUUM,
+			gettext_noop("Vacuum write rate limit to schedule napping frequency, for autovacuum."),
 			NULL
 		},
-		&autovacuum_vac_cost_limit,
-		-1, -1, 10000,
+		&autovacuum_vac_cost_rate_limit,
+		-1, -1, 200000,
 		NULL, NULL, NULL
 	},
 
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 315db46..9ced7ee 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -136,7 +136,7 @@
 #vacuum_cost_page_hit = 1		# 0-10000 credits
 #vacuum_cost_page_miss = 10		# 0-10000 credits
 #vacuum_cost_page_dirty = 20		# 0-10000 credits
-#vacuum_cost_limit = 200		# 1-10000 credits
+#vacuum_cost_rate_limit = 4000		# maximum write rate in kilobytes/second
 
 # - Background Writer -
 
@@ -458,9 +458,9 @@
 #autovacuum_vacuum_cost_delay = 20ms	# default vacuum cost delay for
 					# autovacuum, in milliseconds;
 					# -1 means use vacuum_cost_delay
-#autovacuum_vacuum_cost_limit = -1	# default vacuum cost limit for
+#autovacuum_vacuum_cost_rate_limit = -1	# default vacuum cost rate limit for
 					# autovacuum, -1 means use
-					# vacuum_cost_limit
+					# vacuum_cost_rate_limit
 
 
 #------------------------------------------------------------------------------
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index a27ef69..b71cd58 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1331,7 +1331,7 @@ psql_completion(char *text, int start, int end)
 			"autovacuum_freeze_min_age",
 			"autovacuum_freeze_table_age",
 			"autovacuum_vacuum_cost_delay",
-			"autovacuum_vacuum_cost_limit",
+			"autovacuum_vacuum_cost_rate_limit",
 			"autovacuum_vacuum_scale_factor",
 			"autovacuum_vacuum_threshold",
 			"fillfactor",
@@ -1340,7 +1340,7 @@ psql_completion(char *text, int start, int end)
 			"toast.autovacuum_freeze_min_age",
 			"toast.autovacuum_freeze_table_age",
 			"toast.autovacuum_vacuum_cost_delay",
-			"toast.autovacuum_vacuum_cost_limit",
+			"toast.autovacuum_vacuum_cost_rate_limit",
 			"toast.autovacuum_vacuum_scale_factor",
 			"toast.autovacuum_vacuum_threshold",
 			NULL
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 4526648..472d1dc 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -158,8 +158,10 @@ extern void vacuum_set_xid_limits(int freeze_min_age, int freeze_table_age,
 					  TransactionId *freezeLimit,
 					  TransactionId *freezeTableLimit);
 extern void vac_update_datfrozenxid(void);
+extern int cost_limit(int rate_limit, int cost_delay);
 extern void vacuum_delay_point(void);
 
+
 /* in commands/vacuumlazy.c */
 extern void lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
 				BufferAccessStrategy bstrategy);
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 610cb59..732a320 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -234,7 +234,7 @@ extern PGDLLIMPORT int maintenance_work_mem;
 extern int	VacuumCostPageHit;
 extern int	VacuumCostPageMiss;
 extern int	VacuumCostPageDirty;
-extern int	VacuumCostLimit;
+extern int	VacuumCostRateLimit;
 extern int	VacuumCostDelay;
 
 extern int	VacuumPageHit;
@@ -243,6 +243,7 @@ extern int	VacuumPageDirty;
 
 extern int	VacuumCostBalance;
 extern bool VacuumCostActive;
+extern int	VacuumCostLimit;
 
 
 /* in tcop/postgres.c */
diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h
index 8009fde..c827a2b 100644
--- a/src/include/postmaster/autovacuum.h
+++ b/src/include/postmaster/autovacuum.h
@@ -25,7 +25,7 @@ extern int	autovacuum_anl_thresh;
 extern double autovacuum_anl_scale;
 extern int	autovacuum_freeze_max_age;
 extern int	autovacuum_vac_cost_delay;
-extern int	autovacuum_vac_cost_limit;
+extern int	autovacuum_vac_cost_rate_limit;
 
 /* autovacuum launcher PID, only valid when worker is shutting down */
 extern int	AutovacuumLauncherPid;
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index d404c2a..f283116 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -183,7 +183,7 @@ typedef struct AutoVacOpts
 	int			vacuum_threshold;
 	int			analyze_threshold;
 	int			vacuum_cost_delay;
-	int			vacuum_cost_limit;
+	int			vacuum_cost_rate_limit;
 	int			freeze_min_age;
 	int			freeze_max_age;
 	int			freeze_table_age;

drop table t;
create table t (k serial,v integer);
insert into t(v) (select generate_series(1,10000000));
delete from t where k<5000000;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
