Attached patch adds a new function to the pageinspect extension for measuring total free space, in either tables or indexes. It returns the free space as a percentage, so higher numbers mean more bloat. After trying a couple of ways to quantify it, I've found this particular measure correlates well with the nastiest bloat issues I've ran into in production recently. For example, an index that had swelled to over 5X its original size due to autovacuum issues registered at 0.86 on this scale. I could easily see people putting an alert at something like 0.40 and picking candidates to reindex based on it triggering. That would be about a million times smarter than how I've been muddling through this class of problems so far.

Code by Jaime Casanova, based on a prototype by me. Thanks to attendees and sponsors of the PgWest conference for helping to fund some deeper exploration of this idea.

Here's a test case showing it in action:

create extension pageinspect;
create table t (k serial,v integer);
insert into t(v) (select generate_series(1,100000));
create index t_idx on t(k);
delete from t where k<50000;
vacuum t;

gsmith=# select relation_free_space('t');
 relation_free_space
---------------------
            0.445466

gsmith=# select relation_free_space('t_idx');
 relation_free_space
---------------------
            0.550946

Some open questions in my mind:

-Given this is doing a full table scan, should it hook into a ring buffer to keep from trashing the buffer cache? Or might it loop over the relation in a different way all together? I was thinking about eyeing the FSM instead at one point, didn't explore that yet. There's certainly a few ways to approach this, we just aimed at the easiest way to get a working starter implementation, and associated results to compare others against.

-Should there be a non-superuser version of this? We'd certainly need to get a less cache demolishing version before that would seem wise.

-There were related things in the pageinspect module, but a case could be made for this being a core function instead. It's a bit more likely to be used in production than the rest of that extension.

-What if anything related to TOAST should this handle?

We're also planning to do a sampling version of this, using the same approach ANALYZE does. Grab a number of blocks, extrapolate from there. It shouldn't take many samples before the accuracy is better than how people are estimated this now. That work is just waiting on some better thinking about how to handle the full relation version first.

And, yes, the explanation in the docs and code should be clear that it's returning a percentage, which I just realized when writing this. At least I remembered to document something; still ahead of the average new patch...

--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us

diff --git a/contrib/pageinspect/btreefuncs.c b/contrib/pageinspect/btreefuncs.c
index dbb2158..aac9148 100644
*** a/contrib/pageinspect/btreefuncs.c
--- b/contrib/pageinspect/btreefuncs.c
***************
*** 34,39 ****
--- 34,40 ----
  #include "utils/builtins.h"
  #include "utils/rel.h"
  
+ #include "btreefuncs.h"
  
  extern Datum bt_metap(PG_FUNCTION_ARGS);
  extern Datum bt_page_items(PG_FUNCTION_ARGS);
*************** GetBTPageStatistics(BlockNumber blkno, B
*** 155,160 ****
--- 156,202 ----
  		stat->avg_item_size = 0;
  }
  
+ /*------------------------------------------------
+  * GetBTRelationFreeSpace
+  *
+  * Get the free space for a btree index.
+  * This is a helper function for relation_free_space()
+  *------------------------------------------------
+  */
+ float4
+ GetBTRelationFreeSpace(Relation rel)
+ {
+ 	BTPageStat stat;
+ 
+ 	Buffer		buffer;
+ 	BlockNumber blkno;
+ 	BlockNumber totalBlcksInRelation = RelationGetNumberOfBlocks(rel);
+ 	Size 		free_space = 0;
+ 	double		free_percent = 0;
+ 	
+ 	/* Skip page 0 because it is a metapage */
+ 	for (blkno = 1; blkno < totalBlcksInRelation; blkno++)
+ 	{
+ 		buffer = ReadBuffer(rel, blkno);
+ 		/* 
+ 		 * get the statistics of the indexes and use that info
+ 		 * to determine free space on the page
+ 		 */
+ 		GetBTPageStatistics(blkno, buffer, &stat);
+ 		if (stat.type == 'd')
+ 			free_space += stat.page_size;
+ 		else
+ 			free_space += stat.free_size;		
+ 
+ 		ReleaseBuffer(buffer);
+ 	}
+ 
+ 	if (totalBlcksInRelation > 1)
+ 		free_percent = (free_space * 1.0) / ((totalBlcksInRelation - 1) * BLCKSZ);
+ 	return free_percent;
+ }
+ 
+ 
  /* -----------------------------------------------
   * bt_page()
   *
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index fa50655..c1d72ba 100644
*** a/contrib/pageinspect/heapfuncs.c
--- b/contrib/pageinspect/heapfuncs.c
***************
*** 28,33 ****
--- 28,36 ----
  #include "funcapi.h"
  #include "utils/builtins.h"
  #include "miscadmin.h"
+ #include "storage/bufmgr.h"
+ 
+ #include "heapfuncs.h"
  
  Datum		heap_page_items(PG_FUNCTION_ARGS);
  
*************** bits_to_text(bits8 *bits, int len)
*** 55,60 ****
--- 58,94 ----
  }
  
  
+ /*
+  * GetHeapRelationFreeSpace()
+  *
+  * Get the free space for a heap relation.
+  * This is a helper function for relation_free_space()
+  */
+ float4
+ GetHeapRelationFreeSpace(Relation rel)
+ {
+ 	Buffer      buffer;
+ 	Page		page;
+ 	BlockNumber blkno;
+ 	BlockNumber totalBlcksInRelation = RelationGetNumberOfBlocks(rel);
+ 	Size        free_space = 0;
+ 	double		free_percent = 0;
+ 
+ 	for (blkno = 0; blkno < totalBlcksInRelation; blkno++)
+ 	{
+ 		buffer = ReadBuffer(rel, blkno);
+ 		page   = BufferGetPage(buffer);
+ 		free_space += PageGetHeapFreeSpace(page);
+ 
+ 		ReleaseBuffer(buffer);
+ 	}
+ 
+ 	if (totalBlcksInRelation > 0)
+ 		free_percent = (free_space * 1.0) / (totalBlcksInRelation * BLCKSZ);
+ 	return free_percent;
+ }
+ 
+ 
  /*
   * heap_page_items
   *
diff --git a/contrib/pageinspect/pageinspect--1.0.sql b/contrib/pageinspect/pageinspect--1.0.sql
index 5613956..4502a13 100644
*** a/contrib/pageinspect/pageinspect--1.0.sql
--- b/contrib/pageinspect/pageinspect--1.0.sql
*************** CREATE FUNCTION fsm_page_contents(IN pag
*** 105,107 ****
--- 105,115 ----
  RETURNS text
  AS 'MODULE_PATHNAME', 'fsm_page_contents'
  LANGUAGE C STRICT;
+ 
+ --
+ -- relation_free_space()
+ --
+ CREATE FUNCTION relation_free_space(IN relname text)
+ RETURNS real
+ AS 'MODULE_PATHNAME', 'relation_free_space'
+ LANGUAGE C STRICT;
diff --git a/contrib/pageinspect/pageinspect--unpackaged--1.0.sql b/contrib/pageinspect/pageinspect--unpackaged--1.0.sql
index 13e2167..ccfe6e4 100644
*** a/contrib/pageinspect/pageinspect--unpackaged--1.0.sql
--- b/contrib/pageinspect/pageinspect--unpackaged--1.0.sql
*************** ALTER EXTENSION pageinspect ADD function
*** 29,31 ****
--- 29,32 ----
  ALTER EXTENSION pageinspect ADD function bt_page_stats(text,integer);
  ALTER EXTENSION pageinspect ADD function bt_page_items(text,integer);
  ALTER EXTENSION pageinspect ADD function fsm_page_contents(bytea);
+ ALTER EXTENSION pageinspect ADD function relation_free_space(integer);
diff --git a/contrib/pageinspect/rawpage.c b/contrib/pageinspect/rawpage.c
index 362ad84..988d1e3 100644
*** a/contrib/pageinspect/rawpage.c
--- b/contrib/pageinspect/rawpage.c
***************
*** 23,33 ****
--- 23,37 ----
  #include "utils/builtins.h"
  #include "utils/rel.h"
  
+ #include "btreefuncs.h"
+ #include "heapfuncs.h"
+ 
  PG_MODULE_MAGIC;
  
  Datum		get_raw_page(PG_FUNCTION_ARGS);
  Datum		get_raw_page_fork(PG_FUNCTION_ARGS);
  Datum		page_header(PG_FUNCTION_ARGS);
+ Datum		relation_free_space(PG_FUNCTION_ARGS);
  
  static bytea *get_raw_page_internal(text *relname, ForkNumber forknum,
  					  BlockNumber blkno);
*************** page_header(PG_FUNCTION_ARGS)
*** 227,229 ****
--- 231,285 ----
  
  	PG_RETURN_DATUM(result);
  }
+ 
+ 
+ /*
+  * relation_free_space
+  *
+  * Returns the percentage of free space for a given relation.
+  * Supported relation types are tables and indexes.  TOAST is not
+  * considered/handled.
+  */
+ PG_FUNCTION_INFO_V1(relation_free_space);
+ 
+ Datum
+ relation_free_space(PG_FUNCTION_ARGS)
+ {
+ 	text       *relname = PG_GETARG_TEXT_P(0);
+ 	RangeVar	*relrv;
+ 	Relation	rel;
+ 	float4 		free_space = 0;
+ 
+ 	relrv = makeRangeVarFromNameList(textToQualifiedNameList(relname));
+ 	rel = relation_openrv(relrv, AccessShareLock);
+ 
+ 	/* Only allow tables or indexes */
+ 	if (rel->rd_rel->relkind != RELKIND_INDEX && rel->rd_rel->relkind != RELKIND_RELATION)
+ 		ereport(ERROR,
+ 			(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ 				errmsg("cannot get free space from object \"%s\"",
+ 					RelationGetRelationName(rel))));
+ 
+ 	/*
+ 	 * Reject attempts to read non-local temporary relations.  We would be
+ 	 * likely to get wrong data, since we have no visibility into the owning
+ 	 * session's local buffers.
+ 	 */
+ 	if (RELATION_IS_OTHER_TEMP(rel))
+ 		ereport(ERROR,
+ 			(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ 				errmsg("cannot access temporary tables of other sessions")));
+ 
+ 	switch (rel->rd_rel->relkind)
+ 	{
+ 		case RELKIND_RELATION: 
+ 			free_space = GetHeapRelationFreeSpace(rel);
+ 			break;
+ 		case RELKIND_INDEX: 
+ 			free_space = GetBTRelationFreeSpace(rel);
+ 			break;
+ 	}
+ 	relation_close(rel, AccessShareLock);
+ 
+ 	PG_RETURN_FLOAT4(free_space);
+ }
diff --git a/doc/src/sgml/pageinspect.sgml b/doc/src/sgml/pageinspect.sgml
index acbb05b..49708c3 100644
*** a/doc/src/sgml/pageinspect.sgml
--- b/doc/src/sgml/pageinspect.sgml
*************** test=# SELECT * FROM bt_page_items('pg_c
*** 196,201 ****
--- 196,216 ----
       </para>
      </listitem>
     </varlistentry>
+ 
+    <varlistentry>
+     <term>
+      <function>relation_free_space(relname text) returns float</function>
+     </term>
+ 
+     <listitem>
+      <para>
+       <function>relation_free_space</function> returns the total amount of free
+       free space in a table or index.  The entire relation is scanned in order
+       to compute this amount, which may generate a large amount of disk reads.
+       Information stored in <acronym>TOAST</> tables is not included.
+      </para>
+     </listitem>
+    </varlistentry>
    </variablelist>
   </sect2>
  
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to