[EMAIL PROTECTED] ("Matthew T. O'Connor") writes:
> OK, well as we wait on the fix for the stats system, let me submit my
> patch for pg_autovacuum.  This patch assumes that the stats system will
> be fixed so that all inserts, updates and deletes performed on shared
> tables reguardless of what database those commands were executed from,
> will show up in the stats shown in each database.

I had to make a further change to this to take quotes off the 'last
ANALYZE' in order for it to not overquote the relation name, so
there's a _little_ work left to get it to play well.

I have deployed it onto several boxes that should be doing some
vacuuming over the weekend, and it is now certainly hitting pg_
tables.

I would like to present a CVS-oriented patch; unfortunately, I had to
change the indentation patterns when editing some of it :-(.  The
following _may_ be good; not sure...

? pg_autovacuum
Index: README.pg_autovacuum
===================================================================
RCS file: /projects/cvsroot/pgsql-server/contrib/pg_autovacuum/README.pg_autovacuum,v
retrieving revision 1.2
diff -c -b -r1.2 README.pg_autovacuum
*** README.pg_autovacuum        12 Jun 2003 01:36:44 -0000      1.2
--- README.pg_autovacuum        12 Sep 2003 22:45:37 -0000
***************
*** 2,67 ****
  --------------------
  
  pg_autovacuum is a libpq client program that monitors all the
! databases associated with a postgresql server.  It uses the stats
  collector to monitor insert, update and delete activity.  
  
! When a table exceeds its insert or delete threshold (more detail
! on thresholds below) then that table will be  vacuumed or analyzed.  
! 
! This allows postgresql to keep the fsm and table statistics up to
! date, and eliminates the need to schedule periodic vacuums.
  
  The primary benefit of pg_autovacuum is that the FSM and table
! statistic information are updated as needed.  When a table is actively
! changing, pg_autovacuum will perform the necessary vacuums and
! analyzes, whereas if a table remains static, no cycles will be wasted
! performing unnecessary vacuums/analyzes.
  
  A secondary benefit of pg_autovacuum is that it ensures that a
! database wide vacuum is performed prior to xid wraparound.  This is an
  important, if rare, problem, as failing to do so can result in major
! data loss.
! 
  
  KNOWN ISSUES:
  -------------
- pg_autovacuum has been tested under Redhat Linux (by me) and Solaris (by
- Christopher B. Browne) and all known bugs have been resolved.  Please report
- any problems to the hackers list.
- 
- pg_autovacuum does not get started automatically by either the postmaster or
- by pg_ctl.  Along the sames lines, when the postmaster exits no one tells
- pg_autovacuum.  The result is that at the start of the next loop,
- pg_autovacuum fails to connect to the server and exits.  Any time  it fails
- to connect pg_autovacuum exits.
- 
- pg_autovacuum requires that the stats system be enabled and reporting row
- level stats.  The overhead of the stats system has been shown to be
- significant under certain workloads.  For instance a tight loop of queries
- performing "select 1" was nearly 30% slower with stats enabled.  However,
- in practice with more realistic workloads, the stats system overhead is
- usually nominal.
  
  
  INSTALL:
  --------
  
! As of postgresql v7.4 pg_autovacuum is included in the main source tree
! under contrib.  Therefore you just make && make install (similar to most other
! contrib modules) and it will be installed for you.
! 
! If you are using an earlier version of postgresql just uncompress the tar.gz
! into the contrib directory and modify the contrib/Makefile to include the 
pg_autovacuum
! directory.  pg_autovacuum will then be made as part of the standard
! postgresql install.
  
! make sure that the folowing are set in postgresql.conf
  
    stats_start_collector = true
    stats_row_level = true
  
! start up the postmaster, then execute the pg_autovacuum executable.
  
  
  Command line arguments:
  -----------------------
--- 2,98 ----
  --------------------
  
  pg_autovacuum is a libpq client program that monitors all the
! databases associated with a PostgreSQL server.  It uses the statistics
  collector to monitor insert, update and delete activity.
  
! When a table exceeds a insert or delete threshold (for more detail on
! thresholds, see "Vacuum and Analyze" below) then that table will be
! vacuumed and/or analyzed.
! 
! This allows PostgreSQL to keep the FSM (Free Space Map) and table
! statistics up to date, and eliminates the need to schedule periodic
! vacuums.
  
  The primary benefit of pg_autovacuum is that the FSM and table
! statistic information are updated more nearly as frequently as needed.
! When a table is actively changing, pg_autovacuum will perform the
! VACUUMs and ANALYZEs that such a table needs, whereas if a table
! remains static, no cycles will be wasted performing this
! unnecessarily.
  
  A secondary benefit of pg_autovacuum is that it ensures that a
! database wide vacuum is performed prior to XID wraparound.  This is an
  important, if rare, problem, as failing to do so can result in major
! data loss.  (See the section in the _Administrator's Guide_ entitled
! "Preventing transaction ID wraparound failures" for more details.)
  
  KNOWN ISSUES:
  -------------
  
+ pg_autovacuum has been tested under Redhat Linux (by me) and Debian
+ GNU/Linux and Solaris (by Christopher B. Browne) and all known bugs
+ have been resolved.  Please report any problems to the hackers list.
+ 
+ pg_autovacuum requires that the statistics system be enabled and
+ reporting row level stats.  The overhead of the stats system has been
+ shown to be significant costly under certain workloads.  For instance,
+ a tight loop of queries performing "select 1" was found to run nearly
+ 30% slower when stats were enabled.  However, in practice, with more
+ realistic workloads, the stats system overhead is usually nominal.
+ 
+ pg_autovacuum does not get started automatically by either the
+ postmaster or by pg_ctl.  Similarly, when the postmaster exits, no one
+ tells pg_autovacuum.  The result of that is that at the start of the
+ next loop, pg_autovacuum will fail to connect to the server and
+ exit().  Any time it fails to connect pg_autovacuum exit()s.
+ 
+ While pg_autovacuum can manage vacuums for as many databases as you
+ may have tied to a particular PostgreSQL postmaster, it can only
+ connect to a single PostgreSQL postmaster.  Thus, if you have multiple
+ postmasters on a particular host, you will need multiple pg_autovacuum
+ instances, and they have no way, at present, to coordinate between one
+ another to ensure that they do not concurrently vacuum big tables.
+ 
+ TODO:
+ -----
+ 
+ At present, there are no sample scripts to automatically start up
+ pg_autovacuum along with the database.  It would be desirable to have
+ a SysV script to start up pg_autovacuum after PostgreSQL has been
+ started.
+ 
+ Some users have expressed interest in making pg_autovacuum more
+ configurable so that certain tables known to be inactive could be
+ excluded from being vacuumed.  It would probably make sense to
+ introduce this sort of functionality by providing arguments to specify
+ the database and schema in which to find a configuration table.
  
  INSTALL:
  --------
  
! As of postgresql v7.4 pg_autovacuum is included in the main source
! tree under contrib.  Therefore you merely need to "make && make
! install" (similar to most other contrib modules) and it will be
! installed for you.
! 
! If you are using an earlier version of PostgreSQL, uncompress the
! tar.gz file into the contrib directory and modify the contrib/Makefile
! to include the pg_autovacuum directory.  pg_autovacuum will then be
! built as part of the standard postgresql install.
  
! make sure that the following are set in postgresql.conf:
  
    stats_start_collector = true
    stats_row_level = true
  
! Start up the postmaster, then execute the pg_autovacuum executable.
! 
! If you have a script that automatically starts up the PostgreSQL
! instance, you might add in, after that, something similar to the
! following:
  
+   sleep 10    # To give the database some time to start up
+   $PGBINS/pg_autovacuum -D -s $SBASE -S $SSCALE ... [other arguments]
  
  Command line arguments:
  -----------------------
***************
*** 69,75 ****
  pg_autovacuum has the following optional arguments:
  
  -d debug: 0 silent, 1 basic info, 2 more debug info,  etc...
! -D dameonize: Detach from tty and run in background.
  -s sleep base value: see "Sleeping" below.
  -S sleep scaling factor: see "Sleeping" below.
  -v vacuum base threshold: see Vacuum and Analyze.
--- 100,106 ----
  pg_autovacuum has the following optional arguments:
  
  -d debug: 0 silent, 1 basic info, 2 more debug info,  etc...
! -D daemonize: Detach from tty and run in background.
  -s sleep base value: see "Sleeping" below.
  -S sleep scaling factor: see "Sleeping" below.
  -v vacuum base threshold: see Vacuum and Analyze.
***************
*** 80,97 ****
  -U username: Username pg_autovacuum will use to connect with, if not
     specified the current username is used.
  -P password: Password pg_autovacuum will use to connect with.
! -H host: host name or IP to connect too.
  -p port: port used for connection.
  -h help: list of command line options.
  
! All arguments have default values defined in pg_autovacuum.h.  At the
! time of writing they are:
  
  -d 1
  -v 1000
  -V 2   
! -a 500 (half of -v is not specified)
! -A 1   (half of -v is not specified)
  -s 300 (5 minutes)
  -S 2
  
--- 111,128 ----
  -U username: Username pg_autovacuum will use to connect with, if not
     specified the current username is used.
  -P password: Password pg_autovacuum will use to connect with.
! -H host: host name or IP to connect to.
  -p port: port used for connection.
  -h help: list of command line options.
  
! Numerous arguments have default values defined in pg_autovacuum.h.  At
! the time of writing they are:
  
  -d 1
  -v 1000
  -V 2   
! -a 500 (half of -v if not specified)
! -A 1   (half of -v if not specified)
  -s 300 (5 minutes)
  -S 2
  
***************
*** 99,111 ****
  Vacuum and Analyze:
  -------------------
  
! pg_autovacuum performs either a vacuum analyze or just analyze depending
! on the quantity and type of table activity (insert, update, or delete):
  
  - If the number of (inserts + updates + deletes) > AnalyzeThreshold, then
    only an analyze is performed.
  
! - If the number of (deletes + updates ) > VacuumThreshold, then a
    vacuum analyze is performed.
  
  deleteThreshold is equal to: 
--- 130,143 ----
  Vacuum and Analyze:
  -------------------
  
! pg_autovacuum performs either a VACUUM ANALYZE or just ANALYZE
! depending on the mixture of table activity (insert, update, or
! delete):
  
  - If the number of (inserts + updates + deletes) > AnalyzeThreshold, then
    only an analyze is performed.
  
! - If the number of (deletes + updates) > VacuumThreshold, then a
    vacuum analyze is performed.
  
  deleteThreshold is equal to: 
***************
*** 115,140 ****
      analyze_base_value + (analyze_scaling_factor * "number of tuples in the table")
  
  The AnalyzeThreshold defaults to half of the VacuumThreshold since it
! represents a much less expensive operation (approx 5%-10% of vacuum), and
! running it more often should not substantially degrade system performance.
  
  Sleeping:
  ---------
  
  pg_autovacuum sleeps for a while after it is done checking all the
  databases.  It does this in order to limit the amount of system
! resources it consumes.  This also allows the system administrator to
  configure pg_autovacuum to be more or less aggressive.
  
  Reducing the sleep time will cause pg_autovacuum to respond more
  quickly to changes, whether they be database addition/removal, table
  addition/removal, or just normal table activity.
  
! On the other hand, setting pg_autovaccum to sleep values to agressivly
! (for too short a period of time) can have a negative effect on server
! performance.  If a table gets vacuumed 5 times during the course of a
! large update, this is likely to take much longer than if the table was
! vacuumed only once, at the end.
  
  The total time it sleeps is equal to:
  
--- 147,173 ----
      analyze_base_value + (analyze_scaling_factor * "number of tuples in the table")
  
  The AnalyzeThreshold defaults to half of the VacuumThreshold since it
! represents a much less expensive operation (approx 5%-10% of vacuum),
! and running ANALYZE more often should not substantially degrade system
! performance.
  
  Sleeping:
  ---------
  
  pg_autovacuum sleeps for a while after it is done checking all the
  databases.  It does this in order to limit the amount of system
! resources it consumes.  This allows the system administrator to
  configure pg_autovacuum to be more or less aggressive.
  
  Reducing the sleep time will cause pg_autovacuum to respond more
  quickly to changes, whether they be database addition/removal, table
  addition/removal, or just normal table activity.
  
! On the other hand, setting pg_autovacuum to sleep values too
! aggressively (to too short periods of time) can have a negative effect
! on server performance.  For instance, if a table gets vacuumed 5 times
! during the course of a large set of updates, this is likely to take a
! lot more work than if the table was vacuumed just once, at the end.
  
  The total time it sleeps is equal to:
  
***************
*** 142,156 ****
    loop"
  
  Note that timing measurements are made in seconds; specifying
! "pg_vacuum -s 1" means pg_autovacuum could poll the database upto 60 times
! minute.  In a system with large tables where vacuums may run for several
! minutes, longer times between vacuums are likely to be appropriate.
  
  What pg_autovacuum monitors:
  ----------------------------
  
! pg_autovacuum dynamically generates a list of all databases and tables that
! exist on the server.  It will dynamically add and remove databases and
! tables that are removed from the database server while pg_autovacuum is
! running.  Overhead is fairly small per object.  For example: 10 databases
! with 10 tables each appears to less than 10k of memory on my Linux box.
--- 175,191 ----
    loop"
  
  Note that timing measurements are made in seconds; specifying
! "pg_vacuum -s 1" means pg_autovacuum could poll the database up to 60
! times minute.  In a system with large tables where vacuums may run for
! several minutes, rather longer times between vacuums are likely to be
! appropriate.
  
  What pg_autovacuum monitors:
  ----------------------------
  
! pg_autovacuum dynamically generates a list of all databases and tables
! that exist on the server.  It will dynamically add and remove
! databases and tables that are removed from the database server while
! pg_autovacuum is running.  Overhead is fairly small per object.  For
! example: 10 databases with 10 tables each appears to less than 10k of
! memory on my Linux box.
Index: pg_autovacuum.c
===================================================================
RCS file: /projects/cvsroot/pgsql-server/contrib/pg_autovacuum/pg_autovacuum.c,v
retrieving revision 1.5
diff -c -b -r1.5 pg_autovacuum.c
*** pg_autovacuum.c     11 Sep 2003 19:01:18 -0000      1.5
--- pg_autovacuum.c     12 Sep 2003 22:45:37 -0000
***************
*** 118,123 ****
--- 118,129 ----
        new_tbl->reltuples = atoi(PQgetvalue(res, row, PQfnumber(res, "reltuples")));
        new_tbl->relpages = atoi(PQgetvalue(res, row, PQfnumber(res, "relpages")));
  
+       log_entry(PQgetvalue(res, row, PQfnumber(res, "relisshared")));
+       if (strcmp("t", PQgetvalue(res, row, PQfnumber(res, "relisshared"))))
+         new_tbl->relisshared = 0;
+       else
+         new_tbl->relisshared = 1;
+ 
        new_tbl->analyze_threshold =
                args->analyze_base_threshold + args->analyze_scaling_factor * 
new_tbl->reltuples;
        new_tbl->vacuum_threshold =
***************
*** 213,219 ****
                 * both remove tables from the list that no longer exist and add
                 * tables to the list that are new
                 */
!               res = send_query(query_table_stats(dbi), dbi);
                t = PQntuples(res);
  
                /*
--- 219,225 ----
                 * both remove tables from the list that no longer exist and add
                 * tables to the list that are new
                 */
!               res = send_query((char *) TABLE_STATS_QUERY, dbi);
                t = PQntuples(res);
  
                /*
***************
*** 353,359 ****
  {
        sprintf(logbuffer, "  table name:     %s.%s", tbl->dbi->dbname, 
tbl->table_name);
        log_entry(logbuffer);
!       sprintf(logbuffer, "     relfilenode: %i", tbl->relfilenode);
        log_entry(logbuffer);
        sprintf(logbuffer, "     reltuples: %i;  relpages: %i", tbl->reltuples, 
tbl->relpages);
        log_entry(logbuffer);
--- 359,365 ----
  {
        sprintf(logbuffer, "  table name:     %s.%s", tbl->dbi->dbname, 
tbl->table_name);
        log_entry(logbuffer);
!       sprintf(logbuffer, "     relfilenode: %i;   relisshared: %i", 
tbl->relfilenode, tbl->relisshared);
        log_entry(logbuffer);
        sprintf(logbuffer, "     reltuples: %i;  relpages: %i", tbl->reltuples, 
tbl->relpages);
        log_entry(logbuffer);
***************
*** 688,706 ****
  
  /* End of DB List Management Function */
  
! /* Begninning of misc Functions */
! 
! 
! char *
! query_table_stats(db_info * dbi)
! {
!       if (!strcmp(dbi->dbname, "template1"))          /* Use template1 to
!                                                                                      
          * monitor the system
!                                                                                      
          * tables */
!               return (char *) TABLE_STATS_ALL;
!       else
!               return (char *) TABLE_STATS_USER;
! }
  
  /* Perhaps add some test to this function to make sure that the stats we need are 
available */
  PGconn *
--- 694,700 ----
  
  /* End of DB List Management Function */
  
! /* Beginning of misc Functions */
  
  /* Perhaps add some test to this function to make sure that the stats we need are 
available */
  PGconn *
***************
*** 753,758 ****
--- 747,755 ----
        if (NULL == dbi->conn)
                return NULL;
  
+       if (args->debug >= 4)
+         log_entry(query);
+ 
        res = PQexec(dbi->conn, query);
  
        if (!res)
***************
*** 964,970 ****
        int                     j = 0,
                                loops = 0;
  
! /*    int numInserts, numDeletes, */
        int                     sleep_secs;
        Dllist     *db_list;
        Dlelem     *db_elem,
--- 961,967 ----
    int                 j = 0,
      loops = 0;
  
!   /*  int numInserts, numDeletes, */
    int                 sleep_secs;
    Dllist         *db_list;
    Dlelem         *db_elem,
***************
*** 1055,1061 ****
  
                                if (0 == xid_wraparound_check(dbs));
                                {
!                                       res = send_query(query_table_stats(dbs), dbs); 
         /* Get an updated
                                                                                       
                                                          * snapshot of this dbs
                                                                                       
                                                          * table stats */
                                        for (j = 0; j < PQntuples(res); j++)
--- 1052,1058 ----
  
              if (0 == xid_wraparound_check(dbs));
              {
!               res = send_query(TABLE_STATS_QUERY, dbs);               /* Get an 
updated
                                                                         * snapshot of 
this dbs
                                                                         * table stats 
*/
                for (j = 0; j < PQntuples(res); j++)
***************
*** 1087,1093 ****
                                                                 */
                                                                if 
((tbl->curr_vacuum_count - tbl->CountAtLastVacuum) >= tbl->vacuum_threshold)
                                                                {
!                                                                       snprintf(buf, 
sizeof(buf), "VACUUM ANALYZE \"%s\"", tbl->table_name);
                                                                        if 
(args->debug >= 1)
                                                                        {
                                                                                
sprintf(logbuffer, "Performing: %s", buf);
--- 1084,1094 ----
                             */
                            if ((tbl->curr_vacuum_count - tbl->CountAtLastVacuum) >= 
tbl->vacuum_threshold)
                              {
!                               /* if relisshared = t and database != template1 then 
only do an analyze */
!                               if((tbl->relisshared > 0) && 
(strcmp("template1",dbs->dbname)))
!                                 snprintf(buf, sizeof(buf), "ANALYZE %s", 
tbl->table_name);
!                               else    
!                                 snprintf(buf, sizeof(buf), "VACUUM ANALYZE %s", 
tbl->table_name);
                                if (args->debug >= 1)
                                  {
                                    sprintf(logbuffer, "Performing: %s", buf);
***************
*** 1101,1107 ****
                                                                }
                                                                else if 
((tbl->curr_analyze_count - tbl->CountAtLastAnalyze) >= tbl->analyze_threshold)
                                                                {
!                                                                       snprintf(buf, 
sizeof(buf), "ANALYZE \"%s\"", tbl->table_name);
                                                                        if 
(args->debug >= 1)
                                                                        {
                                                                                
sprintf(logbuffer, "Performing: %s", buf);
--- 1102,1108 ----
                              }
                            else if ((tbl->curr_analyze_count - 
tbl->CountAtLastAnalyze) >= tbl->analyze_threshold)
                              {
!                               snprintf(buf, sizeof(buf), "ANALYZE %s", 
tbl->table_name);
                                if (args->debug >= 1)
                                  {
                                    sprintf(logbuffer, "Performing: %s", buf);
Index: pg_autovacuum.h
===================================================================
RCS file: /projects/cvsroot/pgsql-server/contrib/pg_autovacuum/pg_autovacuum.h,v
retrieving revision 1.6
diff -c -b -r1.6 pg_autovacuum.h
*** pg_autovacuum.h     7 Aug 2003 21:11:57 -0000       1.6
--- pg_autovacuum.h     12 Sep 2003 22:45:37 -0000
***************
*** 34,41 ****
  #define VACUUM_ANALYZE                0
  #define ANALYZE_ONLY          1
  
! #define TABLE_STATS_ALL               "select 
a.relfilenode,a.relname,a.relnamespace,a.relpages,a.reltuples,b.schemaname,b.n_tup_ins,b.n_tup_upd,b.n_tup_del
 from pg_class a, pg_stat_all_tables b where a.relfilenode=b.relid"
! #define TABLE_STATS_USER      "select 
a.relfilenode,a.relname,a.relnamespace,a.relpages,a.reltuples,b.schemaname,b.n_tup_ins,b.n_tup_upd,b.n_tup_del
 from pg_class a, pg_stat_user_tables b where a.relfilenode=b.relid"
  #define FRONTEND
  #define PAGES_QUERY "select relfilenode,reltuples,relpages from pg_class where 
relfilenode=%i"
  #define FROZENOID_QUERY "select oid,age(datfrozenxid) from pg_database where datname 
= 'template1'"
--- 34,41 ----
  #define VACUUM_ANALYZE                0
  #define ANALYZE_ONLY          1
  
! #define TABLE_STATS_QUERY     "select 
a.relfilenode,a.relname,a.relnamespace,a.relpages,a.relisshared,a.reltuples,b.schemaname,b.n_tup_ins,b.n_tup_upd,b.n_tup_del
 from pg_class a, pg_stat_all_tables b where a.relfilenode=b.relid and a.relkind = 'r'"
! 
  #define FRONTEND
  #define PAGES_QUERY "select relfilenode,reltuples,relpages from pg_class where 
relfilenode=%i"
  #define FROZENOID_QUERY "select oid,age(datfrozenxid) from pg_database where datname 
= 'template1'"
***************
*** 86,91 ****
--- 86,92 ----
                           *table_name;
        int                     relfilenode,
                                reltuples,
+                               relisshared,
                                relpages;
        long            analyze_threshold,
                                vacuum_threshold;
***************
*** 132,138 ****
  static PGconn *db_connect(db_info * dbi);
  static void db_disconnect(db_info * dbi);
  static PGresult *send_query(const char *query, db_info * dbi);
- static char *query_table_stats(db_info * dbi);
  
  /* Other Generally needed Functions */
  static void daemonize(void);
--- 133,138 ----

-- 
let name="cbbrowne" and tld="libertyrms.info" in String.concat "@" [name;tld];;
<http://dev6.int.libertyrms.com/>
Christopher Browne
(416) 646 3304 x124 (land)

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
      subscribe-nomail command to [EMAIL PROTECTED] so that your
      message can get through to the mailing list cleanly

Reply via email to