Hi Tom,

Thanks for the suggestion. It's been a while since I replied to this as I 
had to go and do some further investigation of the docs with regards the 
autovacuum daemons configuration. According to the documentation, the 
formula's for the vacuum and analyze are as follows:

Vacuum
    vacuum threshold = vacuum base threshold + vacuum scale factor * number 
of tuples
Analyze
    analyze threshold = analyze base threshold + analyze scale factor * 
number of tuples

My current settings for autovacuum are as follows:

# - Cost-Based Vacuum Delay -

vacuum_cost_delay         = 200                 # 0-1000 milliseconds
vacuum_cost_page_hit    = 1                     # 0-10000 credits
vacuum_cost_page_miss = 10                   # 0-10000 credits
vacuum_cost_page_dirty = 20                   # 0-10000 credits
vacuum_cost_limit           = 200                 # 0-10000 credits

#---------------------------------------------------------------------------
# AUTOVACUUM PARAMETERS
#---------------------------------------------------------------------------

autovacuum                                   = on                      # 
enable autovacuum subprocess?
                                                                             
      # 'on' requires stats_start_collector
                                                                             
      # and stats_row_level to also be on
autovacuum_naptime                     = 1min                   # time 
between autovacuum runs
autovacuum_vacuum_threshold     = 500                     # min # of tuple 
updates before
                                                                             
      # vacuum
autovacuum_analyze_threshold      = 250                    # min # of tuple 
updates before
                                                                             
      # analyze
autovacuum_vacuum_scale_factor = 0.2                     # fraction of rel 
size before
                                                                             
      # vacuum
autovacuum_analyze_scale_factor = 0.1                     # fraction of rel 
size before
                                                                             
      # analyze
autovacuum_freeze_max_age       = 200000000         # maximum XID age before 
forced vacuum
                                                                             
      # (change requires restart)
autovacuum_vacuum_cost_delay  = -1                       # default vacuum 
cost delay for
                                                                             
      # autovacuum, -1 means use
                                                                             
      # vacuum_cost_delay
autovacuum_vacuum_cost_limit    = -1                       # default vacuum 
cost limit for
                                                                             
     # autovacuum, -1 means use
                                                                             
     # vacuum_cost_limit

Thus to make the autovacuum more aggressive I am thinking along the lines of 
changing the following parameters:

autovacuum_vacuum_threshold     = 250
autovacuum_analyze_threshold     = 125

The documentation also mentions that when the autovacuum runs it selects a 
single database to process on that run. This means that the particular table 
that we are interrested in will only be vacuumed once every 17 minutes, 
assuming we have 18 databases and the selection process is sequential 
through the database list.

>From my understanding of the documentation, the only way to work around this 
issue is to manually update the system catalog table pg_autovacuum and set 
the pg_autovacuum.enabled field to false to skip the autovacuum on tables 
that dont require such frequent vacuums. If I do enable this feature, and I 
manually run a vacuumdb from the command line against that particular 
disabled table, will the vacuum still process the table? I'm assuming too, 
that the best tables to disable autovacuum on will be ones with a minimal 
amount of update/delete queries run against it. For example, if we have a 
table that only has inserts applied to it, it is safe to assume that that 
table can safely be ignored by autovacuum.

Do you have any other suggestions as to which tables generally can be 
excluded from the autovacuum based on the usage patterns?
Can you see anything with respect to my new autovacuum parameters that may 
cause issue's and are there any other parameters that you suggest I need to 
change to make the autovacuum daemon more aggressive?

PS: Currently we have the Cluster command running on the sipaccounts table 
as the vacuum full is taking too long. It would be nice though to have some 
piece of mind that the cluster command is mvcc safe, as Heikki and Aidan 
have mentioned that it is not and may break things in our environment, I'm a 
little afraid of running with the cluster command, and should possibly go 
back to the vacuum full :/

Thanks all for any and all suggestions/comments.

Thanks
Bruce


"Tom Lane" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> "Bruce McAlister" <[EMAIL PROTECTED]> writes:
>> [1] AutoVacuum runs during the day over the entire PostgreSQL cluster,
>
> Good, but evidently you need to make it more aggressive.
>
>> [2] A Vacuum Full Verbose is run during our least busy period (generally
>> 03:30) against the Database,
>
>> [3] A Re-Index on the table is performed,
>
>> [4] A Cluster on the table is performed against the most used index,
>
>> [5] A Vacuum Analyze Verbose is run against the database.
>
> That is enormous overkill.  Steps 2 and 3 are a 100% waste of time if
> you are going to cluster in step 4.  Just do the CLUSTER and then
> ANALYZE (or VACUUM ANALYZE if you really must, but the value is marginal).
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>               http://www.postgresql.org/docs/faq
> 



---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

Reply via email to