First, thanks for working on this. I hope to be helpful with the design
discussion and possibly some coding if I can find the time.
My initial reaction to this proposal is that it seems overly complex,
however I don't see a more elegant solution. I'm a bit concerned that
most users won't figure out all the knobs.
Alvaro Herrera wrote:
I've been thinking how to improve autovacuum so that we can convince
more people that it can be enabled by default.
I would like to see it enabled by default too, however the reason it
isn't already enabled by default is that it caused failures in the
regression test when we tried to turn it on during the 8.2 dev cycle and
it was too close to beta to fix everything. All this new machinery is
great, but it doesn't address that problem.
Here are my thoughts.
There are two areas of improvements:
1. scheduling, and
2. process handling, i.e., how to have multiple vacuum processes running
at any time.
Fail enough, but I would say the two biggest area for improvement are
scheduling and preventing "HOT" tables from becoming vacuum starved
(essentially what you said, but with a different emphasis).
[snip]
Process Handling
================
My idea here is to morph the current autovacuum daemon from an agent
that itself runs a vacuum command, into something that launches other
processes to run those vacuum commands. I'll call this "the autovacuum
launcher process", or the launcher for short. The idea here is that the
launcher can take care of the scheduling while the worker processes do
their work. If the launcher then determines that a particular instant
there should be two vacuums running, then it simply starts two worker
processes.
How about calling it the autovacuum_master process?
[snip autovacuum launcher process description]
That all sounds reasonable to me. I think the harder part is what you
are getting at below (how to get the launcher to figure out what to
vacuum when).
Scheduling
==========
We introduce the following concepts:
1. table groups. We'll have a system catalog for storing OID and group
name, and another catalog for membership, linking relid to group OID.
pg_av_tablegroup
tgrname name
pg_av_tgroupmembers
groupid oid
relid oid
>
2. interval groups. We'll have a catalog for storing igroup name and
OID, and another catalog for membership. We identify an interval by:
- month of year
- day of month
- day of week
- start time of day
- end time of day
This is modelled after crontabs.
pg_av_intervalgroup
igrname name
pg_av_igroupmembers
groupid oid
month int
dom int
dow int
starttime timetz
endtime timetz
This seems to assume that the start and end time for an interval will be
on the same day, you probably need to specify a start month, dom, dow,
time and an end month, dom, dow and time.
Since this is modeled after cron, do we allow wild-cards, or any of the
other cron tricks like */20 or 1-3,5,7,9-11?
Also your notation above is ambiguous, it took me a while to realize
that pg_av_igroupmembers.groupid wasn't referencing the id from
pg_av_tablegroup.
Additionally, we'll have another catalog on which we'll store table
groups to interval groups relationships. On that catalog we'll also
store those autovacuum settings that we want to be able to override:
whether to disable it for this interval group, or the values for the
vacuum/analyze equations.
pg_av_schedule
tgroup oid
igroup oid
enabled bool
queue int
vac_base_thresh int
vac_scale_factor float
anl_base_thresh int
anl_scal_factor float
vac_cost_delay int
vac_cost_limit int
freeze_min_age int
freeze_max_age int
What is queue for?
So the scheduler, at startup, loads the whole schedule in memory, and
then wakes up at reasonable intervals and checks whether these equations
hold for some of the tables it's monitoring. If they do, then launch a
new worker process to do the job.
We need a mechanism for having the scheduler rescan the schedule when a
user modifies the catalog -- maybe having a trigger that sends a signal
to the process is good enough (implementation detail: the signal must be
routed via the postmaster, since the backend cannot hope to know the
scheduler's PID. This is easy enough to do.)
This all looks reasonable if not a bit complex. Question, what happens
to the current pg_autovacuum relation?
Also what about system defaults, will we have a hard coded default
interval of always on, and one default table group that contains all the
tables with one default entry in pg_av_schedule?
I think we need more discussion on scheduling, we need to make sure this
solves the vacuum starvation problem. Does the launcher process
consider each row in pg_av_schedule that applies at the current time
separately? That is say there are three entries in pg_av_schedule that
apply right now, does that mean that the launcher can fire off three
different vacuums? Perhaps we need to add a column to pg_av_tablegroup
that specifies the max number of concurrent worker processes for this
table group.
Also, I don't think we need the concept of queues as described in recent
threads. I think the idea of the queues was the the system would be
able to automatically find small tables and vacuum them frequently, in
this proposal the admin would have to create a group for small tables
and manually add tables to the group and make sure that there are enough
worker processes for that group to prevent vacuum starvation. Perhaps
we can create a dynamic group that includes all tables with less than a
certain number of rows or blocks?
Thanks for working on this!
Matt O'Connor
---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly