Hello Russell,

Russell Smith wrote:

I am doing serious thinking about the implementation of Auto Vacuum as part of 
the backend, not using libpq, but calling internal functions directly.
It appears to me that calling internal functions directly is a better 
implementation than using the external library to do the job.



We are planning to move it into the backend (no longer an external libpq-based contrib module). I tried to do this for 8.0, but it didn't make the cut, so I expect this work will be done for 8.1.

I know I might be stepping on Matthew's toes, but I don't really want to.  I am 
a complete newbie to the postgresql code, however I am trying.
Vacuum appears to be one of the bigger sore points, with administrators having to 
configure it via scheduled tasks.



Agreed, that is one of the reasons I took on autovacuum. I think it is something a lot of admins would like PG to do for itself.

The major autovacuum issues

1. Transaction Wraparound
2. Vacuum of relations
3. Tracking of when to do vacuums
4. Where to store information needed by auto vacuum

1. Transaction Wraparound



This is handled by the current autovacuum using the process outlined in:
http://www.postgresql.org/docs/7.4/static/maintenance.html
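
The check behind that procedure comes down to comparing each database's transaction-ID age against a safety limit. Below is a minimal sketch in Python, purely for illustration (the backend itself is C). The function names and the 1.5 billion default limit are assumptions made up for this example, not values taken from pg_autovacuum, and the reserved special XIDs are ignored for simplicity.

```python
XID_SPACE = 2 ** 32  # transaction IDs are 32-bit counters that wrap around


def xid_age(current_xid: int, frozen_xid: int) -> int:
    """Number of transactions elapsed since frozen_xid, allowing for wraparound."""
    return (current_xid - frozen_xid) % XID_SPACE


def needs_wraparound_vacuum(current_xid: int, frozen_xid: int,
                            limit: int = 1_500_000_000) -> bool:
    """True once the age reaches the (hypothetical) safety limit."""
    return xid_age(current_xid, frozen_xid) >= limit
```

The modulo keeps the age meaningful even after the counter itself has wrapped past zero, which is the situation the database-wide vacuum is meant to prevent from ever becoming a problem.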

2. Vacuuming of relations

Currently, the entire heap must be vacuumed at one time. It would possibly be desirable to have only part of the relation vacuumed at
a time, if you can find out which parts of the relation have the most slack space. There is a TODO item regarding tracking recent deletions
so they can be reused. Some form of this would be helpful to work out what to vacuum. Performance issues for this type of activity may be a concern, but I have no experience to be able to comment on them, so I welcome yours.




This is not really an autovacuum-related topic. If at some point someone adds the ability for VACUUM to do partial vacuums, then autovacuum will make use of it. BTW, this has been suggested several times, so please search the archives for details.

3. Tracking of when to vacuum

Current autovacuum relies on the stats collector running. I would like to only use the stats if they are available,
and have an option to be able to vacuum accurately without having to have stats running.


I think it is universally agreed that using data from the FSM is a better solution, since it would not require you to have the stats system running and actually gives you a very accurate picture of which tables have slack space to recover (assuming that the FSM is large enough). This is a topic on which I need help from some more enlightened core hackers.

4. Where to store information required by auto vacuum.


The backend integration patch that I submitted a few months ago added a new pg_autovacuum table to the system catalogues. This table stored data that pg_autovacuum needed to persist across backend restarts, and also allowed the user to set per-table settings for thresholds etc. I never heard anyone complain about this design, so from the silence I assume this is an acceptable way of maintaining pg_autovacuum-related data.
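
To make the per-table settings idea concrete, here is a rough sketch in Python (illustration only, not backend code). It assumes a threshold of the form base + scale factor x row count; the field names and default values are hypothetical and are not the columns of the submitted patch.

```python
from dataclasses import dataclass


@dataclass
class TableVacuumSettings:
    """Hypothetical per-table row; field names and defaults are illustrative."""
    table_name: str
    base_threshold: int = 1000   # placeholder default
    scale_factor: float = 2.0    # placeholder default
    enabled: bool = True


def vacuum_threshold(settings: TableVacuumSettings, reltuples: int) -> float:
    """Changed-row count at which the table becomes due for a vacuum."""
    return settings.base_threshold + settings.scale_factor * reltuples


def should_vacuum(settings: TableVacuumSettings, reltuples: int,
                  changed_since_vacuum: int) -> bool:
    """Compare activity since the last vacuum against the table's threshold."""
    if not settings.enabled:
        return False
    return changed_since_vacuum >= vacuum_threshold(settings, reltuples)
```

Keeping the settings per table means a small, hot table and a large, mostly static one can each get a threshold that suits it, and the values survive restarts because they live in a catalogue table rather than in process memory.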

Matthew
