Tom Lane wrote:
Alvaro Herrera <[EMAIL PROTECTED]> writes:
Matthew T. O'Connor wrote:
I'm not sure it's a good idea to tie this to the vacuum cost delay settings either, so let me ask you this: how is this better than just allowing the admin to set a new GUC variable like autovacuum_hot_table_size_threshold (or something shorter), to which we can assign a decent default of, say, 8MB?

Yeah, maybe that's better -- it's certainly simpler.

I'm not liking any of these very much, as they seem critically dependent
on impossible-to-tune parameters.  I think it'd be better to design this
around having the first worker explicitly expose its state (list of
tables to process, in order) and having subsequent workers key off that
info.  The shared memory state could include the OID of the table each
worker is currently working on, and we could keep the to-do list in some
simple flat file for instance (since we don't care about crash safety).
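
A minimal sketch of the state described above, written in Python for brevity rather than as backend C. The file location, the function names, and the dictionary standing in for the shared memory area are all assumptions made for illustration; nothing here is an existing autovacuum interface.

```python
import os
import tempfile

# Hypothetical location for the to-do list; crash safety is not a goal,
# so a plain file in a temp directory is enough for this sketch.
TODO_PATH = os.path.join(tempfile.gettempdir(), "autovacuum_todo.txt")

def write_todo_list(table_oids, path=TODO_PATH):
    """First worker: publish its ordered to-do list, one table OID per line."""
    with open(path, "w") as f:
        for oid in table_oids:
            f.write(f"{oid}\n")

def read_todo_list(path=TODO_PATH):
    """Later workers: read the published list back, in the same order."""
    with open(path) as f:
        return [int(line) for line in f if line.strip()]

# Stand-in for the shared memory state: worker id -> OID of the table
# that worker is currently vacuuming.
current_table = {}

def tables_in_progress(exclude_worker=None):
    """OIDs currently being worked on, optionally ignoring one worker."""
    return {oid for worker, oid in current_table.items()
            if worker != exclude_worker}
```

How a later worker should act on this information is left open here, since that is the part still under discussion.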

So far we are only talking about one parameter, the hot_table_size_threshold, which I agree would be a guess by an admin, but if we went in this direction, I would also advocate adding a column to the pg_autovacuum table that allows an admin to explicitly define a table as hot or not.
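
As a rough illustration of how the threshold and a per-table override could combine, here is a sketch in Python. It assumes that "hot" means a table at or below the size threshold, and that the proposed pg_autovacuum column is a nullable boolean; both are assumptions for the example, not settled design.

```python
from typing import Optional

# Default suggested in this thread: 8MB, expressed in bytes.
HOT_TABLE_SIZE_THRESHOLD = 8 * 1024 * 1024

def is_hot_table(size_bytes: int,
                 admin_override: Optional[bool] = None,
                 threshold: int = HOT_TABLE_SIZE_THRESHOLD) -> bool:
    """Decide whether a table counts as hot.

    admin_override models the hypothetical pg_autovacuum column:
    True or False forces the answer, None falls back to the size test.
    """
    if admin_override is not None:
        return admin_override
    # Assumption: small tables are the hot ones.
    return size_bytes <= threshold
```

With this shape, the threshold only matters for tables the admin has not classified explicitly.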

Also, I think each worker should be mostly independent, the only caveat being that (assuming each worker works in size order) if we catch up to an older worker (that is, we get to the table it is currently working on) we exit. Personally I think this is all we need, but others felt the additional threshold was needed. What do you think? Or what do you think might be better?
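
The catch-up rule can be sketched as follows, in Python for brevity. The sketch assumes "size order" means smallest table first, and the names run_worker, older_workers_current, and vacuum are hypothetical.

```python
def run_worker(tables, older_workers_current, vacuum):
    """Process tables in ascending size order; stop on catching up.

    tables: iterable of (oid, size_bytes) pairs.
    older_workers_current: set of OIDs that older workers are on right now.
    vacuum: callable invoked with each table OID that gets processed.
    Returns the list of OIDs this worker vacuumed before exiting.
    """
    done = []
    for oid, _size in sorted(tables, key=lambda t: t[1]):
        if oid in older_workers_current:
            # We have caught up to an older worker, so this worker exits.
            break
        vacuum(oid)
        done.append(oid)
    return done
```

For example, with tables of sizes 100, 200, 300, and 400 and an older worker busy on the 200-sized table, the new worker vacuums only the 100-sized table and then exits.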

I'm not certain exactly what "key off" needs to mean; perhaps each
worker should make its own to-do list and then discard items that are
either in-progress or recently done by another worker when it gets to
them.
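
One way to read that, sketched in Python: the worker walks its own list and re-checks the shared state only when it reaches each item. The 300-second "recently done" window and all names here are assumptions for illustration, and the check-then-add on in_progress is not atomic in this simplified form.

```python
import time

# Hypothetical window for what counts as "recently done", in seconds.
RECENTLY_DONE_SECONDS = 300

def process_todo(todo, in_progress, last_done, vacuum,
                 now=time.time, recent=RECENTLY_DONE_SECONDS):
    """Work through this worker's own to-do list.

    todo: ordered list of table OIDs built by this worker.
    in_progress: shared set of OIDs being vacuumed by any worker.
    last_done: shared dict mapping OID -> time its last vacuum finished.
    Items are checked when reached, not when the list is built.
    """
    vacuumed = []
    for oid in todo:
        if oid in in_progress:
            continue  # another worker is on it right now
        finished = last_done.get(oid)
        if finished is not None and now() - finished < recent:
            continue  # another worker finished it recently
        in_progress.add(oid)
        try:
            vacuum(oid)
        finally:
            in_progress.discard(oid)
        last_done[oid] = now()
        vacuumed.append(oid)
    return vacuumed
```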

My initial design didn't have any threshold at all, but others felt this would/could result in too many workers working concurrently in the same DB.

I think an absolute minimum requirement for a sane design is that no two
workers ever try to vacuum the same table concurrently, and I don't see
where that behavior will emerge from your proposal; whereas it's fairly
easy to make it happen if non-first workers pay attention to what other
workers are doing.
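
To make the requirement concrete, here is a small Python sketch of an atomic claim on a table. The ClaimTable class is hypothetical, and threading.Lock merely stands in for whatever interlock would protect the real shared memory state.

```python
import threading

class ClaimTable:
    """Tracks which worker, if any, currently owns each table."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owner = {}  # table OID -> worker id

    def try_claim(self, oid, worker_id):
        """Claim a table; return False if another worker already holds it."""
        with self._lock:
            # Check and set happen under one lock, so two workers can
            # never both see the table as free.
            if oid in self._owner:
                return False
            self._owner[oid] = worker_id
            return True

    def release(self, oid, worker_id):
        """Release a table, but only if this worker is the one holding it."""
        with self._lock:
            if self._owner.get(oid) == worker_id:
                del self._owner[oid]
```

A worker that fails to claim a table would simply move on to its next candidate.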

Maybe we never made that clear; I was always working on the assumption that two workers would never try to work on the same table at the same time.

BTW, it's probably necessary to treat shared catalogs specially ...

Certainly.
