On 26.9.2012 18:29, Tom Lane wrote:
> Alvaro Herrera <alvhe...@2ndquadrant.com> writes:
>> Excerpts from Euler Taveira's message of mié sep 26 11:53:27 -0300 2012:
>>> On 26-09-2012 09:43, Tomas Vondra wrote:
>>>> 5) splitting the single stat file into multiple pieces - e.g. per database,
>>>> written separately, so that the autovacuum workers don't need to read all
>>>> the data even for databases that don't need to be vacuumed. This might be
>>>> combined with (4).
>
>>> IMHO that's the definitive solution. It would be one file per database
>>> plus a global one. That way, the check would only read the global.stat
>>> and process those databases that were modified. Also, an in-memory map
>>> could store that information to speed up the checks.
>
>> +1
>
> That would help for the case of hundreds of databases, but how much
> does it help for lots of tables in a single database?
Well, it wouldn't, but it wouldn't make it worse either. Or at least
that's how I understand it.

> I'm a bit suspicious of the idea that we should encourage people to use
> hundreds of databases per installation anyway: the duplicated system
> catalogs are going to be mighty expensive, both in disk space and in
> their cache footprint in shared buffers. There was some speculation
> at the last PGCon about how we might avoid the duplication, but I think
> we're years away from any such thing actually happening.

You don't need to encourage us to do that ;-) We know it's not perfect,
and we're considering a better alternative - e.g. a handful of databases
(~10) with schemas inside them, replacing the current database-only
approach. That way we'd still get multiple stat files (and thus the
benefits of the split), but with far less overhead, since the schemas
within each database share its catalogs.

And yes, using tens of thousands of tables (serving as "caches") for a
reporting solution is "interesting" (as in the old Chinese curse) too.

> What seems to me like it could help more is fixing things so that the
> autovac launcher needn't even launch a child process for databases that
> haven't had any updates lately. I'm not sure how to do that, but it
> probably involves getting the stats collector to produce some kind of
> summary file.

Yes, I proposed something like this in my original mail - setting a
"dirty" flag on an object (a database in this case) whenever a table in
it becomes eligible for vacuum/analyze. A sketch of that idea follows
below.

Tomas
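To make that concrete, here is a minimal standalone C sketch of the
summary-file idea: the collector keeps one dirty flag per database in a
small global summary, sets it whenever a table in that database becomes
eligible for vacuum/analyze, and the launcher then skips clean databases
without ever reading their per-database stat files. All names here
(DbSummaryEntry, mark_database_dirty, launcher_cycle) are hypothetical
illustrations, not actual PostgreSQL code:

/*
 * Minimal sketch of the "global summary + dirty flag" idea discussed
 * above.  All names are hypothetical; this is not PostgreSQL code.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t Oid;

/* One record per database in the hypothetical global summary file. */
typedef struct DbSummaryEntry
{
    Oid  databaseid;  /* database this entry describes */
    bool dirty;       /* does any table here need vacuum/analyze? */
} DbSummaryEntry;

/*
 * Collector side: when a table crosses the autovacuum threshold, flag
 * its database as dirty in the summary.
 */
static void
mark_database_dirty(DbSummaryEntry *summary, int nentries, Oid databaseid)
{
    for (int i = 0; i < nentries; i++)
    {
        if (summary[i].databaseid == databaseid)
        {
            summary[i].dirty = true;
            return;
        }
    }
}

/*
 * Launcher side: walk the summary and dispatch a worker (here, just a
 * printf) only for dirty databases, skipping the rest without reading
 * their per-database stat files at all.
 */
static void
launcher_cycle(DbSummaryEntry *summary, int nentries)
{
    for (int i = 0; i < nentries; i++)
    {
        if (!summary[i].dirty)
            continue;  /* clean database - skip it entirely */

        printf("launching autovacuum worker for db %u\n",
               (unsigned) summary[i].databaseid);
        summary[i].dirty = false;  /* reset once a worker is dispatched */
    }
}

int
main(void)
{
    DbSummaryEntry summary[] = {
        {16384, false}, {16385, false}, {16386, false}
    };

    /* Pretend a table in db 16385 just became eligible for vacuum. */
    mark_database_dirty(summary, 3, 16385);

    /* Only db 16385 gets a worker; the other two are never touched. */
    launcher_cycle(summary, 3);
    return 0;
}

In a real implementation the summary would presumably be written out by
the stats collector alongside the per-database stat files, so the
launcher only has to read that one small file per cycle instead of the
whole statistics dump.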