Steve Crawford wrote:

On Monday 21 March 2005 11:40 am, Tom Lane wrote:


"Matthew T. O'Connor" <matthew@zeut.net> writes:


I believe this discrepancy has to do with the fact that ANALYZE
can return some very bogus values for reltuples, whereas VACUUM
always returns an accurate count. I'm not sure how best to
handle this.


I think 8.0's ANALYZE will do a better estimation job ... at least,
Manfred Koizar rewrote the sampling algorithm in hopes of making it
more robust.

However, given that there are 9334 tuples in 82282 pages, I'd say
that autovacuum has already failed Steve rather badly :-(. There
shouldn't be more than a couple hundred pages given that number of
rows. Perhaps the FSM settings are too small?



Results time. FSM settings were too small, but the real problem seems to be that pg_autovacuum isn't getting the job done.



Yeah, it certainly looks that way...

I ran VACUUM VERBOSE and set postgresql.conf as follows (roughly 2.5 times the amounts the VACUUM VERBOSE output suggested):
max_fsm_pages = 500000
max_fsm_relations = 2500
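
For anyone checking their own settings: a database-wide VACUUM VERBOSE ends with a free space map summary roughly like the following (format approximate, numbers invented for illustration):

INFO:  free space map: 150 relations, 180000 pages stored; 200000 total pages needed
DETAIL:  Allocated FSM size: 1000 relations + 20000 pages = 182 kB shared memory.

If "total pages needed" exceeds max_fsm_pages, the FSM is too small to track all the reusable space.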


I restarted the server last Friday night and manually ran a VACUUM FULL and REINDEX on each user and system table in every database on Monday. The zzz.pg_attribute table's file size dropped from over 600MB to less than 2MB.
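
In SQL terms, that was essentially the following for each table (shown here for pg_attribute):

VACUUM FULL pg_catalog.pg_attribute;
REINDEX TABLE pg_catalog.pg_attribute;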



Wow, that is some serious bloat.

It's now three days later and that table has already grown to 29MB. Processes accessing this database do create many temporary tables, so the exact row count in pg_attribute varies a bit from time to time but always hovers around 9500. I just did a manual VACUUM FULL on pg_attribute and it's back to 1.3MB.



Depending on several factors, the "steady state" size of pg_attribute may be several times larger than its size right after a VACUUM FULL. The problem is not that it may be 29MB, but rather that it continues to grow.


Upon completion of the vacuum, I restarted the pg_autovacuum daemon. Following are the autovacuum log entries related to zzz.pg_attribute. Note the growth of reltuples from 9532.000000 to 184720.000000 in that time (what is a millionth of a tuple, anyway?) and the lack of any vacuums performed.



I assume that pg_attribute didn't actually grow to 184720 tuples?

28 12:12 PM]   table name: zzz."pg_catalog"."pg_attribute"
28 12:12 PM]      relid: 1249;   relisshared: 0
28 12:12 PM]      reltuples: 9532.000000;  relpages: 157
28 12:12 PM]      curr_analyze_count: 176294; curr_vacuum_count: 15447
28 12:12 PM]      last_analyze_count: 176294; last_vacuum_count: 15447
28 12:12 PM]      analyze_threshold: 10032; vacuum_threshold: 20064
28 12:12 PM] added table: zzz."pg_catalog"."pg_attribute"
--
28 03:42 PM] Performing: ANALYZE "pg_catalog"."pg_attribute"
28 03:42 PM]   table name: zzz."pg_catalog"."pg_attribute"
28 03:42 PM]      relid: 1249;   relisshared: 0
28 03:42 PM]      reltuples: 20270.000000;  relpages: 336
28 03:42 PM]      curr_analyze_count: 186374; curr_vacuum_count: 16329
28 03:42 PM]      last_analyze_count: 186374; last_vacuum_count: 15447
28 03:42 PM]      analyze_threshold: 20770; vacuum_threshold: 20064
--
28 10:59 PM] Performing: ANALYZE "pg_catalog"."pg_attribute"
28 10:59 PM]   table name: zzz."pg_catalog"."pg_attribute"
28 10:59 PM]      relid: 1249;   relisshared: 0
28 10:59 PM]      reltuples: 42591.000000;  relpages: 706
28 10:59 PM]      curr_analyze_count: 207254; curr_vacuum_count: 18156
28 10:59 PM]      last_analyze_count: 207254; last_vacuum_count: 15447
28 10:59 PM]      analyze_threshold: 43091; vacuum_threshold: 20064
--
29 02:03 PM] Performing: ANALYZE "pg_catalog"."pg_attribute"
29 02:03 PM]   table name: zzz."pg_catalog"."pg_attribute"
29 02:03 PM]      relid: 1249;   relisshared: 0
29 02:03 PM]      reltuples: 89464.000000;  relpages: 1483
29 02:03 PM]      curr_analyze_count: 250664; curr_vacuum_count: 21999
29 02:03 PM]      last_analyze_count: 250664; last_vacuum_count: 15447
29 02:03 PM]      analyze_threshold: 89964; vacuum_threshold: 20064
--
30 09:20 PM] Performing: ANALYZE "pg_catalog"."pg_attribute"
30 09:20 PM]   table name: zzz."pg_catalog"."pg_attribute"
30 09:20 PM]      relid: 1249;   relisshared: 0
30 09:20 PM]      reltuples: 184720.000000;  relpages: 3062
30 09:20 PM]      curr_analyze_count: 340791; curr_vacuum_count: 29886
30 09:20 PM]      last_analyze_count: 340791; last_vacuum_count: 15447
30 09:20 PM]      analyze_threshold: 185220; vacuum_threshold: 20064

Should I abandon pg_autovacuum and just do periodic VACUUM ANALYZEs of everything? Should I use settings other than the defaults for pg_autovacuum? Are temporary tables evil and their use to be avoided?

Just checked and zzz.pg_attribute is 50% larger than it was when I did the VACUUM FULL at the start of this email.


I don't think you should abandon pg_autovacuum, but I do think you should run periodic vacuum commands from cron until this is resolved.
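
For example, something like this in the postgres user's crontab (the schedule and database name here are just illustrative):

0 * * * *  psql -d zzz -c 'VACUUM ANALYZE pg_catalog.pg_attribute'

or a nightly "vacuumdb --all --analyze" to cover everything.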


The first thing you should do is change the pg_autovacuum settings. The default values are far too conservative for a lot of people. Most people seem to have success with settings like:
pg_autovacuum -v 300 -V 0.1 -a 200 -A 0.1
Give that a try and see if it helps.
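
To put numbers on that: pg_autovacuum vacuums a table once it has counted vacuum_base (-v) + vacuum_scaling_factor (-V) * reltuples updates and deletes since the last vacuum; the -a/-A pair works the same way for ANALYZE. With the defaults (-v 1000 -V 2) and your 9532 reltuples, that's 1000 + 2 * 9532 = 20064, exactly the vacuum_threshold in your log. With -v 300 -V 0.1 it drops to 300 + 0.1 * 9532 ~= 1253, so pg_attribute would get vacuumed roughly 16 times as often.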


From pg_autovacuum's perspective, between 3/28 12:12 PM and 3/30 09:20 PM there were 29886 - 15447 = 14439 delete + update commands against the pg_attribute table. Once it sees 20064 deletes + updates, it will perform a vacuum. If there have actually been more during that time, we need to figure out why pg_autovacuum is not seeing them.
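
One way to sanity-check that: pg_autovacuum gets those counts from the stats collector, so with row-level stats enabled (stats_row_level = on), a query like this run in the zzz database shows the raw activity it is working from:

SELECT n_tup_ins, n_tup_upd, n_tup_del
FROM pg_stat_sys_tables
WHERE relname = 'pg_attribute';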

The thing I don't understand in your numbers is why reltuples is constantly increasing. When pg_autovacuum is reporting a reltuples of 184720, how many tuples does pg_attribute really have? This might be related to the new reltuples estimation code that Tom was talking about recently; if so, 8.0.2 might help.

Anyone else have some insight here?

Matt

