Hi hackers, Attached is sketch of small patch that fixes several edge cases with autovacuum. Long story short autovacuum never comes to append only tables, killing large productions.
First case, mine. https://www.postgresql.org/message-id/CAC8Q8tLBeAxR%2BBXWuKK%2BHP5m8tEVYn270CVrDvKXt%3D0PkJTY9g%40mail.gmail.com We had a table we were appending and wanted Index Only Scan to work. For it to work, you need to call VACUUM manually, since VACUUM is the only way to mark pages all visible, and autovacuum never comes to append only tables. We were clever to invent a workflow without dead tuples and it painfully bit us. Second case, just read in the news. https://mailchimp.com/what-we-learned-from-the-recent-mandrill-outage/ Mandrill has 6TB append only table that autovacuum probably never vacuumed. Then anti-wraparound came and production went down. If autovacuum did its job before that last moment, it would probably be okay. Idea: look not on dead tuples, but on changes, just like ANALYZE does. It's my first patch on Postgres, it's probably all wrong but I hope it helps you get the idea. -- Darafei Praliaskouski Support me: http://patreon.com/komzpa
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c index fa875db816..e297fc8c4b 100644 --- a/src/backend/postmaster/autovacuum.c +++ b/src/backend/postmaster/autovacuum.c @@ -3055,7 +3055,7 @@ relation_needs_vacanalyze(Oid relid, if (PointerIsValid(tabentry) && AutoVacuumingActive()) { reltuples = classForm->reltuples; - vactuples = tabentry->n_dead_tuples; + vactuples = tabentry->changes_since_vacuum; anltuples = tabentry->changes_since_analyze; vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples; diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c index 2a8472b91a..cd4611d64f 100644 --- a/src/backend/postmaster/pgstat.c +++ b/src/backend/postmaster/pgstat.c @@ -4687,6 +4687,7 @@ pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry, Oid tableoid, bool create) result->n_live_tuples = 0; result->n_dead_tuples = 0; result->changes_since_analyze = 0; + result->changes_since_vacuum = 0; result->blocks_fetched = 0; result->blocks_hit = 0; result->vacuum_timestamp = 0; @@ -5817,6 +5818,7 @@ pgstat_recv_tabstat(PgStat_MsgTabstat *msg, int len) tabentry->n_live_tuples = tabmsg->t_counts.t_delta_live_tuples; tabentry->n_dead_tuples = tabmsg->t_counts.t_delta_dead_tuples; tabentry->changes_since_analyze = tabmsg->t_counts.t_changed_tuples; + tabentry->changes_since_vacuum = tabmsg->t_counts.t_changed_tuples; tabentry->blocks_fetched = tabmsg->t_counts.t_blocks_fetched; tabentry->blocks_hit = tabmsg->t_counts.t_blocks_hit; @@ -5850,6 +5852,7 @@ pgstat_recv_tabstat(PgStat_MsgTabstat *msg, int len) tabentry->n_live_tuples += tabmsg->t_counts.t_delta_live_tuples; tabentry->n_dead_tuples += tabmsg->t_counts.t_delta_dead_tuples; tabentry->changes_since_analyze += tabmsg->t_counts.t_changed_tuples; + tabentry->changes_since_vacuum += tabmsg->t_counts.t_changed_tuples; tabentry->blocks_fetched += tabmsg->t_counts.t_blocks_fetched; tabentry->blocks_hit += tabmsg->t_counts.t_blocks_hit; } @@ -6083,6 +6086,7 @@ pgstat_recv_vacuum(PgStat_MsgVacuum *msg, int len) tabentry->n_live_tuples = msg->m_live_tuples; tabentry->n_dead_tuples = msg->m_dead_tuples; + tabentry->changes_since_vacuum = 0; if (msg->m_autovacuum) { diff --git a/src/include/pgstat.h b/src/include/pgstat.h index c080fa6388..65e38edd8d 100644 --- a/src/include/pgstat.h +++ b/src/include/pgstat.h @@ -643,6 +643,7 @@ typedef struct PgStat_StatTabEntry PgStat_Counter n_live_tuples; PgStat_Counter n_dead_tuples; PgStat_Counter changes_since_analyze; + PgStat_Counter changes_since_vacuum; PgStat_Counter blocks_fetched; PgStat_Counter blocks_hit;