Hi hackers,

Attached is sketch of small patch that fixes several edge cases with
autovacuum. Long story short autovacuum never comes to append only tables,
killing large productions.

First case, mine.

https://www.postgresql.org/message-id/CAC8Q8tLBeAxR%2BBXWuKK%2BHP5m8tEVYn270CVrDvKXt%3D0PkJTY9g%40mail.gmail.com

We had a table we were appending and wanted Index Only Scan to work. For it
to work, you need to call VACUUM manually, since VACUUM is the only way to
mark pages all visible, and autovacuum never comes to append only tables.
We were clever to invent a workflow without dead tuples and it painfully
bit us.

Second case, just read in the news.
https://mailchimp.com/what-we-learned-from-the-recent-mandrill-outage/

Mandrill has 6TB append only table that autovacuum probably never vacuumed.
Then anti-wraparound came and production went down. If autovacuum did its
job before that last moment, it would probably be okay.

Idea: look not on dead tuples, but on changes, just like ANALYZE does.
It's my first patch on Postgres, it's probably all wrong but I hope it
helps you get the idea.
-- 
Darafei Praliaskouski
Support me: http://patreon.com/komzpa
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index fa875db816..e297fc8c4b 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3055,7 +3055,7 @@ relation_needs_vacanalyze(Oid relid,
 	if (PointerIsValid(tabentry) && AutoVacuumingActive())
 	{
 		reltuples = classForm->reltuples;
-		vactuples = tabentry->n_dead_tuples;
+		vactuples = tabentry->changes_since_vacuum;
 		anltuples = tabentry->changes_since_analyze;
 
 		vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples;
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 2a8472b91a..cd4611d64f 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -4687,6 +4687,7 @@ pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry, Oid tableoid, bool create)
 		result->n_live_tuples = 0;
 		result->n_dead_tuples = 0;
 		result->changes_since_analyze = 0;
+		result->changes_since_vacuum = 0;
 		result->blocks_fetched = 0;
 		result->blocks_hit = 0;
 		result->vacuum_timestamp = 0;
@@ -5817,6 +5818,7 @@ pgstat_recv_tabstat(PgStat_MsgTabstat *msg, int len)
 			tabentry->n_live_tuples = tabmsg->t_counts.t_delta_live_tuples;
 			tabentry->n_dead_tuples = tabmsg->t_counts.t_delta_dead_tuples;
 			tabentry->changes_since_analyze = tabmsg->t_counts.t_changed_tuples;
+			tabentry->changes_since_vacuum = tabmsg->t_counts.t_changed_tuples;
 			tabentry->blocks_fetched = tabmsg->t_counts.t_blocks_fetched;
 			tabentry->blocks_hit = tabmsg->t_counts.t_blocks_hit;
 
@@ -5850,6 +5852,7 @@ pgstat_recv_tabstat(PgStat_MsgTabstat *msg, int len)
 			tabentry->n_live_tuples += tabmsg->t_counts.t_delta_live_tuples;
 			tabentry->n_dead_tuples += tabmsg->t_counts.t_delta_dead_tuples;
 			tabentry->changes_since_analyze += tabmsg->t_counts.t_changed_tuples;
+			tabentry->changes_since_vacuum += tabmsg->t_counts.t_changed_tuples;
 			tabentry->blocks_fetched += tabmsg->t_counts.t_blocks_fetched;
 			tabentry->blocks_hit += tabmsg->t_counts.t_blocks_hit;
 		}
@@ -6083,6 +6086,7 @@ pgstat_recv_vacuum(PgStat_MsgVacuum *msg, int len)
 
 	tabentry->n_live_tuples = msg->m_live_tuples;
 	tabentry->n_dead_tuples = msg->m_dead_tuples;
+	tabentry->changes_since_vacuum = 0;
 
 	if (msg->m_autovacuum)
 	{
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index c080fa6388..65e38edd8d 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -643,6 +643,7 @@ typedef struct PgStat_StatTabEntry
 	PgStat_Counter n_live_tuples;
 	PgStat_Counter n_dead_tuples;
 	PgStat_Counter changes_since_analyze;
+	PgStat_Counter changes_since_vacuum;
 
 	PgStat_Counter blocks_fetched;
 	PgStat_Counter blocks_hit;

Reply via email to