Re: [HACKERS] [PATCH] Statistics collection for CLUSTER command
On Sun, Oct 20, 2013 at 1:37 AM, Noah Misch wrote: >> > (2013/08/08 20:52), Vik Fearing wrote: >> >> As part of routine maintenance monitoring, it is interesting for us to >> >> have statistics on the CLUSTER command (timestamp of last run, and >> >> number of runs since stat reset) like we have for (auto)ANALYZE and >> >> (auto)VACUUM. Patch against today's HEAD attached. > > Adding new fields to PgStat_StatTabEntry imposes a substantial distributed > cost, because every database stats file write-out grows by the width of those > fields times the number of tables in the database. Associated costs have been > and continue to be a pain point with large table counts: > > http://www.postgresql.org/message-id/flat/1718942738eb65c8407fcd864883f...@fuzzy.cz > http://www.postgresql.org/message-id/flat/52268887.9010...@uptime.jp > > In that light, I can't justify widening PgStat_StatTabEntry by 9.5% for this. > I recommend satisfying this monitoring need in your application by creating a > cluster_table wrapper function that issues CLUSTER and then updates statistics > you store in an ordinary table. Issue all routine CLUSTERs by way of that > wrapper function. A backend change that would help here is to extend event > triggers to cover the CLUSTER command, permitting you to inject monitoring > after plain CLUSTER and dispense with the wrapper. I unfortunately have to agree with this, but I think it points to the need for further work on the pgstat infrastructure. We used to have one file; now we have one per database. That's better for people with lots of databases, but many people just have one big database. We need a solution here that relieves the pain for those people. (I can't help thinking that the root of the problem here is that we're rewriting the whole file, and that any solution that doesn't somehow facilitate updates of individual records will be only a small improvement.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Statistics collection for CLUSTER command
Noah Misch writes: > wrapper function. A backend change that would help here is to extend event > triggers to cover the CLUSTER command, permitting you to inject monitoring > after plain CLUSTER and dispense with the wrapper. I didn't look in any level of details, but it might be as simple as moving the T_ClusterStmt case from standard_ProcessUtility() down to the Event Trigger friendly part known as ProcessUtilitySlow(). Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Statistics collection for CLUSTER command
> > (2013/08/08 20:52), Vik Fearing wrote: > >> As part of routine maintenance monitoring, it is interesting for us to > >> have statistics on the CLUSTER command (timestamp of last run, and > >> number of runs since stat reset) like we have for (auto)ANALYZE and > >> (auto)VACUUM. Patch against today's HEAD attached. Adding new fields to PgStat_StatTabEntry imposes a substantial distributed cost, because every database stats file write-out grows by the width of those fields times the number of tables in the database. Associated costs have been and continue to be a pain point with large table counts: http://www.postgresql.org/message-id/flat/1718942738eb65c8407fcd864883f...@fuzzy.cz http://www.postgresql.org/message-id/flat/52268887.9010...@uptime.jp In that light, I can't justify widening PgStat_StatTabEntry by 9.5% for this. I recommend satisfying this monitoring need in your application by creating a cluster_table wrapper function that issues CLUSTER and then updates statistics you store in an ordinary table. Issue all routine CLUSTERs by way of that wrapper function. A backend change that would help here is to extend event triggers to cover the CLUSTER command, permitting you to inject monitoring after plain CLUSTER and dispense with the wrapper. Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Statistics collection for CLUSTER command
On 09/16/2013 08:26 AM, Satoshi Nagayasu wrote: > (2013/08/08 20:52), Vik Fearing wrote: >> As part of routine maintenance monitoring, it is interesting for us to >> have statistics on the CLUSTER command (timestamp of last run, and >> number of runs since stat reset) like we have for (auto)ANALYZE and >> (auto)VACUUM. Patch against today's HEAD attached. >> >> I would add this to the next commitfest but I seem to be unable to log >> in with my community account (I can log in to the wiki). Help >> appreciated. > > I have reviewed the patch. Thank you for your review. > Succeeded to build with the latest HEAD, and passed the regression > tests. > > Looks good enough, and I'd like to add a test case here, not only > for the view definition, but also working correctly. > > Please take a look at attached one. Looks good to me. Attached is a rebased patch with those tests added. -- Vik *** a/doc/src/sgml/monitoring.sgml --- b/doc/src/sgml/monitoring.sgml *** *** 979,984 postgres: user database host + last_cluster + timestamp with time zone + Last time at which CLUSTER was issued on this table + + vacuum_count bigint Number of times this table has been manually vacuumed *** *** 1001,1006 postgres: user database host Number of times this table has been analyzed by the autovacuum daemon + + cluster_count + bigint + Number of times CLUSTER has been issued on this table + *** a/src/backend/catalog/system_views.sql --- b/src/backend/catalog/system_views.sql *** *** 410,419 CREATE VIEW pg_stat_all_tables AS pg_stat_get_last_autovacuum_time(C.oid) as last_autovacuum, pg_stat_get_last_analyze_time(C.oid) as last_analyze, pg_stat_get_last_autoanalyze_time(C.oid) as last_autoanalyze, pg_stat_get_vacuum_count(C.oid) AS vacuum_count, pg_stat_get_autovacuum_count(C.oid) AS autovacuum_count, pg_stat_get_analyze_count(C.oid) AS analyze_count, ! pg_stat_get_autoanalyze_count(C.oid) AS autoanalyze_count FROM pg_class C LEFT JOIN pg_index I ON C.oid = I.indrelid LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace) --- 410,421 pg_stat_get_last_autovacuum_time(C.oid) as last_autovacuum, pg_stat_get_last_analyze_time(C.oid) as last_analyze, pg_stat_get_last_autoanalyze_time(C.oid) as last_autoanalyze, + pg_stat_get_last_cluster_time(C.oid) as last_cluster, pg_stat_get_vacuum_count(C.oid) AS vacuum_count, pg_stat_get_autovacuum_count(C.oid) AS autovacuum_count, pg_stat_get_analyze_count(C.oid) AS analyze_count, ! pg_stat_get_autoanalyze_count(C.oid) AS autoanalyze_count, ! pg_stat_get_cluster_count(C.oid) AS cluster_count FROM pg_class C LEFT JOIN pg_index I ON C.oid = I.indrelid LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace) *** a/src/backend/commands/cluster.c --- b/src/backend/commands/cluster.c *** *** 35,40 --- 35,41 #include "commands/vacuum.h" #include "miscadmin.h" #include "optimizer/planner.h" + #include "pgstat.h" #include "storage/bufmgr.h" #include "storage/lmgr.h" #include "storage/predicate.h" *** *** 407,412 cluster_rel(Oid tableOid, Oid indexOid, bool recheck, bool verbose, --- 408,417 verbose); /* NB: rebuild_relation does heap_close() on OldHeap */ + + /* Report CLUSTER to the stats collector, but not VACUUM FULL */ + if (indexOid != InvalidOid) + pgstat_report_cluster(OldHeap); } /* *** a/src/backend/postmaster/pgstat.c --- b/src/backend/postmaster/pgstat.c *** *** 292,297 static void pgstat_recv_resetsinglecounter(PgStat_MsgResetsinglecounter *msg, in --- 292,298 static void pgstat_recv_autovac(PgStat_MsgAutovacStart *msg, int len); static void pgstat_recv_vacuum(PgStat_MsgVacuum *msg, int len); static void pgstat_recv_analyze(PgStat_MsgAnalyze *msg, int len); + static void pgstat_recv_cluster(PgStat_MsgCluster *msg, int len); static void pgstat_recv_bgwriter(PgStat_MsgBgWriter *msg, int len); static void pgstat_recv_funcstat(PgStat_MsgFuncstat *msg, int len); static void pgstat_recv_funcpurge(PgStat_MsgFuncpurge *msg, int len); *** *** 1385,1390 pgstat_report_analyze(Relation rel, --- 1386,1412 } /* + * pgstat_report_cluster() - + * + * Tell the collector about the table we just CLUSTERed. + * + */ + void + pgstat_report_cluster(Relation rel) + { + PgStat_MsgCluster msg; + + if (pgStatSock == PGINVALID_SOCKET) + return; + + pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_CLUSTER); + msg.m_databaseid = rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId; + msg.m_tableoid = RelationGetRelid(re
Re: [HACKERS] [PATCH] Statistics collection for CLUSTER command
(2013/08/08 20:52), Vik Fearing wrote: As part of routine maintenance monitoring, it is interesting for us to have statistics on the CLUSTER command (timestamp of last run, and number of runs since stat reset) like we have for (auto)ANALYZE and (auto)VACUUM. Patch against today's HEAD attached. I would add this to the next commitfest but I seem to be unable to log in with my community account (I can log in to the wiki). Help appreciated. I have reviewed the patch. Succeeded to build with the latest HEAD, and passed the regression tests. Looks good enough, and I'd like to add a test case here, not only for the view definition, but also working correctly. Please take a look at attached one. Regards, -- Satoshi Nagayasu Uptime Technologies, LLC. http://www.uptime.jp diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out index 56bace1..ba614b2 100644 --- a/src/test/regress/expected/stats.out +++ b/src/test/regress/expected/stats.out @@ -28,7 +28,13 @@ SELECT pg_sleep(2.0); CREATE TEMP TABLE prevstats AS SELECT t.seq_scan, t.seq_tup_read, t.idx_scan, t.idx_tup_fetch, (b.heap_blks_read + b.heap_blks_hit) AS heap_blks, - (b.idx_blks_read + b.idx_blks_hit) AS idx_blks + (b.idx_blks_read + b.idx_blks_hit) AS idx_blks, + coalesce(t.last_vacuum, now()) AS last_vacuum, + coalesce(t.last_analyze, now()) AS last_analyze, + coalesce(t.last_cluster, now()) AS last_cluster, + t.vacuum_count, + t.analyze_count, + t.cluster_count FROM pg_catalog.pg_stat_user_tables AS t, pg_catalog.pg_statio_user_tables AS b WHERE t.relname='tenk2' AND b.relname='tenk2'; @@ -111,4 +117,27 @@ SELECT st.heap_blks_read + st.heap_blks_hit >= pr.heap_blks + cl.relpages, t| t (1 row) +-- table maintenance stats +ANALYZE tenk2; +VACUUM tenk2; +CLUSTER tenk2 USING tenk2_unique1; +SELECT pg_sleep(1.0); + pg_sleep +-- + +(1 row) + +SELECT st.last_vacuum > pr.last_vacuum, + st.last_analyze > pr.last_analyze, + st.last_cluster > pr.last_cluster, + st.vacuum_count > pr.vacuum_count, + st.analyze_count > pr.analyze_count, + st.cluster_count > pr.cluster_count + FROM pg_stat_user_tables AS st, prevstats pr + WHERE st.relname='tenk2'; + ?column? | ?column? | ?column? | ?column? | ?column? | ?column? +--+--+--+--+--+-- + t| t| t| t| t| t +(1 row) + -- End of Stats Test diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql index bb349b2..71e5e27 100644 --- a/src/test/regress/sql/stats.sql +++ b/src/test/regress/sql/stats.sql @@ -22,7 +22,13 @@ SELECT pg_sleep(2.0); CREATE TEMP TABLE prevstats AS SELECT t.seq_scan, t.seq_tup_read, t.idx_scan, t.idx_tup_fetch, (b.heap_blks_read + b.heap_blks_hit) AS heap_blks, - (b.idx_blks_read + b.idx_blks_hit) AS idx_blks + (b.idx_blks_read + b.idx_blks_hit) AS idx_blks, + coalesce(t.last_vacuum, now()) AS last_vacuum, + coalesce(t.last_analyze, now()) AS last_analyze, + coalesce(t.last_cluster, now()) AS last_cluster, + t.vacuum_count, + t.analyze_count, + t.cluster_count FROM pg_catalog.pg_stat_user_tables AS t, pg_catalog.pg_statio_user_tables AS b WHERE t.relname='tenk2' AND b.relname='tenk2'; @@ -81,4 +87,20 @@ SELECT st.heap_blks_read + st.heap_blks_hit >= pr.heap_blks + cl.relpages, FROM pg_statio_user_tables AS st, pg_class AS cl, prevstats AS pr WHERE st.relname='tenk2' AND cl.relname='tenk2'; +-- table maintenance stats +ANALYZE tenk2; +VACUUM tenk2; +CLUSTER tenk2 USING tenk2_unique1; + +SELECT pg_sleep(1.0); + +SELECT st.last_vacuum > pr.last_vacuum, + st.last_analyze > pr.last_analyze, + st.last_cluster > pr.last_cluster, + st.vacuum_count > pr.vacuum_count, + st.analyze_count > pr.analyze_count, + st.cluster_count > pr.cluster_count + FROM pg_stat_user_tables AS st, prevstats pr + WHERE st.relname='tenk2'; + -- End of Stats Test -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Statistics collection for CLUSTER command
On 08/09/2013 10:37 PM, Stefan Kaltenbrunner wrote: >>> On 08/08/2013 01:52 PM, Vik Fearing wrote: I would add this to the next commitfest but I seem to be unable to log in with my community account (I can log in to the wiki). Help appreciated. > hmm looks like your account may be affected by one of the buglets > introduced (and fixed shortly afterwards) of the main infrastructure to > debian wheezy - please try logging in to the main website and change > your password at least once. That should make it working again for the > commitfest app... That worked. Thank you. Vik -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Statistics collection for CLUSTER command
On 08/09/2013 12:02 AM, Vik Fearing wrote: > On 08/08/2013 07:57 PM, Stefan Kaltenbrunner wrote: > >> On 08/08/2013 01:52 PM, Vik Fearing wrote: >>> I would add this to the next commitfest but I seem to be unable to log >>> in with my community account (I can log in to the wiki). Help appreciated. >> whould be a bit easier to diagnose if we knew your community account name > > Sorry, it's "glaucous". hmm looks like your account may be affected by one of the buglets introduced (and fixed shortly afterwards) of the main infrastructure to debian wheezy - please try logging in to the main website and change your password at least once. That should make it working again for the commitfest app... Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Statistics collection for CLUSTER command
Thank you, but it seems you've duplicated the title from the other patch (and thanks for adding that one, too!). Indeed, possibly a wrong copy paste. Fixed. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Statistics collection for CLUSTER command
On 08/08/2013 02:26 PM, Fabien COELHO wrote: > >> As part of routine maintenance monitoring, it is interesting for us to >> have statistics on the CLUSTER command (timestamp of last run, and >> number of runs since stat reset) like we have for (auto)ANALYZE and >> (auto)VACUUM. Patch against today's HEAD attached. >> >> I would add this to the next commitfest but I seem to be unable to log >> in with my community account (I can log in to the wiki). Help >> appreciated. > > Done. > Thank you, but it seems you've duplicated the title from the other patch (and thanks for adding that one, too!). https://commitfest.postgresql.org/action/patch_view?id=1190 Vik -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Statistics collection for CLUSTER command
On 08/08/2013 07:57 PM, Stefan Kaltenbrunner wrote: > On 08/08/2013 01:52 PM, Vik Fearing wrote: >> I would add this to the next commitfest but I seem to be unable to log >> in with my community account (I can log in to the wiki). Help appreciated. > whould be a bit easier to diagnose if we knew your community account name Sorry, it's "glaucous". Vik
Re: [HACKERS] [PATCH] Statistics collection for CLUSTER command
On 08/08/2013 01:52 PM, Vik Fearing wrote: > As part of routine maintenance monitoring, it is interesting for us to > have statistics on the CLUSTER command (timestamp of last run, and > number of runs since stat reset) like we have for (auto)ANALYZE and > (auto)VACUUM. Patch against today's HEAD attached. > > I would add this to the next commitfest but I seem to be unable to log > in with my community account (I can log in to the wiki). Help appreciated. whould be a bit easier to diagnose if we knew your community account name ;) Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Statistics collection for CLUSTER command
As part of routine maintenance monitoring, it is interesting for us to have statistics on the CLUSTER command (timestamp of last run, and number of runs since stat reset) like we have for (auto)ANALYZE and (auto)VACUUM. Patch against today's HEAD attached. I would add this to the next commitfest but I seem to be unable to log in with my community account (I can log in to the wiki). Help appreciated. Done. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers