Re: [PATCH] pg_stat_toast v6
Am 03.01.22 um 22:23 schrieb Alvaro Herrera: Overall I think this is a good feature to have; assessing the need for compression is important for tuning, so +1 for the goal of the patch. Much appreciated! I didn't look into the patch carefully, but here are some minor comments: On 2022-Jan-03, Gunnar "Nick" Bluth wrote: @@ -229,7 +230,9 @@ toast_tuple_try_compression(ToastTupleContext *ttc, int attribute) Datum *value = >ttc_values[attribute]; Datum new_value; ToastAttrInfo *attr = >ttc_attr[attribute]; + instr_time start_time; + INSTR_TIME_SET_CURRENT(start_time); new_value = toast_compress_datum(*value, attr->tai_compression); if (DatumGetPointer(new_value) != NULL) Don't INSTR_TIME_SET_CURRENT unconditionally; in some systems it's an expensive syscall. Find a way to only do it if the feature is enabled. Yeah, I was worried about that (and asking if it would be required) already. Adding the check was easier than I expected, though I'm absolutely clueless if I did it right! #include "pgstat.h" extern PGDLLIMPORT bool pgstat_track_toast; This also suggests that perhaps it'd be a good idea to allow this to be enabled for specific tables only, rather than system-wide. (Maybe in order for stats to be collected, the user should have to both set the GUC option *and* set a per-table option? Not sure.) That would of course be nice, but I seriously doubt the required additional logic would be justified. The patch currently tampers with as few internal structures as possible, and for good reason... ;-) @@ -82,10 +82,12 @@ typedef enum StatMsgType PGSTAT_MTYPE_DEADLOCK, PGSTAT_MTYPE_CHECKSUMFAILURE, PGSTAT_MTYPE_REPLSLOT, + PGSTAT_MTYPE_CONNECTION, I think this new enum value doesn't belong in this patch. Yeah, did I mention I'm struggling with rebasing? ;-| +/* -- + * PgStat_ToastEntry Per-TOAST-column info in a MsgFuncstat + * -- Not in "a MsgFuncstat", right? Obviously... fixed! +-- function to wait for counters to advance +create function wait_for_stats() returns void as $$ I don't think we want a separate copy of wait_for_stats; see commit fe60b67250a3 and the discussion leading to it. Maybe you'll want to move the test to stats.sql. I'm not sure what to say about relying on Did so. LZ4; maybe you'll want to leave that part out, and just verify in an LZ4-enabled build that some 'l' entry exists. BTW, don't we have any decent way to turn that 'l' into a more reasonable, descriptive string? Also, perhaps make the view-defining query turn an empty compression method into whatever the default is. I'm not even sure that having it in there is useful at all. It's simply JOINed in from pg_attribute. Which is where I'd see that "make it look nicer" change happening, tbth. ;-) Or even better, stats collection should store the real compression method used rather than empty string, to avoid confusing things when some stats are collected, then the default is changed, then some more stats are collected. I was thinking about that already, but came to the conclusion that it a) would blow up the size of these statistics by quite a bit and b) would be quite tricky to display in a useful way. I mean, the use case of track_toast is pretty limited anyway; you'll probably turn this feature on with a specific column in mind, of which you'll probably know which compression method is used at the time. Thanks for the feedback! v7 attached. -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - CatoFrom f07213d68c646ba64757e551e3587aab0ff221df Mon Sep 17 00:00:00 2001 From: "Gunnar \"Nick\" Bluth" Date: Tue, 4 Jan 2022 12:08:32 +0100 Subject: [PATCH v7] pg_stat_toast v7 --- doc/src/sgml/config.sgml | 26 ++ doc/src/sgml/monitoring.sgml | 163 ++ doc/src/sgml/storage.sgml | 12 +- src/backend/access/table/toast_helper.c | 44 ++- src/backend/catalog/system_views.sql | 20 ++ src/backend/postmaster/pgstat.c | 305 +- src/backend/utils/adt/pgstatfuncs.c | 72 + src/backend/utils/misc/guc.c | 9 + src/backend/utils/misc/postgresql.conf.sample | 1 + src/include/catalog/pg_proc.dat | 25 ++ src/include/pgstat.h | 107 ++ src/test/regress/expected/rules.out | 17 + src/test/regress/expected/stats.out | 62 src/test/regress/sql/stats.sql| 28 ++ 14 files changed, 882 insertions(+), 9 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index afbb6c35e3..b2617cc941 100644 --- a/doc/src/sgml/config.sgml +++
Re: [PATCH] pg_stat_toast v6
Am 03.01.22 um 22:03 schrieb Justin Pryzby: +pgstat_report_toast_activity(Oid relid, int attr, + bool externalized, + bool compressed, + int32 old_size, + int32 new_size, ... + if (new_size) + { + htabent->t_counts.t_size_orig+=old_size; + if (new_size) + { I guess one of these is supposed to say old_size? Didn't make a difference, tbth, as they'd both be 0 or have a value. Streamlined the whole block now. +CREATE TABLE toast_test (cola TEXT, colb TEXT COMPRESSION lz4, colc TEXT , cold TEXT, cole TEXT); Is there a reason this uses lz4 ? I thought it might help later on, but alas! the LZ4 column mainly broke things, so I removed it for the time being. If that's needed for stable results, I think you should use pglz, since that's what's guaranteed to exist. I imagine LZ4 won't be required any time soon, seeing as zlib has never been required. Yeah. It didn't prove anything whatsoever. +Be aware that this feature, depending on the amount of TOASTable columns in +your databases, may significantly increase the size of the statistics files +and the workload of the statistics collector. It is recommended to only +temporarily activate this to assess the right compression and storage method +for (a) column(s). saying "a column" is fine Changed. + schemaname name + Attribute (column) number in the relation + relname name + + compressmethod char + + + Compression method of the attribute (empty means default) One thing to keep in mind is that the current compression method is only used for *new* data - old data can still use the old compression method. It probably doesn't need to be said here, but maybe you can refer to the docs about that in alter_table. + Number of times the compression was successful (gained a size reduction) It's more clear to say "was reduced in size" Changed the wording a bit, I guess it is clear enough now. The question is if the column should be there at all, as it's simply fetched from pg_attribute... + /* we assume this inits to all zeroes: */ + static const PgStat_ToastCounts all_zeroes; You don't have to assume; static/global allocations are always zero unless otherwise specified. Copy-pasta ;-) Removed. Thx for looking into this! Patch v7 will be in the next mail. -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - Cato
Re: [PATCH] pg_stat_toast v6
Overall I think this is a good feature to have; assessing the need for compression is important for tuning, so +1 for the goal of the patch. I didn't look into the patch carefully, but here are some minor comments: On 2022-Jan-03, Gunnar "Nick" Bluth wrote: > @@ -229,7 +230,9 @@ toast_tuple_try_compression(ToastTupleContext *ttc, int > attribute) > Datum *value = >ttc_values[attribute]; > Datum new_value; > ToastAttrInfo *attr = >ttc_attr[attribute]; > + instr_time start_time; > > + INSTR_TIME_SET_CURRENT(start_time); > new_value = toast_compress_datum(*value, attr->tai_compression); > > if (DatumGetPointer(new_value) != NULL) Don't INSTR_TIME_SET_CURRENT unconditionally; in some systems it's an expensive syscall. Find a way to only do it if the feature is enabled. This also suggests that perhaps it'd be a good idea to allow this to be enabled for specific tables only, rather than system-wide. (Maybe in order for stats to be collected, the user should have to both set the GUC option *and* set a per-table option? Not sure.) > @@ -82,10 +82,12 @@ typedef enum StatMsgType > PGSTAT_MTYPE_DEADLOCK, > PGSTAT_MTYPE_CHECKSUMFAILURE, > PGSTAT_MTYPE_REPLSLOT, > + PGSTAT_MTYPE_CONNECTION, I think this new enum value doesn't belong in this patch. > +/* -- > + * PgStat_ToastEntry Per-TOAST-column info in a MsgFuncstat > + * -- Not in "a MsgFuncstat", right? > +-- function to wait for counters to advance > +create function wait_for_stats() returns void as $$ I don't think we want a separate copy of wait_for_stats; see commit fe60b67250a3 and the discussion leading to it. Maybe you'll want to move the test to stats.sql. I'm not sure what to say about relying on LZ4; maybe you'll want to leave that part out, and just verify in an LZ4-enabled build that some 'l' entry exists. BTW, don't we have any decent way to turn that 'l' into a more reasonable, descriptive string? Also, perhaps make the view-defining query turn an empty compression method into whatever the default is. Or even better, stats collection should store the real compression method used rather than empty string, to avoid confusing things when some stats are collected, then the default is changed, then some more stats are collected. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "Para tener más hay que desear menos"
Re: [PATCH] pg_stat_toast v6
> +pgstat_report_toast_activity(Oid relid, int attr, > + bool externalized, > + bool compressed, > + int32 old_size, > + int32 new_size, ... > + if (new_size) > + { > + htabent->t_counts.t_size_orig+=old_size; > + if (new_size) > + { I guess one of these is supposed to say old_size? > + _track_toast, > + false, > + NULL, NULL, NULL > + }, > { > +CREATE TABLE toast_test (cola TEXT, colb TEXT COMPRESSION lz4, colc TEXT , > cold TEXT, cole TEXT); Is there a reason this uses lz4 ? If that's needed for stable results, I think you should use pglz, since that's what's guaranteed to exist. I imagine LZ4 won't be required any time soon, seeing as zlib has never been required. > +Be aware that this feature, depending on the amount of TOASTable > columns in > +your databases, may significantly increase the size of the > statistics files > +and the workload of the statistics collector. It is recommended to > only > +temporarily activate this to assess the right compression and > storage method > +for (a) column(s). saying "a column" is fine > + schemaname name > + Attribute (column) number in the relation > + relname name > + > + compressmethod char > + > + > + Compression method of the attribute (empty means default) One thing to keep in mind is that the current compression method is only used for *new* data - old data can still use the old compression method. It probably doesn't need to be said here, but maybe you can refer to the docs about that in alter_table. > + Number of times the compression was successful (gained a size > reduction) It's more clear to say "was reduced in size" > + /* we assume this inits to all zeroes: */ > + static const PgStat_ToastCounts all_zeroes; You don't have to assume; static/global allocations are always zero unless otherwise specified.
Re: [PATCH] pg_stat_toast v6
On Mon, Jan 03, 2022 at 08:40:50PM +0100, Gunnar "Nick" Bluth wrote: > I wonder why "track_toast.sql" test fails on Windows (with "ERROR: > compression method lz4 not supported"), but "compression.sql" doesn't. > Any hints? The windows CI doesn't have LZ4, so the SQL command fails, but there's an "alternate" expected/compression_1.out so that's accepted. (The regression tests exercise many commands which fail, as expected, like creating an index on an index). If you're going have an alternate file for the --without-lz4 case, then I think you should put it into compression.sql. (But not if you needed an alternate for something else, since we'd need 4 alternates, which is halfway to 8...). -- Justin
Re: [PATCH] pg_stat_toast v6
Am 03.01.22 um 20:11 schrieb Alvaro Herrera: On 2022-Jan-03, Gunnar "Nick" Bluth wrote: 9:38 $ git format-patch PGDG/master -v5 -o .. ../v5-0001-ping-pong-of-thougths.patch ../v5-0002-ping-pong-of-thougths.patch ../v5-0003-adds-some-debugging-messages-in-toast_helper.c.patch ... Hmm, in such cases I would suggest to create a separate branch and then "git merge --squash" for submission. You can keep your development branch separate, with other merges if you want. I've found this to be easier to manage, though I don't always follow that workflow myself. Using --stdout does help ;-) I wonder why "track_toast.sql" test fails on Windows (with "ERROR: compression method lz4 not supported"), but "compression.sql" doesn't. Any hints? Anyway, I shamelessly copied "wait_for_stats()" from the "stats.sql" file and the tests _should_ now work at least on the platforms with lz4. v6 attached! -- Gunnar "Nick" Bluth Eimermacherweg 106 D-48159 Münster Mobil +49 172 8853339 Email: gunnar.bl...@pro-open.de __ "Ceterum censeo SystemD esse delendam" - CatoFrom e743587fbd8f6592bbfa15f53733f79c405000e2 Mon Sep 17 00:00:00 2001 From: "Gunnar \"Nick\" Bluth" Date: Mon, 3 Jan 2022 20:35:05 +0100 Subject: [PATCH v6] pg_stat_toast v6 --- doc/src/sgml/config.sgml | 26 ++ doc/src/sgml/monitoring.sgml | 163 + doc/src/sgml/storage.sgml | 12 +- src/backend/access/table/toast_helper.c | 24 ++ src/backend/catalog/system_views.sql | 20 ++ src/backend/postmaster/pgstat.c | 309 +- src/backend/utils/adt/pgstatfuncs.c | 72 src/backend/utils/misc/guc.c | 9 + src/backend/utils/misc/postgresql.conf.sample | 1 + src/include/catalog/pg_proc.dat | 25 ++ src/include/pgstat.h | 108 ++ src/test/regress/expected/rules.out | 17 + src/test/regress/expected/track_toast.out | 102 ++ src/test/regress/parallel_schedule| 2 +- src/test/regress/sql/track_toast.sql | 64 15 files changed, 946 insertions(+), 8 deletions(-) create mode 100644 src/test/regress/expected/track_toast.out create mode 100644 src/test/regress/sql/track_toast.sql diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index afbb6c35e3..fa40befc16 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -7668,6 +7668,32 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; + + track_toast (boolean) + + track_toast configuration parameter + + + + +Enables tracking of TOAST activities. +Compressions and externalizations are tracked. +The default is off. +Only superusers can change this setting. + + + + +Be aware that this feature, depending on the amount of TOASTable columns in +your databases, may significantly increase the size of the statistics files +and the workload of the statistics collector. It is recommended to only +temporarily activate this to assess the right compression and storage method +for (a) column(s). + + + + + stats_temp_directory (string) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 62f2a3332b..32d7818096 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -610,6 +610,17 @@ postgres 27093 0.0 0.0 30096 2752 ?Ss 11:34 0:00 postgres: ser yet included in pg_stat_user_functions). + + pg_stat_toastpg_stat_toast + + One row for each column that has ever been TOASTed (compressed and/or externalized). + Showing the number of externalizations, compression attempts / successes, compressed and + uncompressed sizes etc. + + pg_stat_toast for details. + + + pg_stat_slrupg_stat_slru One row per SLRU, showing statistics of operations. See @@ -4969,6 +4980,158 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i + + pg_stat_toast + + + pg_stat_toast + + + + The pg_stat_toast view will contain + one row for each column of variable size that has been TOASTed since + the last statistics reset. The parameter + controls whether TOAST activities are tracked or not. + + + + pg_stat_toast View + + + + + Column Type + + + Description + + + + + + + + schemaname name + + + Name of the schema the relation is in + + + + + + reloid oid + + + OID of the relation + + + + + + attnum int + + +