Re: [PATCH] pg_stat_toast v6

2022-01-04 Thread Gunnar "Nick" Bluth

Am 03.01.22 um 22:23 schrieb Alvaro Herrera:

Overall I think this is a good feature to have; assessing the need for
compression is important for tuning, so +1 for the goal of the patch.


Much appreciated!



I didn't look into the patch carefully, but here are some minor
comments:

On 2022-Jan-03, Gunnar "Nick" Bluth wrote:


@@ -229,7 +230,9 @@ toast_tuple_try_compression(ToastTupleContext *ttc, int 
attribute)
Datum  *value = >ttc_values[attribute];
Datum   new_value;
ToastAttrInfo *attr = >ttc_attr[attribute];
+   instr_time  start_time;
  
+	INSTR_TIME_SET_CURRENT(start_time);

new_value = toast_compress_datum(*value, attr->tai_compression);
  
  	if (DatumGetPointer(new_value) != NULL)


Don't INSTR_TIME_SET_CURRENT unconditionally; in some systems it's an
expensive syscall.  Find a way to only do it if the feature is enabled.


Yeah, I was worried about that (and asking if it would be required) already.
Adding the check was easier than I expected, though I'm absolutely 
clueless if I did it right!


#include "pgstat.h"
extern PGDLLIMPORT bool pgstat_track_toast;



This also suggests that perhaps it'd be a good idea to allow this to be
enabled for specific tables only, rather than system-wide.  (Maybe in
order for stats to be collected, the user should have to both set the
GUC option *and* set a per-table option?  Not sure.)


That would of course be nice, but I seriously doubt the required 
additional logic would be justified. The patch currently tampers with as 
few internal structures as possible, and for good reason... ;-)



@@ -82,10 +82,12 @@ typedef enum StatMsgType
PGSTAT_MTYPE_DEADLOCK,
PGSTAT_MTYPE_CHECKSUMFAILURE,
PGSTAT_MTYPE_REPLSLOT,
+   PGSTAT_MTYPE_CONNECTION,


I think this new enum value doesn't belong in this patch.


Yeah, did I mention I'm struggling with rebasing? ;-|



+/* --
+ * PgStat_ToastEntry   Per-TOAST-column info in a MsgFuncstat
+ * --


Not in "a MsgFuncstat", right?


Obviously... fixed!




+-- function to wait for counters to advance
+create function wait_for_stats() returns void as $$


I don't think we want a separate copy of wait_for_stats; see commit
fe60b67250a3 and the discussion leading to it.  Maybe you'll want to
move the test to stats.sql.  I'm not sure what to say about relying on


Did so.


LZ4; maybe you'll want to leave that part out, and just verify in an
LZ4-enabled build that some 'l' entry exists.  BTW, don't we have any
decent way to turn that 'l' into a more reasonable, descriptive string?



Also, perhaps make the view-defining query turn an empty compression
method into whatever the default is.


I'm not even sure that having it in there is useful at all. It's simply 
JOINed in from pg_attribute.

Which is where I'd see that "make it look nicer" change happening, tbth. ;-)


Or even better, stats collection
should store the real compression method used rather than empty string,
to avoid confusing things when some stats are collected, then the
default is changed, then some more stats are collected.


I was thinking about that already, but came to the conclusion that it a) 
would blow up the size of these statistics by quite a bit and b) would 
be quite tricky to display in a useful way.


I mean, the use case of track_toast is pretty limited anyway; you'll 
probably turn this feature on with a specific column in mind, of which 
you'll probably know which compression method is used at the time.


Thanks for the feedback!
v7 attached.
--
Gunnar "Nick" Bluth

Eimermacherweg 106
D-48159 Münster

Mobil +49 172 8853339
Email: gunnar.bl...@pro-open.de
__
"Ceterum censeo SystemD esse delendam" - CatoFrom f07213d68c646ba64757e551e3587aab0ff221df Mon Sep 17 00:00:00 2001
From: "Gunnar \"Nick\" Bluth" 
Date: Tue, 4 Jan 2022 12:08:32 +0100
Subject: [PATCH v7] pg_stat_toast v7

---
 doc/src/sgml/config.sgml  |  26 ++
 doc/src/sgml/monitoring.sgml  | 163 ++
 doc/src/sgml/storage.sgml |  12 +-
 src/backend/access/table/toast_helper.c   |  44 ++-
 src/backend/catalog/system_views.sql  |  20 ++
 src/backend/postmaster/pgstat.c   | 305 +-
 src/backend/utils/adt/pgstatfuncs.c   |  72 +
 src/backend/utils/misc/guc.c  |   9 +
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/include/catalog/pg_proc.dat   |  25 ++
 src/include/pgstat.h  | 107 ++
 src/test/regress/expected/rules.out   |  17 +
 src/test/regress/expected/stats.out   |  62 
 src/test/regress/sql/stats.sql|  28 ++
 14 files changed, 882 insertions(+), 9 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index afbb6c35e3..b2617cc941 100644
--- a/doc/src/sgml/config.sgml
+++ 

Re: [PATCH] pg_stat_toast v6

2022-01-04 Thread Gunnar "Nick" Bluth

Am 03.01.22 um 22:03 schrieb Justin Pryzby:

+pgstat_report_toast_activity(Oid relid, int attr,
+   bool externalized,
+   bool compressed,
+   int32 old_size,
+   int32 new_size,

...

+   if (new_size)
+   {
+   htabent->t_counts.t_size_orig+=old_size;
+   if (new_size)
+   {


I guess one of these is supposed to say old_size?


Didn't make a difference, tbth, as they'd both be 0 or have a value. 
Streamlined the whole block now.




+CREATE TABLE toast_test (cola TEXT, colb TEXT COMPRESSION lz4, colc TEXT , 
cold TEXT, cole TEXT);


Is there a reason this uses lz4 ?


I thought it might help later on, but alas! the LZ4 column mainly broke 
things, so I removed it for the time being.



If that's needed for stable results, I think you should use pglz, since that's
what's guaranteed to exist.  I imagine LZ4 won't be required any time soon,
seeing as zlib has never been required.


Yeah. It didn't prove anything whatsoever.


+Be aware that this feature, depending on the amount of TOASTable 
columns in
+your databases, may significantly increase the size of the statistics 
files
+and the workload of the statistics collector. It is recommended to only
+temporarily activate this to assess the right compression and storage 
method
+for (a) column(s).


saying "a column" is fine


Changed.




+   schemaname name
+   Attribute (column) number in the relation
+   relname name



+  
+   compressmethod char
+  
+  
+   Compression method of the attribute (empty means default)


One thing to keep in mind is that the current compression method is only used
for *new* data - old data can still use the old compression method.  It
probably doesn't need to be said here, but maybe you can refer to the docs
about that in alter_table.


+   Number of times the compression was successful (gained a size reduction)


It's more clear to say "was reduced in size"


Changed the wording a bit, I guess it is clear enough now.
The question is if the column should be there at all, as it's simply 
fetched from pg_attribute...





+   /* we assume this inits to all zeroes: */
+   static const PgStat_ToastCounts all_zeroes;


You don't have to assume; static/global allocations are always zero unless
otherwise specified.


Copy-pasta ;-)
Removed.

Thx for looking into this!
Patch v7 will be in the next mail.

--
Gunnar "Nick" Bluth

Eimermacherweg 106
D-48159 Münster

Mobil +49 172 8853339
Email: gunnar.bl...@pro-open.de
__
"Ceterum censeo SystemD esse delendam" - Cato




Re: [PATCH] pg_stat_toast v6

2022-01-03 Thread Alvaro Herrera
Overall I think this is a good feature to have; assessing the need for
compression is important for tuning, so +1 for the goal of the patch.

I didn't look into the patch carefully, but here are some minor
comments:

On 2022-Jan-03, Gunnar "Nick" Bluth wrote:

> @@ -229,7 +230,9 @@ toast_tuple_try_compression(ToastTupleContext *ttc, int 
> attribute)
>   Datum  *value = >ttc_values[attribute];
>   Datum   new_value;
>   ToastAttrInfo *attr = >ttc_attr[attribute];
> + instr_time  start_time;
>  
> + INSTR_TIME_SET_CURRENT(start_time);
>   new_value = toast_compress_datum(*value, attr->tai_compression);
>  
>   if (DatumGetPointer(new_value) != NULL)

Don't INSTR_TIME_SET_CURRENT unconditionally; in some systems it's an
expensive syscall.  Find a way to only do it if the feature is enabled.
This also suggests that perhaps it'd be a good idea to allow this to be
enabled for specific tables only, rather than system-wide.  (Maybe in
order for stats to be collected, the user should have to both set the
GUC option *and* set a per-table option?  Not sure.)

> @@ -82,10 +82,12 @@ typedef enum StatMsgType
>   PGSTAT_MTYPE_DEADLOCK,
>   PGSTAT_MTYPE_CHECKSUMFAILURE,
>   PGSTAT_MTYPE_REPLSLOT,
> + PGSTAT_MTYPE_CONNECTION,

I think this new enum value doesn't belong in this patch.

> +/* --
> + * PgStat_ToastEntry Per-TOAST-column info in a MsgFuncstat
> + * --

Not in "a MsgFuncstat", right?

> +-- function to wait for counters to advance
> +create function wait_for_stats() returns void as $$

I don't think we want a separate copy of wait_for_stats; see commit
fe60b67250a3 and the discussion leading to it.  Maybe you'll want to
move the test to stats.sql.  I'm not sure what to say about relying on
LZ4; maybe you'll want to leave that part out, and just verify in an
LZ4-enabled build that some 'l' entry exists.  BTW, don't we have any
decent way to turn that 'l' into a more reasonable, descriptive string?

Also, perhaps make the view-defining query turn an empty compression
method into whatever the default is.  Or even better, stats collection
should store the real compression method used rather than empty string,
to avoid confusing things when some stats are collected, then the
default is changed, then some more stats are collected.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Para tener más hay que desear menos"




Re: [PATCH] pg_stat_toast v6

2022-01-03 Thread Justin Pryzby
> +pgstat_report_toast_activity(Oid relid, int attr,
> + bool externalized,
> + bool compressed,
> + int32 old_size,
> + int32 new_size,
...
> + if (new_size)
> + {
> + htabent->t_counts.t_size_orig+=old_size;
> + if (new_size)
> + {

I guess one of these is supposed to say old_size?

> + _track_toast,
> + false,
> + NULL, NULL, NULL
> + },
>   {

> +CREATE TABLE toast_test (cola TEXT, colb TEXT COMPRESSION lz4, colc TEXT , 
> cold TEXT, cole TEXT);

Is there a reason this uses lz4 ?
If that's needed for stable results, I think you should use pglz, since that's
what's guaranteed to exist.  I imagine LZ4 won't be required any time soon,
seeing as zlib has never been required.

> +Be aware that this feature, depending on the amount of TOASTable 
> columns in
> +your databases, may significantly increase the size of the 
> statistics files
> +and the workload of the statistics collector. It is recommended to 
> only
> +temporarily activate this to assess the right compression and 
> storage method
> +for (a) column(s).

saying "a column" is fine

> +   schemaname name
> +   Attribute (column) number in the relation
> +   relname name

> +  
> +   compressmethod char
> +  
> +  
> +   Compression method of the attribute (empty means default)

One thing to keep in mind is that the current compression method is only used
for *new* data - old data can still use the old compression method.  It
probably doesn't need to be said here, but maybe you can refer to the docs
about that in alter_table.

> +   Number of times the compression was successful (gained a size 
> reduction)

It's more clear to say "was reduced in size"

> + /* we assume this inits to all zeroes: */
> + static const PgStat_ToastCounts all_zeroes;

You don't have to assume; static/global allocations are always zero unless
otherwise specified.




Re: [PATCH] pg_stat_toast v6

2022-01-03 Thread Justin Pryzby
On Mon, Jan 03, 2022 at 08:40:50PM +0100, Gunnar "Nick" Bluth wrote:
> I wonder why "track_toast.sql" test fails on Windows (with "ERROR:
> compression method lz4 not supported"), but "compression.sql" doesn't.
> Any hints?

The windows CI doesn't have LZ4, so the SQL command fails, but there's an
"alternate" expected/compression_1.out so that's accepted.  (The regression
tests exercise many commands which fail, as expected, like creating an index on
an index).

If you're going have an alternate file for the --without-lz4 case, then I think
you should put it into compression.sql.  (But not if you needed an alternate
for something else, since we'd need 4 alternates, which is halfway to 8...).

-- 
Justin




Re: [PATCH] pg_stat_toast v6

2022-01-03 Thread Gunnar "Nick" Bluth

Am 03.01.22 um 20:11 schrieb Alvaro Herrera:

On 2022-Jan-03, Gunnar "Nick" Bluth wrote:


9:38 $ git format-patch PGDG/master -v5 -o ..
../v5-0001-ping-pong-of-thougths.patch
../v5-0002-ping-pong-of-thougths.patch
../v5-0003-adds-some-debugging-messages-in-toast_helper.c.patch
  ...


Hmm, in such cases I would suggest to create a separate branch and then
"git merge --squash" for submission.  You can keep your development
branch separate, with other merges if you want.

I've found this to be easier to manage, though I don't always follow
that workflow myself.



Using --stdout does help ;-)

I wonder why "track_toast.sql" test fails on Windows (with "ERROR: 
compression method lz4 not supported"), but "compression.sql" doesn't.

Any hints?

Anyway, I shamelessly copied "wait_for_stats()" from the "stats.sql" 
file and the tests _should_ now work at least on the platforms with lz4.


v6 attached!
--
Gunnar "Nick" Bluth

Eimermacherweg 106
D-48159 Münster

Mobil +49 172 8853339
Email: gunnar.bl...@pro-open.de
__
"Ceterum censeo SystemD esse delendam" - CatoFrom e743587fbd8f6592bbfa15f53733f79c405000e2 Mon Sep 17 00:00:00 2001
From: "Gunnar \"Nick\" Bluth" 
Date: Mon, 3 Jan 2022 20:35:05 +0100
Subject: [PATCH v6] pg_stat_toast v6

---
 doc/src/sgml/config.sgml  |  26 ++
 doc/src/sgml/monitoring.sgml  | 163 +
 doc/src/sgml/storage.sgml |  12 +-
 src/backend/access/table/toast_helper.c   |  24 ++
 src/backend/catalog/system_views.sql  |  20 ++
 src/backend/postmaster/pgstat.c   | 309 +-
 src/backend/utils/adt/pgstatfuncs.c   |  72 
 src/backend/utils/misc/guc.c  |   9 +
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/include/catalog/pg_proc.dat   |  25 ++
 src/include/pgstat.h  | 108 ++
 src/test/regress/expected/rules.out   |  17 +
 src/test/regress/expected/track_toast.out | 102 ++
 src/test/regress/parallel_schedule|   2 +-
 src/test/regress/sql/track_toast.sql  |  64 
 15 files changed, 946 insertions(+), 8 deletions(-)
 create mode 100644 src/test/regress/expected/track_toast.out
 create mode 100644 src/test/regress/sql/track_toast.sql

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index afbb6c35e3..fa40befc16 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7668,6 +7668,32 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
   
  
 
+ 
+  track_toast (boolean)
+  
+   track_toast configuration parameter
+  
+  
+  
+   
+Enables tracking of TOAST activities.
+Compressions and externalizations are tracked.
+The default is off.
+Only superusers can change this setting.
+   
+
+   
+
+Be aware that this feature, depending on the amount of TOASTable columns in
+your databases, may significantly increase the size of the statistics files
+and the workload of the statistics collector. It is recommended to only
+temporarily activate this to assess the right compression and storage method
+for (a) column(s).
+
+   
+  
+ 
+
  
   stats_temp_directory (string)
   
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 62f2a3332b..32d7818096 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -610,6 +610,17 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
   yet included in pg_stat_user_functions).
  
 
+ 
+  pg_stat_toastpg_stat_toast
+  
+   One row for each column that has ever been TOASTed (compressed and/or externalized).
+   Showing the number of externalizations, compression attempts / successes, compressed and
+   uncompressed sizes etc.
+   
+   pg_stat_toast for details.
+  
+ 
+
  
   pg_stat_slrupg_stat_slru
   One row per SLRU, showing statistics of operations. See
@@ -4969,6 +4980,158 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
 
  
 
+ 
+  pg_stat_toast
+
+  
+   pg_stat_toast
+  
+
+  
+   The pg_stat_toast view will contain
+   one row for each column of variable size that has been TOASTed since 
+   the last statistics reset. The  parameter
+   controls whether TOAST activities are tracked or not.
+  
+
+  
+   pg_stat_toast View
+   
+
+ 
+  
+   Column Type
+  
+  
+   Description
+  
+ 
+
+
+
+ 
+  
+   schemaname name
+  
+  
+   Name of the schema the relation is in
+  
+ 
+
+ 
+  
+   reloid oid
+  
+  
+   OID of the relation
+  
+ 
+
+ 
+  
+   attnum int
+  
+  
+