On Tue, Apr 02, 2019 at 02:35:19PM +0900, Michael Paquier wrote: > + compress_tuple_threshold = RelationGetCompressTupleTarget(relation, > + toast_tuple_threshold); > + compress_tuple_threshold = Min(compress_tuple_threshold, > + toast_tuple_threshold); > All the callers of RelationGetCompressTupleTarget directly compile the > minimum between compress_tuple_threshold and toast_tuple_threshold, > and also call RelationGetToastTupleTarget() beforehand. Wouldn't it > be better to merge all that in a common routine? The same calculation > method is duplicated 5 times.
I have been looking at this patch more, and here are some notes:
- The tests can be really simplified using directly reltoastrelid, so
I changed the queries this way. I am aware that the surroundings
hardcode directly the relation name, but that's not really elegant in
my opinion. And I am really tempted to adjust these as well to
directly use reltoastrelid.
- The docs had a weird indentation.
- I think that we should be careful with the portability of
pg_column_size(), so I have added comparisons instead of the direct
values in a way which does not change the meaning of the tests nor
their coverage.
- Having RelationGetCompressTupleTarget use directly
toast_tuple_threshold as default argument is I think kind of
confusing, so let's use a different static value, named
COMPRESS_TUPLE_TARGET in the attached. This is similar to
TOAST_TUPLE_TARGET for the toast tuple threshold.
- The comments in tuptoaster.h need to be updated to outline the
difference between the compression invocation and the toast invocation
thresholds. The wording could be better though.
- Better to avoid comments in the middle of the else/if blocks in my
opinion.
Also, the previous versions of the patch do that when doing a heap
insertion (heapam.c and rewriteheap.c):
+ toast_tuple_threshold = RelationGetToastTupleTarget(relation,
+ TOAST_TUPLE_THRESHOLD);
+ compress_tuple_threshold = RelationGetCompressTupleTarget(relation,
+ toast_tuple_threshold);
+ compress_tuple_threshold = Min(compress_tuple_threshold,
+ toast_tuple_threshold);
[...]
need_toast = (HeapTupleHasExternal(&oldtup) ||
HeapTupleHasExternal(newtup) ||
- newtup->t_len > TOAST_TUPLE_THRESHOLD);
+ newtup->t_len > compress_tuple_threshold);
This means that the original code always uses the compilation-time,
default value of toast_tuple_target for all relations. But this gets
changed so as we would use the value set at relation level for
toast_tuple_target if the reloption is changed, without touching
compress_tuple_threshold. This is out of the scope of this patch, but
shouldn't we always use the relation-level value instead of the
compiled one? Perhaps there is something I am missing?
--
Michael
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 0fcbc660b3..acb858fba1 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1343,6 +1343,32 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>compress_tuple_target</literal> (<type>integer</type>)</term>
+ <listitem>
+ <para>
+ The compress_tuple_target parameter specifies the minimum tuple length
+ required before we try to compress columns marked as Extended or Main
+ and applies only to new tuples - there is no effect on existing rows.
+ By default this parameter is set to allow at least 4 tuples per block,
+ which with the default blocksize will be 2040 bytes. Valid values are
+ between 128 bytes and (blocksize - header), by default 8160 bytes.
+ If the specified value is greater than
+ <literal>toast_tuple_target</literal>, then we will use the current
+ setting of <literal>toast_tuple_target</literal> for
+ <literal>compress_tuple_target</literal>. Note that the default setting
+ is often close to optimal. If the value is set too low then the
+ <acronym>TOAST</acronym> may get invoked too often. If the
+ compressibility of the field values is not good, then compression and
+ decompression can add significant computation overhead without
+ corresponding savings in storage consumption.
+ </para>
+ <para>
+ This parameter cannot be set for TOAST tables.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><literal>parallel_workers</literal> (<type>integer</type>)</term>
<listitem>
diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index 5df987f9c9..cfa0af571d 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -426,14 +426,17 @@ bytes regardless of the actual size of the represented value.
<para>
The <acronym>TOAST</acronym> management code is triggered only
when a row value to be stored in a table is wider than
-<symbol>TOAST_TUPLE_THRESHOLD</symbol> bytes (normally 2 kB).
-The <acronym>TOAST</acronym> code will compress and/or move
-field values out-of-line until the row value is shorter than
-<symbol>TOAST_TUPLE_TARGET</symbol> bytes (also normally 2 kB, adjustable)
-or no more gains can be had. During an UPDATE
+<symbol>TOAST_TUPLE_THRESHOLD</symbol> bytes (normally 2 kB) or the
+table-level <symbol>COMPRESS_TUPLE_TARGET</symbol> option. The
+<acronym>TOAST</acronym> code will first in-line compress field values
+greater than <symbol>COMPRESS_TUPLE_TARGET</symbol>. If the compressed row
+value is still greater than <symbol>TOAST_TUPLE_TARGET</symbol> then the
+<acronym>TOAST</acronym> code will move the field values out-of-line until
+the row value is shorter than <symbol>TOAST_TUPLE_TARGET</symbol> bytes (also
+normally 2 kB, adjustable) or no more gains can be had. During an UPDATE
operation, values of unchanged fields are normally preserved as-is; so an
-UPDATE of a row with out-of-line values incurs no <acronym>TOAST</acronym> costs if
-none of the out-of-line values change.
+UPDATE of a row with out-of-line values incurs no <acronym>TOAST</acronym>
+costs if none of the out-of-line values change.
</para>
<para>
@@ -485,7 +488,17 @@ with <link linkend="sql-altertable"><command>ALTER TABLE ... SET STORAGE</comman
<para>
<symbol>TOAST_TUPLE_TARGET</symbol> can be adjusted for each table using
-<link linkend="sql-altertable"><command>ALTER TABLE ... SET (toast_tuple_target = N)</command></link>
+<link linkend="sql-altertable"><command>ALTER TABLE ... SET (toast_tuple_target = N)</command></link>.
+</para>
+
+<para>
+<symbol>COMPRESS_TUPLE_TARGET</symbol> can be adjusted for each table using
+<link linkend="sql-altertable"><command>ALTER TABLE ... SET
+(compress_tuple_target = N)</command></link>. If
+<symbol>COMPRESS_TUPLE_TARGET</symbol> is not set explicitly for a table or if
+it is greater than <symbol>TOAST_TUPLE_TARGET</symbol> then
+<symbol>TOAST_TUPLE_TARGET</symbol> value is used to decide when to do
+compression.
</para>
<para>
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index b58a1f7a72..479f961a49 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -291,6 +291,15 @@ static relopt_int intRelOpts[] =
},
TOAST_TUPLE_TARGET, 128, TOAST_TUPLE_TARGET_MAIN
},
+ {
+ {
+ "compress_tuple_target",
+ "Sets the target tuple length at which columns will be compressed",
+ RELOPT_KIND_HEAP,
+ ShareUpdateExclusiveLock
+ },
+ COMPRESS_TUPLE_TARGET, 128, TOAST_TUPLE_TARGET_MAIN
+ },
{
{
"pages_per_range",
@@ -1377,6 +1386,8 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, log_min_duration)},
{"toast_tuple_target", RELOPT_TYPE_INT,
offsetof(StdRdOptions, toast_tuple_target)},
+ {"compress_tuple_target", RELOPT_TYPE_INT,
+ offsetof(StdRdOptions, compress_tuple_target)},
{"autovacuum_vacuum_cost_delay", RELOPT_TYPE_REAL,
offsetof(StdRdOptions, autovacuum) + offsetof(AutoVacOpts, vacuum_cost_delay)},
{"autovacuum_vacuum_scale_factor", RELOPT_TYPE_REAL,
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 05ceb6550d..01e8224fd4 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2053,6 +2053,9 @@ static HeapTuple
heap_prepare_insert(Relation relation, HeapTuple tup, TransactionId xid,
CommandId cid, int options)
{
+ int toast_tuple_threshold;
+ int compress_tuple_threshold;
+
/*
* Parallel operations are required to be strictly read-only in a parallel
* worker. Parallel inserts are not safe even in the leader in the
@@ -2077,9 +2080,25 @@ heap_prepare_insert(Relation relation, HeapTuple tup, TransactionId xid,
HeapTupleHeaderSetXmax(tup->t_data, 0); /* for cleanliness */
tup->t_tableOid = RelationGetRelid(relation);
+ /*
+ * Compute the compressibility threshold, which is the minimum between
+ * toast_tuple_target and compress_tuple_target for a relation.
+ */
+ toast_tuple_threshold = RelationGetToastTupleTarget(relation,
+ TOAST_TUPLE_TARGET);
+ compress_tuple_threshold = RelationGetCompressTupleTarget(relation,
+ COMPRESS_TUPLE_TARGET);
+ compress_tuple_threshold = Min(compress_tuple_threshold,
+ toast_tuple_threshold);
+
/*
* If the new tuple is too big for storage or contains already toasted
* out-of-line attributes from some other relation, invoke the toaster.
+ *
+ * The toaster is invoked only if the tuple length is greater than the
+ * compression limit. Note that compress_tuple_threshold must be less
+ * than or equal to toast_tuple_threshold, so it's enough to only test
+ * for compress_tuple_threshold.
*/
if (relation->rd_rel->relkind != RELKIND_RELATION &&
relation->rd_rel->relkind != RELKIND_MATVIEW)
@@ -2088,7 +2107,8 @@ heap_prepare_insert(Relation relation, HeapTuple tup, TransactionId xid,
Assert(!HeapTupleHasExternal(tup));
return tup;
}
- else if (HeapTupleHasExternal(tup) || tup->t_len > TOAST_TUPLE_THRESHOLD)
+ else if (HeapTupleHasExternal(tup) ||
+ tup->t_len > compress_tuple_threshold)
return toast_insert_or_update(relation, tup, NULL, options);
else
return tup;
@@ -3390,9 +3410,30 @@ l2:
need_toast = false;
}
else
+ {
+ int toast_tuple_threshold;
+ int compress_tuple_threshold;
+
+ /*
+ * Compute the compressibility threshold, which is the minimum between
+ * toast_tuple_target and compress_tuple_target for a relation.
+ */
+ toast_tuple_threshold = RelationGetToastTupleTarget(relation,
+ TOAST_TUPLE_TARGET);
+ compress_tuple_threshold = RelationGetCompressTupleTarget(relation,
+ COMPRESS_TUPLE_TARGET);
+ compress_tuple_threshold = Min(compress_tuple_threshold,
+ toast_tuple_threshold);
+
+ /*
+ * compress_tuple_threshold must be less than or equal to
+ * toast_tuple_threshold, so it is enough to only test
+ * compress_tuple_threshold here.
+ */
need_toast = (HeapTupleHasExternal(&oldtup) ||
HeapTupleHasExternal(newtup) ||
- newtup->t_len > TOAST_TUPLE_THRESHOLD);
+ newtup->t_len > compress_tuple_threshold);
+ }
pagefree = PageGetHeapFreeSpace(page);
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index bce4274362..a7d5a29d36 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -636,11 +636,29 @@ raw_heap_insert(RewriteState state, HeapTuple tup)
Size len;
OffsetNumber newoff;
HeapTuple heaptup;
+ int toast_tuple_threshold;
+ int compress_tuple_threshold;
+
+ /*
+ * Compute the compressibility threshold, which is the minimum between
+ * toast_tuple_target and compress_tuple_target for a relation.
+ */
+ toast_tuple_threshold = RelationGetToastTupleTarget(state->rs_new_rel,
+ TOAST_TUPLE_TARGET);
+ compress_tuple_threshold = RelationGetCompressTupleTarget(state->rs_new_rel,
+ COMPRESS_TUPLE_TARGET);
+ compress_tuple_threshold = Min(compress_tuple_threshold,
+ toast_tuple_threshold);
/*
* If the new tuple is too big for storage or contains already toasted
* out-of-line attributes from some other relation, invoke the toaster.
*
+ * The toaster is invoked only if the tuple length is greater than the
+ * compression limit. Note that compress_tuple_threshold must be less
+ * than or equal to toast_tuple_threshold, so it's enough to only test
+ * for compress_tuple_threshold.
+ *
* Note: below this point, heaptup is the data we actually intend to store
* into the relation; tup is the caller's original untoasted data.
*/
@@ -650,7 +668,8 @@ raw_heap_insert(RewriteState state, HeapTuple tup)
Assert(!HeapTupleHasExternal(tup));
heaptup = tup;
}
- else if (HeapTupleHasExternal(tup) || tup->t_len > TOAST_TUPLE_THRESHOLD)
+ else if (HeapTupleHasExternal(tup) ||
+ tup->t_len > compress_tuple_threshold)
{
int options = HEAP_INSERT_SKIP_FSM;
diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c
index 74e957abb7..2bfbc4195a 100644
--- a/src/backend/access/heap/tuptoaster.c
+++ b/src/backend/access/heap/tuptoaster.c
@@ -549,6 +549,7 @@ toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup,
bool has_nulls = false;
Size maxDataLen;
+ Size maxCompressLen;
Size hoff;
char toast_action[MaxHeapAttributeNumber];
@@ -731,12 +732,19 @@ toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup,
/* now convert to a limit on the tuple data size */
maxDataLen = RelationGetToastTupleTarget(rel, TOAST_TUPLE_TARGET) - hoff;
+ /*
+ * Get the limit at which we should apply compression. This will be same as
+ * maxDataLen unless overridden by the user explicitly.
+ */
+ maxCompressLen = RelationGetCompressTupleTarget(rel, COMPRESS_TUPLE_TARGET) - hoff;
+ maxCompressLen = Min(maxCompressLen, maxDataLen);
+
/*
* Look for attributes with attstorage 'x' to compress. Also find large
* attributes with attstorage 'x' or 'e', and store them external.
*/
while (heap_compute_data_size(tupleDesc,
- toast_values, toast_isnull) > maxDataLen)
+ toast_values, toast_isnull) > maxCompressLen)
{
int biggest_attno = -1;
int32 biggest_size = MAXALIGN(TOAST_POINTER_SIZE);
@@ -881,7 +889,7 @@ toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup,
* compression
*/
while (heap_compute_data_size(tupleDesc,
- toast_values, toast_isnull) > maxDataLen)
+ toast_values, toast_isnull) > maxCompressLen)
{
int biggest_attno = -1;
int32 biggest_size = MAXALIGN(TOAST_POINTER_SIZE);
diff --git a/src/include/access/tuptoaster.h b/src/include/access/tuptoaster.h
index 4bfefffbf3..0d6e7c43da 100644
--- a/src/include/access/tuptoaster.h
+++ b/src/include/access/tuptoaster.h
@@ -33,10 +33,19 @@
/ (tuplesPerPage))
/*
- * These symbols control toaster activation. If a tuple is larger than
- * TOAST_TUPLE_THRESHOLD, we will try to toast it down to no more than
- * TOAST_TUPLE_TARGET bytes through compressing compressible fields and
- * moving EXTENDED and EXTERNAL data out-of-line.
+ * These symbols control toaster and compression activation. If a tuple
+ * is larger than COMPRESS_TUPLE_TARGET, we will try to compress it first
+ * if its column is marked as MAIN or EXTENDED. If the compressed row is
+ * still greater than TOAST_TUPLE_TARGET, then we will try to toast it
+ * down to TOAST_TUPLE_TARGET bytes through compressing compressible
+ * fields and moving EXTENDED and EXTERNAL data out-of-line.
+ *
+ * By default, COMPRESS_TUPLE_TARGET and TOAST_TUPLE_TARGET have the same
+ * threshold values, meaning that compression will only be tried when toasting
+ * a tuple. It is possible to tune that behavior using the relation option
+ * COMPRESS_TUPLE_TARGET. If COMPRESS_TUPLE_TARGET is higher than
+ * TOAST_TUPLE_TARGET, then only TOAST_TUPLE_TARGET is used to decide if a
+ * tuple is toasted and compressed.
*
* The numbers need not be the same, though they currently are. It doesn't
* make sense for TARGET to exceed THRESHOLD, but it could be useful to make
@@ -56,6 +65,8 @@
#define TOAST_TUPLE_TARGET TOAST_TUPLE_THRESHOLD
+#define COMPRESS_TUPLE_TARGET TOAST_TUPLE_THRESHOLD
+
/*
* The code will also consider moving MAIN data out-of-line, but only as a
* last resort if the previous steps haven't reached the target tuple size.
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 54028515a7..9e17e2a599 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -263,6 +263,7 @@ typedef struct StdRdOptions
/* fraction of newly inserted tuples prior to trigger index cleanup */
float8 vacuum_cleanup_index_scale_factor;
int toast_tuple_target; /* target for tuple toasting */
+ int compress_tuple_target; /* target for tuple compression */
AutoVacOpts autovacuum; /* autovacuum-related options */
bool user_catalog_table; /* use as an additional catalog relation */
int parallel_workers; /* max number of parallel workers */
@@ -279,6 +280,14 @@ typedef struct StdRdOptions
((relation)->rd_options ? \
((StdRdOptions *) (relation)->rd_options)->toast_tuple_target : (defaulttarg))
+/*
+ * RelationGetCompressTupleTarget
+ * Returns the relation's compress_tuple_target. Note multiple eval of argument!
+ */
+#define RelationGetCompressTupleTarget(relation, defaulttarg) \
+ ((relation)->rd_options ? \
+ ((StdRdOptions *) (relation)->rd_options)->compress_tuple_target : (defaulttarg))
+
/*
* RelationGetFillFactor
* Returns the relation's fillfactor. Note multiple eval of argument!
diff --git a/src/test/regress/expected/strings.out b/src/test/regress/expected/strings.out
index 189bdffdca..66757ba8c4 100644
--- a/src/test/regress/expected/strings.out
+++ b/src/test/regress/expected/strings.out
@@ -1209,7 +1209,69 @@ select 0 = pg_relation_size('pg_toast.pg_toast_'||(select oid from pg_class wher
t
(1 row)
+-- Test tuple compression with compress_tuple_target
+CREATE TABLE compresstest250 (a int, b text)
+ WITH (compress_tuple_target = 250);
+CREATE TABLE compresstest2040 (a int, b text)
+ WITH (compress_tuple_target = 2040);
+-- tuple should not get compressed nor toasted
+INSERT INTO compresstest250 VALUES (1, repeat('1234567890',20));
+-- tuple should get compressed but not toasted
+INSERT INTO compresstest250 VALUES (2, repeat('1234567890',30));
+-- tuple should not get compressed nor toasted
+INSERT INTO compresstest2040 VALUES (1, repeat('1234567890',20));
+-- tuple should not get compressed nor toasted
+INSERT INTO compresstest2040 VALUES (2, repeat('1234567890',30));
+SELECT a, pg_column_size(b) < 250 FROM compresstest250 ORDER BY a;
+ a | ?column?
+---+----------
+ 1 | t
+ 2 | t
+(2 rows)
+
+SELECT a, pg_column_size(b) < 250 FROM compresstest2040 ORDER BY a;
+ a | ?column?
+---+----------
+ 1 | t
+ 2 | f
+(2 rows)
+
+-- expect 0 blocks in toast relation
+SELECT pg_relation_size(reltoastrelid) = 0 AS data_size
+ FROM pg_class WHERE relname = 'compresstest250';
+ data_size
+-----------
+ t
+(1 row)
+
+SELECT pg_relation_size(reltoastrelid) = 0 AS data_size
+ FROM pg_class WHERE relname = 'compresstest2040';
+ data_size
+-----------
+ t
+(1 row)
+
+-- tuple should get compressed and toasted
+INSERT INTO compresstest250 SELECT 3, string_agg('', md5(random()::text))
+ FROM generate_series(1,300);
+SELECT a, pg_column_size(b) < 250 FROM compresstest250 ORDER by a;
+ a | ?column?
+---+----------
+ 1 | t
+ 2 | t
+ 3 | f
+(3 rows)
+
+-- expect > 0 blocks in toast relation
+SELECT pg_relation_size(reltoastrelid) = 0 AS data_size
+ FROM pg_class WHERE relname = 'compresstest250';
+ data_size
+-----------
+ f
+(1 row)
+
DROP TABLE toasttest;
+DROP TABLE compresstest250, compresstest2040;
--
-- test substr with toasted bytea values
--
diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql
index f2203ef1b1..e0ce45dd68 100644
--- a/src/test/regress/sql/strings.sql
+++ b/src/test/regress/sql/strings.sql
@@ -386,7 +386,36 @@ INSERT INTO toasttest values (repeat('1234567890',300));
-- expect 0 blocks
select 0 = pg_relation_size('pg_toast.pg_toast_'||(select oid from pg_class where relname = 'toasttest'))/current_setting('block_size')::integer as blocks;
+-- Test tuple compression with compress_tuple_target
+CREATE TABLE compresstest250 (a int, b text)
+ WITH (compress_tuple_target = 250);
+CREATE TABLE compresstest2040 (a int, b text)
+ WITH (compress_tuple_target = 2040);
+-- tuple should not get compressed nor toasted
+INSERT INTO compresstest250 VALUES (1, repeat('1234567890',20));
+-- tuple should get compressed but not toasted
+INSERT INTO compresstest250 VALUES (2, repeat('1234567890',30));
+-- tuple should not get compressed nor toasted
+INSERT INTO compresstest2040 VALUES (1, repeat('1234567890',20));
+-- tuple should not get compressed nor toasted
+INSERT INTO compresstest2040 VALUES (2, repeat('1234567890',30));
+SELECT a, pg_column_size(b) < 250 FROM compresstest250 ORDER BY a;
+SELECT a, pg_column_size(b) < 250 FROM compresstest2040 ORDER BY a;
+-- expect 0 blocks in toast relation
+SELECT pg_relation_size(reltoastrelid) = 0 AS data_size
+ FROM pg_class WHERE relname = 'compresstest250';
+SELECT pg_relation_size(reltoastrelid) = 0 AS data_size
+ FROM pg_class WHERE relname = 'compresstest2040';
+-- tuple should get compressed and toasted
+INSERT INTO compresstest250 SELECT 3, string_agg('', md5(random()::text))
+ FROM generate_series(1,300);
+SELECT a, pg_column_size(b) < 250 FROM compresstest250 ORDER by a;
+-- expect > 0 blocks in toast relation
+SELECT pg_relation_size(reltoastrelid) = 0 AS data_size
+ FROM pg_class WHERE relname = 'compresstest250';
+
DROP TABLE toasttest;
+DROP TABLE compresstest250, compresstest2040;
--
-- test substr with toasted bytea values
signature.asc
Description: PGP signature
