Re: Clear empty space in a page.

2021-05-31 Thread Yura Sokolov

Hi,

Andres Freund wrote 2021-05-31 00:07:
> Hi,
>
> On 2021-05-30 03:10:26 +0300, Yura Sokolov wrote:
>> While this result is not directly applied to stock PostgreSQL, I believe
>> page compression is important for full_page_writes with wal_compression
>> enabled. And probably when PostgreSQL is used on filesystem with
>> compression enabled (ZFS?).
>
> I don't think the former is relevant, because the hole is skipped in wal page
> compression (at some cost).


Ah, I forgot about that. Yep, you are right.


>> Therefore I propose clearing page's empty space with zero in
>> PageRepairFragmentation, PageIndexMultiDelete, PageIndexTupleDelete and
>> PageIndexTupleDeleteNoCompact.
>>
>> Sorry, didn't measure impact on raw performance yet.
>
> I'm worried that this might cause O(n^2) behaviour in some cases, by
> repeatedly memset'ing the same mostly already zeroed space to 0. Why do we
> ever need to do memset_hole() instead of accurately just zeroing out the space
> that was just vacated?


It is done exactly this way: memset_hole accepts "old_pd_upper" and clears
only the space between the old pd_upper and the new one.
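
A minimal standalone sketch of that idea, not taken from the patch: SketchPage,
SKETCH_PAGE_SIZE and sketch_zero_vacated are hypothetical names, and the struct
is a simplified stand-in for a real PostgreSQL page. It only illustrates that
the cost is proportional to the bytes vacated since old_pd_upper was recorded,
not to the size of the whole hole.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 8192	/* assumed page size */

/* Hypothetical stand-in for a page: only the free-space upper bound matters. */
typedef struct SketchPage
{
	uint16_t	pd_upper;		/* offset where tuple data starts */
	unsigned char data[SKETCH_PAGE_SIZE];
} SketchPage;

/*
 * Zero only the bytes in [old_pd_upper, pd_upper), i.e. the space vacated
 * since old_pd_upper was recorded. Returns the number of bytes written.
 */
static size_t
sketch_zero_vacated(SketchPage *page, uint16_t old_pd_upper)
{
	size_t		vacated;

	assert(page->pd_upper <= SKETCH_PAGE_SIZE);
	assert(old_pd_upper <= SKETCH_PAGE_SIZE);

	if (page->pd_upper <= old_pd_upper)
		return 0;				/* hole did not grow: nothing to clear */

	vacated = (size_t) (page->pd_upper - old_pd_upper);
	memset(page->data + old_pd_upper, 0, vacated);
	return vacated;
}
```

Calling it a second time with the same boundaries writes the same range again,
so a caller would record old_pd_upper right before each compaction rather than
reuse a stale value.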

regards,
Yura




Re: Clear empty space in a page.

2021-05-30 Thread Andres Freund
Hi,

On 2021-05-30 03:10:26 +0300, Yura Sokolov wrote:
> While this result is not directly applied to stock PostgreSQL, I believe
> page compression is important for full_page_writes with wal_compression
> enabled. And probably when PostgreSQL is used on filesystem with
> compression enabled (ZFS?).

I don't think the former is relevant, because the hole is skipped in wal page
compression (at some cost).


> Therefore I propose clearing page's empty space with zero in
> PageRepairFragmentation, PageIndexMultiDelete, PageIndexTupleDelete and
> PageIndexTupleDeleteNoCompact.
> 
> Sorry, didn't measure impact on raw performance yet.

I'm worried that this might cause O(n^2) behaviour in some cases, by
repeatedly memset'ing the same mostly already zeroed space to 0. Why do we
ever need to do memset_hole() instead of accurately just zeroing out the space
that was just vacated?

Greetings,

Andres Freund




Re: Clear empty space in a page.

2021-05-30 Thread Omar Kilani
Hi,

I happened to be running some postgres on zfs on Linux/aarch64 tests
and tested this patch.

Kernel: 4.18.0-305.el8.aarch64
CPU: 16x3.0GHz Ampere Altra / Arm Neoverse N1 cores

ZFS: 2.1.0-rc6
ZFS options: options spl spl_kmem_cache_slab_limit=65536 (see:
https://github.com/openzfs/zfs/issues/12150)

Postgres: 13.3 with and without the patch
Postgres config:

full_page_writes = on
wal_compression = on

Without patch:

starting vacuum...end.
transaction type: 
scaling factor: 100
query mode: prepared
number of clients: 32
number of threads: 32
duration: 43200 s
number of transactions actually processed: 612557228
latency average = 2.257 ms
tps = 14179.551402 (including connections establishing)
tps = 14179.553286 (excluding connections establishing)

With patch:

starting vacuum...end.
transaction type: 
scaling factor: 100
query mode: prepared
number of clients: 32
number of threads: 32
duration: 43200 s
number of transactions actually processed: 606967295
latency average = 2.278 ms
tps = 14050.164370 (including connections establishing)
tps = 14050.166007 (excluding connections establishing)

It does seem to help with on disk compression but it *might* have
caused more fragmentation.

Regards,
Omar

On Sat, May 29, 2021 at 10:22 PM Fabien COELHO wrote:
>
>
> Hello Yura,
>
> > didn't measure impact on raw performance yet.
>
> Must be done. There c/should be a guc to control this behavior if the
> performance impact is noticeable.
>
> --
> Fabien.
>
>




Re: Clear empty space in a page.

2021-05-29 Thread Fabien COELHO



Hello Yura,


> didn't measure impact on raw performance yet.


Must be done. There c/should be a guc to control this behavior if the 
performance impact is noticeable.


--
Fabien.




Clear empty space in a page.

2021-05-29 Thread Yura Sokolov

Good day.

A long time ago I played with a proprietary "compressed storage"
patch on a heavily updated table, and found that empty pages (i.e. those
cleaned by vacuum) are not compressed enough.

When a table is stress-updated, pages for new row versions are allocated
in a round-robin fashion, therefore some 1GB segments contain almost
no live tuples. Vacuum removes dead tuples, but the segments remain large
after compression (>400MB), as if they were still full.

After some investigation I found it is because PageRepairFragmentation and
PageIndex*Delete* don't clear the space that just became empty, therefore it
still contains garbage data. Clearing it with memset greatly increases the
compression ratio: some compressed relation segments become 30-60MB just
after vacuum removes tuples in them.
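
To illustrate the effect in isolation, here is a rough standalone sketch. It
is not the proprietary patch and not a real compressor: sketch_fill_garbage and
sketch_rle_size are hypothetical helpers, and a naive run-length encoding
stands in for whatever algorithm the storage layer actually uses. Pseudo-random
bytes play the role of leftover tuple data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 8192	/* assumed page size */

/* Fill buf with deterministic pseudo-random bytes (stand-in for dead tuples). */
static void
sketch_fill_garbage(unsigned char *buf, size_t len, uint32_t seed)
{
	size_t		i;

	for (i = 0; i < len; i++)
	{
		seed = seed * 1664525u + 1013904223u;	/* simple LCG step */
		buf[i] = (unsigned char) (seed >> 24);
	}
}

/*
 * Size in bytes of a naive run-length encoding: each run of up to 255
 * equal bytes costs two bytes (length, value).
 */
static size_t
sketch_rle_size(const unsigned char *buf, size_t len)
{
	size_t		out = 0;
	size_t		i = 0;

	while (i < len)
	{
		size_t		run = 1;

		while (i + run < len && run < 255 && buf[i + run] == buf[i])
			run++;
		out += 2;
		i += run;
	}
	return out;
}
```

Encoding a page full of garbage and then the same page with most of it zeroed
by memset shows the second encoding is much smaller; the exact ratio with a
real compressor will of course differ.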

While this result is not directly applied to stock PostgreSQL, I believe
page compression is important for full_page_writes with wal_compression
enabled. And probably when PostgreSQL is used on filesystem with
compression enabled (ZFS?).

Therefore I propose clearing page's empty space with zero in
PageRepairFragmentation, PageIndexMultiDelete, PageIndexTupleDelete and
PageIndexTupleDeleteNoCompact.

Sorry, didn't measure impact on raw performance yet.

regards,
Yura Sokolov aka funny_falcon

commit 6abfcaeb87fcb396c5e2dccd434ce2511314ff76
Author: Yura Sokolov 
Date:   Sun May 30 02:39:17 2021 +0300

Clear empty space in a page

Write zeroes to just cleared space in PageRepairFragmentation,
PageIndexTupleDelete, PageIndexMultiDelete and PageIndexTupleDeleteNoCompact.

It helps increase compression ratio on compression-enabled filesystems
and with full_page_writes and wal_compression enabled.

diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index 82ca91f5977..7deb6cc71a4 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -681,6 +681,17 @@ compactify_tuples(itemIdCompact itemidbase, int nitems, Page page, bool presorte
 	phdr->pd_upper = upper;
 }
 
+/*
+ * Zero space between old_pd_upper and new pd_upper for better page compression.
+ */
+static void
+memset_hole(Page page, LocationIndex old_pd_upper)
+{
+	PageHeader	phdr = (PageHeader) page;
+	if (phdr->pd_upper > old_pd_upper)
+		MemSet((char *)page + old_pd_upper, 0, phdr->pd_upper - old_pd_upper);
+}
+
 /*
  * PageRepairFragmentation
  *
@@ -797,6 +808,7 @@ PageRepairFragmentation(Page page)
 
 		compactify_tuples(itemidbase, nstorage, page, presorted);
 	}
+	memset_hole(page, pd_upper);
 
 	/* Set hint bit for PageAddItemExtended */
 	if (nunused > 0)
@@ -1114,6 +1126,7 @@ PageIndexTupleDelete(Page page, OffsetNumber offnum)
 
 	if (offset > phdr->pd_upper)
 		memmove(addr + size, addr, offset - phdr->pd_upper);
+	MemSet(addr, 0, size);
 
 	/* adjust free space boundary pointers */
 	phdr->pd_upper += size;
@@ -1271,6 +1284,7 @@ PageIndexMultiDelete(Page page, OffsetNumber *itemnos, int nitems)
 		compactify_tuples(itemidbase, nused, page, presorted);
 	else
 		phdr->pd_upper = pd_special;
+	memset_hole(page, pd_upper);
 }
 
 
@@ -1351,6 +1365,7 @@ PageIndexTupleDeleteNoCompact(Page page, OffsetNumber offnum)
 
 	if (offset > phdr->pd_upper)
 		memmove(addr + size, addr, offset - phdr->pd_upper);
+	MemSet(addr, 0, size);
 
 	/* adjust free space boundary pointer */
 	phdr->pd_upper += size;