Re: pglz compression performance, take two

Andrey Borodin Sun, 27 Jun 2021 03:42:00 -0700


> On 20 March 2021, at 00:35, Mark Dilger <mark.dil...@enterprisedb.com> 
> wrote:
> 
> 
> 
>> On Jan 21, 2021, at 6:48 PM, Justin Pryzby <pry...@telsasoft.com> wrote:
>> 
>> @cfbot: rebased
>> <0001-Reorganize-pglz-compression-code.patch>
> 
> Review comments.


Thanks for the review, Mark!
Sorry for such a long delay; I've been trying to figure out a way to make 
things less platform-dependent.
Here's what I've come up with.

We do not use pglz_read32() the way xxhash and lz4 do: we do not really need 
to obtain a 4-byte value, we only need to compare 4 bytes at once.
So, essentially, we need to compare two implementations of a 4-byte comparison:

bool
cmp_a(const void *ptr1, const void *ptr2)
{
    return *(const uint32_t *) ptr1 == *(const uint32_t *) ptr2;
}

bool
cmp_b(const void *ptr1, const void *ptr2)
{
    return memcmp(ptr1, ptr2, 4) == 0;
}

Variant B is more portable. Inspecting it in Godbolt's Compiler Explorer, I 
found that GCC 7.1+ generates assembly for it without a memcmp() call. On 
x86-64 and ARM64 the assembly of cmp_b is identical to that of cmp_a.
So I think we could just stick with cmp_b instead of optimising separately 
for ARM6 and similar platforms such as Arduino.
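
To make the intended use concrete, here is a minimal sketch of how a cmp_b-style 
comparison could drive match-length counting. This is not code from the patch: 
match_len() and its parameters are hypothetical names used only for 
illustration. It compares 4 bytes at a time via memcmp() and finishes the tail 
byte by byte, so it never needs an aligned 32-bit load.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Variant B from above: portable 4-byte equality check, safe for any alignment. */
bool
cmp_b(const void *ptr1, const void *ptr2)
{
    return memcmp(ptr1, ptr2, 4) == 0;
}

/*
 * Hypothetical helper (not part of the patch): count how many leading bytes
 * of a and b are equal, looking at no more than limit bytes of each.
 */
size_t
match_len(const unsigned char *a, const unsigned char *b, size_t limit)
{
    size_t      len = 0;

    /* Fast path: advance 4 bytes at a time while whole groups match. */
    while (len + 4 <= limit && cmp_b(a + len, b + len))
        len += 4;

    /* Slow path: finish the remaining (at most 3 matching) bytes one by one. */
    while (len < limit && a[len] == b[len])
        len++;

    return len;
}
```

Because both pointers go through memcmp(), the helper can be called at any 
offset into the input buffer; whether the compiler turns the call into a single 
load and compare depends on the target and optimisation level.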

I've benchmarked the patch with "REINDEX table pgbench_accounts" on a database 
initialised by pgbench -i at scale 100. wal_compression was on; other settings 
were at their defaults.
Without the patch it takes ~11055.077 ms on my machine; with the patch it takes 
~9512.411 ms, a 14% speedup overall.

PFA v5.

Thanks!

Best regards, Andrey Borodin.

v5-0001-Reorganize-pglz-compression-code.patch
Description: Binary data
