> On 20 March 2021, at 00:35, Mark Dilger <mark.dil...@enterprisedb.com> wrote:
>
>> On Jan 21, 2021, at 6:48 PM, Justin Pryzby <pry...@telsasoft.com> wrote:
>>
>> @cfbot: rebased
>> <0001-Reorganize-pglz-compression-code.patch>
>
> Review comments.
Thanks for the review, Mark! Sorry for such a long delay; I've been trying to figure out a way to make things less platform-dependent, and here's what I've come up with.

We do not use pglz_read32() the way xxhash and lz4 do: we do not really need to obtain a 4-byte value, we only need to compare 4 bytes at once. So, essentially, we need to compare two implementations of a 4-byte comparison:

    bool
    cmp_a(const void *ptr1, const void *ptr2)
    {
        return *(const uint32_t *) ptr1 == *(const uint32_t *) ptr2;
    }

    bool
    cmp_b(const void *ptr1, const void *ptr2)
    {
        return memcmp(ptr1, ptr2, 4) == 0;
    }

Variant B is more portable. Inspecting it in Godbolt's Compiler Explorer, I found that GCC 7.1+ generates assembly for it without a memcmp() call. For x86-64 and ARM64 the assembly of cmp_b is identical to that of cmp_a. So I think we could just stick with cmp_b instead of optimising for ARM6 and similar architectures like Arduino.

I've benchmarked the patch with "REINDEX table pgbench_accounts" on a pgbench -i database of scale 100. wal_compression was on, other settings were default. Without the patch it takes ~11055.077 ms on my machine; with the patch it takes ~9512.411 ms, a 14% speedup overall.

PFA v5.

Thanks!

Best regards, Andrey Borodin.
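To illustrate how a memcmp()-based 4-byte comparison can be used when scanning for a match, here is a minimal sketch. It is not taken from the patch: cmp4() mirrors cmp_b, and match_len() is a hypothetical helper invented for this example. It assumes only that both buffers have at least len readable bytes; no alignment is required, which is the point of the memcmp() variant.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Portable 4-byte equality test, equivalent to cmp_b. */
static inline bool
cmp4(const void *ptr1, const void *ptr2)
{
    return memcmp(ptr1, ptr2, 4) == 0;
}

/*
 * Hypothetical helper (not part of the patch): return the length of the
 * common prefix of a and b, looking at no more than len bytes.  Whole
 * 4-byte groups are compared first; the remainder is compared bytewise.
 */
static size_t
match_len(const unsigned char *a, const unsigned char *b, size_t len)
{
    size_t      n = 0;

    /* Fast path: advance 4 bytes at a time while the groups are equal. */
    while (len - n >= 4 && cmp4(a + n, b + n))
        n += 4;

    /* Slow path: finish the tail, or locate the first differing byte. */
    while (n < len && a[n] == b[n])
        n++;

    return n;
}
```

Because memcmp() accepts arbitrary pointers, a + n and b + n may be unaligned here without invoking undefined behaviour, unlike the uint32_t cast in cmp_a.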
v5-0001-Reorganize-pglz-compression-code.patch
Description: Binary data