On Wed, Sep 4, 2019 at 12:22 PM Andrey Borodin <x4...@yandex-team.ru> wrote:
>
> Hi, Peter! Thanks for looking into this.
>
> > On Sep 4, 2019, at 14:09, Peter Eisentraut
> > <peter.eisentr...@2ndquadrant.com> wrote:
> >
> > On 2019-06-24 10:44, Andrey Borodin wrote:
> >>> On May 18, 2019, at 11:44, Andrey Borodin <x4...@yandex-team.ru> wrote:
> >>>
> >> Hi!
> >> Here's a rebased version of the patches.
> >>
> >> Best regards, Andrey Borodin.
> >
> > I think this is the most recent patch for the CF entry
> > <https://commitfest.postgresql.org/24/2119/>.
> >
> > What about the two patches?  Which one is better?
> In our observations pglz_decompress_hacked.patch is best on most of the
> tested platforms.
> The difference is that pglz_decompress_hacked8.patch will not apply the
> optimization if the decompressed match is not greater than 8 bytes. This
> optimization was suggested by Tom, that's why we benchmarked it specifically.
>
> > Have you also considered using memmove() to deal with the overlap issue?
> Yes, memmove() resolves the ambiguity of copying overlapping regions in a way
> that is not compatible with pglz. In the proposed patch we never copy
> overlapping regions.
>
> > Benchmarks have been posted in this thread.  Where is the benchmarking
> > tool?  Should we include that in the source somehow?
>
> The benchmarking tool is here [0]. Well, the code of the benchmarking tool
> does not adhere to our standards in some places, so we did not consider its
> inclusion in core.
> However, the most questionable part of the benchmarking is the choice of test
> data. It's about 100 MB of useless WALs, a datafile and valuable Shakespeare
> writings.
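
To illustrate the memmove() point quoted above: an LZ-style match whose offset is smaller than its length is expected to repeat the bytes just written, whereas memmove() copies the source bytes as they were before the call. Below is a minimal sketch of that difference, together with one way to reproduce the byte-by-byte result using only non-overlapping memcpy() calls. This is not the code from the patch; the function names copy_match_bytewise and copy_match_chunked are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Reference semantics: forward byte-by-byte copy of a match that starts
 * "off" bytes behind the write position dp. When off < len, bytes written
 * earlier in this same call are read again, so the pattern repeats.
 */
void
copy_match_bytewise(unsigned char *dp, size_t off, size_t len)
{
	assert(off > 0);
	while (len-- > 0)
	{
		*dp = *(dp - off);
		dp++;
	}
}

/*
 * Same result using memcpy() only on non-overlapping regions: copy the
 * "off" bytes that are already complete, then double the offset, because
 * the completed run behind dp is now twice as long.
 */
void
copy_match_chunked(unsigned char *dp, size_t off, size_t len)
{
	assert(off > 0);
	while (off < len)
	{
		memcpy(dp, dp - off, off);	/* source ends exactly where dest starts */
		len -= off;
		dp += off;
		off += off;
	}
	memcpy(dp, dp - off, len);		/* here len <= off, so no overlap */
}
```

For a zero-filled buffer that starts with "ab", a match with offset 2 and length 6 written at position 2 yields "abababab" with either function. memmove(buf + 2, buf, 6) instead copies the six bytes that were in the buffer before the call, four of which the decompressor has not written yet, so the pattern is not repeated.
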

Why not use the Silesia compression corpus
(http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia), which is used by
lzbench (https://github.com/inikep/lzbench)? Teodor and I recall that testing
on non-English texts can be very important.

>
> Best regards, Andrey Borodin.
>
>
> [0] https://github.com/x4m/test_pglz
>
>

--
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company