> On Apr 9, 2019, at 10:09 AM, Andrey Borodin <x4...@yandex-team.ru> wrote:
>
> He advised me to use algorithm that splits copied regions into smaller
> non-overlapping subregions with exponentially increasing size.
>
> while (off <= len)
> {
>     memcpy(dp, dp - off, off);
>     len -= off;
>     dp += off;
>     off *= 2;
> }
> memcpy(dp, dp - off, len);
>
> On original Paul's test without patch of this thread this optimization gave
> about x2.5 speedup.
> I've composed more detailed tests[0] and tested against current master. Now
> it only gives 20%-25% of decompression speedup, but I think it is still
> useful.

Wow, well beyond slicing, just being able to decompress 25% faster is a win for pretty much any TOAST use case.

I guess the $100 question is: portability? The whole reason for the old-skool code that’s there now was concern about memcpy’ing overlapping addresses and Bad Things happening.

P.
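
On the portability question, a minimal sketch may help. The C standard leaves memcpy() undefined when source and destination overlap, but in the quoted loop every call copies at most `off` bytes from `dp - off` to `dp`, so the two regions never overlap. The function names below (`copy_match`, `copy_match_bytewise`) are hypothetical and are not taken from the PostgreSQL source. The byte-at-a-time version is only an assumed stand-in for the existing loop, there to compare results against.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: append len bytes at dp, repeating the data that
 * starts off bytes behind dp. Mirrors the loop quoted above.
 */
void
copy_match(unsigned char *dp, size_t off, size_t len)
{
    assert(off > 0);            /* off == 0 would never terminate */

    while (off <= len)
    {
        /* source [dp - off, dp) and destination [dp, dp + off) are disjoint */
        memcpy(dp, dp - off, off);
        len -= off;
        dp += off;
        off *= 2;               /* the repeated run behind dp has doubled */
    }
    /* here len < off, so the final copy cannot overlap either */
    memcpy(dp, dp - off, len);
}

/* Assumed reference behaviour: copy one byte at a time, overlap allowed. */
void
copy_match_bytewise(unsigned char *dp, size_t off, size_t len)
{
    while (len-- > 0)
    {
        *dp = *(dp - off);
        dp++;
    }
}
```

With a 3-byte history "abc" and a request for 10 more bytes at offset 3, both functions should leave "abcabcabcabca" in the buffer. The doubling version gets there with memcpy() calls of 3, 6 and 1 bytes. This only shows that the calls stay within what memcpy() guarantees. It says nothing about how fast a given platform's memcpy() handles such short copies.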