Re: pglz compression performance, take two

Andrey Borodin Thu, 04 Nov 2021 22:51:19 -0700

Thanks for the review Mark! Sorry it took too long to reply on my side.

> 28 июня 2021 г., в 21:05, Mark Dilger <mark.dil...@enterprisedb.com> 
> написал(а):
> 
>> #define PGLZ_HISTORY_SIZE       0x0fff - 1  /* to avoid compare in iteration 
>> */
> ...
>> static PGLZ_HistEntry hist_entries[PGLZ_HISTORY_SIZE + 1];
> ...
>>    if (hist_next == PGLZ_HISTORY_SIZE + 1)
> 
> These are the only uses of PGLZ_HISTORY_SIZE.  Perhaps you could just defined 
> the symbol as 0x0fff and skip the -1 and +1 business?
Fixed.


>> /* ----------
>> * pglz_compare -
>> *
>> *      Compares 4 bytes at pointers
>> * ----------
>> */
>> static inline bool
>> pglz_compare32(const void *ptr1, const void *ptr2)
>> {
>>    return memcmp(ptr1, ptr2, 4) == 0;
>> }
> 
> The comment function name differs from the actual function name.
> 
> Also, pglz_compare returns an offset into the string, whereas pglz_compare32 
> returns a boolean.  This is fairly unintuitive.  The "32" part of 
> pglz_compare32 implies doing the same thing as pglz_compare but where the 
> string is known to be 4 bytes in length.  Given that pglz_compare32 is 
> dissimilar to pglz_compare, perhaps avoid using /pglz_compare/ in its name?
I've removed pglz_compare32 entirely. It was a simple substitution for memcmp().

> 
>>        /*
>>         * Determine length of match. A better match must be larger than the
>>         * best so far. And if we already have a match of 16 or more bytes,
>>         * it's worth the call overhead to use memcmp()
> 
> This comment is hard to understand, given the code that follows.  The first 
> block calls memcmp(), which seems to be the function overhead you refer to.  
> The second block calls the static inline function pglz_compare32, which 
> internally calls memcmp().  Superficially, there seems to be a memcmp() 
> function call either way.  The difference is that in the first block's call 
> to memcmp(), the length is a runtime value, and in the second block, it is a 
> compile-time known value.  If you are depending on the compiler to notice 
> this distinction and optimize the second call, perhaps you can mention that 
> explicitly?  Otherwise, reading and understanding the comment takes more 
> effort.
I've updated comment for second branch with fixed-size memcmp(). Frankly, I'm 
not sure "if (memcmp(input_pos, hist_pos, 4) == 0)" worth the complexity,  
internals of "pglz_compare(0, len_bound, input_pos + 0, hist_pos + 0);" would 
do almost same instructions.

> 
> I took a quick look for other places in the code that try to beat the 
> performance of memcmp on short strings.  In varlena.c, rest_of_char_same() 
> seems to do so.  We also use comparisons on NameData, which frequently 
> contains strings shorter than 16 bytes.  Is it worth sharting a static inline 
> function that uses your optimization in other places?  How confident are you 
> that your optimization really helps?
Honestly, I do not know. The overall patch effect consists of stacking up many 
small optimizations. They have a net effect, but are too noisy to measure 
independently. That's mostly the reason why I didn't know what to reply for so 
long.


> 5 нояб. 2021 г., в 01:47, Tomas Vondra <tomas.von...@enterprisedb.com> 
> написал(а):
> 
> Andrey, can you update the patch per Mark's review? I'll do my best to get it 
> committed sometime in this CF.

Cool! Here's the patch.

Best regards, Andrey Borodin.

v6-0001-Reorganize-pglz-compression-code.patch
Description: Binary data

Re: pglz compression performance, take two

Reply via email to