Thanks for the review Mark! Sorry it took too long to reply on my side. > 28 июня 2021 г., в 21:05, Mark Dilger <[email protected]> > написал(а): > >> #define PGLZ_HISTORY_SIZE 0x0fff - 1 /* to avoid compare in iteration >> */ > ... >> static PGLZ_HistEntry hist_entries[PGLZ_HISTORY_SIZE + 1]; > ... >> if (hist_next == PGLZ_HISTORY_SIZE + 1) > > These are the only uses of PGLZ_HISTORY_SIZE. Perhaps you could just defined > the symbol as 0x0fff and skip the -1 and +1 business? Fixed.
>> /* ----------
>> * pglz_compare -
>> *
>> * Compares 4 bytes at pointers
>> * ----------
>> */
>> static inline bool
>> pglz_compare32(const void *ptr1, const void *ptr2)
>> {
>> return memcmp(ptr1, ptr2, 4) == 0;
>> }
>
> The comment function name differs from the actual function name.
>
> Also, pglz_compare returns an offset into the string, whereas pglz_compare32
> returns a boolean. This is fairly unintuitive. The "32" part of
> pglz_compare32 implies doing the same thing as pglz_compare but where the
> string is known to be 4 bytes in length. Given that pglz_compare32 is
> dissimilar to pglz_compare, perhaps avoid using /pglz_compare/ in its name?
I've removed pglz_compare32 entirely. It was a simple substitution for memcmp().
>
>> /*
>> * Determine length of match. A better match must be larger than the
>> * best so far. And if we already have a match of 16 or more bytes,
>> * it's worth the call overhead to use memcmp()
>
> This comment is hard to understand, given the code that follows. The first
> block calls memcmp(), which seems to be the function overhead you refer to.
> The second block calls the static inline function pglz_compare32, which
> internally calls memcmp(). Superficially, there seems to be a memcmp()
> function call either way. The difference is that in the first block's call
> to memcmp(), the length is a runtime value, and in the second block, it is a
> compile-time known value. If you are depending on the compiler to notice
> this distinction and optimize the second call, perhaps you can mention that
> explicitly? Otherwise, reading and understanding the comment takes more
> effort.
I've updated comment for second branch with fixed-size memcmp(). Frankly, I'm
not sure "if (memcmp(input_pos, hist_pos, 4) == 0)" worth the complexity,
internals of "pglz_compare(0, len_bound, input_pos + 0, hist_pos + 0);" would
do almost same instructions.
>
> I took a quick look for other places in the code that try to beat the
> performance of memcmp on short strings. In varlena.c, rest_of_char_same()
> seems to do so. We also use comparisons on NameData, which frequently
> contains strings shorter than 16 bytes. Is it worth sharting a static inline
> function that uses your optimization in other places? How confident are you
> that your optimization really helps?
Honestly, I do not know. The overall patch effect consists of stacking up many
small optimizations. They have a net effect, but are too noisy to measure
independently. That's mostly the reason why I didn't know what to reply for so
long.
> 5 нояб. 2021 г., в 01:47, Tomas Vondra <[email protected]>
> написал(а):
>
> Andrey, can you update the patch per Mark's review? I'll do my best to get it
> committed sometime in this CF.
Cool! Here's the patch.
Best regards, Andrey Borodin.
v6-0001-Reorganize-pglz-compression-code.patch
Description: Binary data
