On 01/15/2016 05:58 PM, Bernd Schmidt wrote:
> One question Richard posed in the comments: why aren't we optimizing small constant size memcmps other than size 1 to *s == *q? The reason is the return value of memcmp, which implies byte-sized operation (incidentally, the use of SImode in the cmpmem/cmpstr patterns is really odd). It's possible to work around this, but expansion becomes a little more tricky (subtract after bswap, maybe).
When I did this (big-endian conversion, wide subtract, sign) to the tail difference check in glibc's x86_64 memcmp, it was actually a bit faster than isolating the differing byte and returning its difference, even for non-random data such as encountered during qsort.
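For illustration, a minimal sketch of that kind of tail check (not the actual glibc code) might look like the following. It assumes a little-endian target with cheap unaligned loads, such as x86_64, and the GCC/Clang __builtin_bswap64 builtin; the helper name and the fixed 8-byte window are invented for the example:

  #include <stdint.h>
  #include <string.h>

  /* Compare the last 8 bytes of two buffers (assumes len >= 8) that
     are known to differ somewhere in that window, using wide loads,
     a byte swap and a sign computation instead of isolating the
     differing byte.  */
  static int
  tail_diff (const unsigned char *s, const unsigned char *q, size_t len)
  {
    uint64_t a, b;
    memcpy (&a, s + len - 8, 8);   /* unaligned 8-byte loads */
    memcpy (&b, q + len - 8, 8);
    a = __builtin_bswap64 (a);     /* big-endian order makes the numeric */
    b = __builtin_bswap64 (b);     /* comparison match the byte-wise one */
    /* The full 64-bit difference does not fit in int, so return the
       sign of the comparison rather than the difference itself.  */
    return (a > b) - (a < b);
  }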
> * Expand __memcmp_eq for small constant sizes with loads and comparison, fall back to a memcmp call.
Should we export such a function from glibc? I expect the equality-only case is fairly common. Computing the tail difference costs a few cycles.
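As a hedged illustration of why the equality-only case is cheaper, an expansion for a constant size of 8 can boil down to a single wide comparison, with no byte swap and no tail-difference computation (the function name here is invented; this is not the actual GCC expansion):

  #include <stdint.h>
  #include <string.h>

  /* Equivalent to memcmp (s, q, 8) == 0: only equality is needed, so
     no byte ordering or difference has to be recovered.  */
  static int
  equal8 (const void *s, const void *q)
  {
    uint64_t a, b;
    memcpy (&a, s, 8);             /* unaligned 8-byte loads */
    memcpy (&b, q, 8);
    return a == b;
  }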
It may also make sense to call a streamlined implementation if you have interesting alignment information (for x86_64, that would be at least 16 on one or both inputs, so it's perhaps not easy to come by).
Florian