https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107006

            Bug ID: 107006
           Summary: Missing optimization: common idiom for external data
           Product: gcc
           Version: 12.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hpa at zytor dot com
  Target Milestone: ---

Created attachment 53602
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53602&action=edit
C test case source

The only *portable* way in C to deal with external data structures containing
data of specific endianness, possibly unaligned, is to operate on them as byte
(char) arrays.
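
For illustration, a minimal sketch of the byte-array idiom described above. The
helper names load_le32 and load_be32 are hypothetical and are not necessarily
those used in the attached test case:

```c
#include <stdint.h>

/* Hypothetical helpers; the attached test case may use different names. */

/* Assemble a 32-bit little-endian value from four bytes at any alignment. */
static inline uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Assemble a 32-bit big-endian value from four bytes at any alignment. */
static inline uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         |  (uint32_t)p[3];
}
```

Each byte is widened to uint32_t before shifting, so the result does not depend
on the host byte order or on the alignment of p.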

At least on x86 (which supports arbitrarily aligned loads), gcc *sometimes*
recognizes these as single loads, but sometimes not.

In the included test cases, there is a plain C implementation and an
implementation wrapped in a C++ class.

Compiling the former with:

gcc -std=c2x -g -O3 -W -Wall -[cSE] -o bswap.[osi] bswap.c

... recognizes the load idiom for 16-bit numbers but not for 32- or 64-bit
numbers.

Compiling the latter with:

gcc -std=c++20 -g -O3 -W -Wall -[cSE] -o bswapcc.[osi] bswapcc.cc

... *additionally* recognizes the 32-bit load, *but only in the big-endian case*
(that is, it generates a load and a bswap instruction); whereas in the
little-endian -- native -- case, this does not happen!

I am familiar with the use of packed arrays and __builtin_bswap*() for these
accesses, but unfortunately these are gcc-specific.
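
For comparison, a sketch of what such a gcc-specific version might look like.
It assumes GCC or a compatible compiler; the type and function names are
hypothetical, and the may_alias attribute is an addition here to keep the
access through the struct pointer well defined under strict aliasing:

```c
#include <stdint.h>

/* GCC extension: a packed type has alignment 1, so the member can be
 * read at any address; may_alias exempts it from strict-aliasing rules. */
struct unaligned_u32 {
    uint32_t v;
} __attribute__((packed, may_alias));

/* Hypothetical helper: load 32 bits in host byte order from any address. */
static inline uint32_t load_native32(const void *p)
{
    return ((const struct unaligned_u32 *)p)->v;
}

/* Hypothetical helper: load a big-endian 32-bit value. */
static inline uint32_t load_be32_gnu(const void *p)
{
    uint32_t v = load_native32(p);
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    v = __builtin_bswap32(v);   /* swap only on little-endian hosts */
#endif
    return v;
}

/* Hypothetical helper: load a little-endian 32-bit value. */
static inline uint32_t load_le32_gnu(const void *p)
{
    uint32_t v = load_native32(p);
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    v = __builtin_bswap32(v);   /* swap only on big-endian hosts */
#endif
    return v;
}
```

The packed attribute, __builtin_bswap32() and the __BYTE_ORDER__ macros are all
compiler extensions, which is why this form is not portable C.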