gcc needs some built-in functions for byte swapping. I've been experimenting with the various byte-swapping functions out there, and they either result in code that is opaque to the optimizer (i.e. swapping something twice is not recognized as a null operation), or the optimizer fails to recognize that a byte swap is what's happening and renders it as a complex series of shift, AND, and OR instructions.
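To make the "swapping twice" case concrete, here is a minimal sketch. The names swap64, swap64_twice and self_check are hypothetical and only for illustration; the swap is written with plain shifts and masks. Ideally the optimizer would reduce swap64_twice to simply returning its argument.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: portable 64-bit byte swap using shifts and masks.
   Each step swaps adjacent halves at a finer granularity. */
static inline uint64_t swap64(uint64_t x)
{
    /* swap the two 32-bit halves */
    x = ((x & 0x00000000ffffffffull) << 32) | ((x & 0xffffffff00000000ull) >> 32);
    /* swap the 16-bit halves inside each 32-bit half */
    x = ((x & 0x0000ffff0000ffffull) << 16) | ((x & 0xffff0000ffff0000ull) >> 16);
    /* swap the bytes inside each 16-bit unit */
    x = ((x & 0x00ff00ff00ff00ffull) << 8) | ((x & 0xff00ff00ff00ff00ull) >> 8);
    return x;
}

/* Swapping twice is semantically the identity; this is the case one
   would like the optimizer to fold away entirely. */
static inline uint64_t swap64_twice(uint64_t x)
{
    return swap64(swap64(x));
}

/* Small sanity check of both properties. */
static void self_check(void)
{
    assert(swap64(0x0102030405060708ull) == 0x0807060504030201ull);
    assert(swap64_twice(0x0102030405060708ull) == 0x0102030405060708ull);
}
```

Whether swap64_twice actually compiles down to a plain move depends on the compiler version and flags; comparing the assembly output (for example with -O2 -S) is one way to check.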
I know very little about the internals of gcc, but my ignorant preference would be to make tree-ssa recognize that code like this:

inline uint64_t byteswap_64(const uint64_t x)
{
    return ((((x) & 0xff00000000000000ull) >> 56) |
            (((x) & 0x00ff000000000000ull) >> 40) |
            (((x) & 0x0000ff0000000000ull) >> 24) |
            (((x) & 0x000000ff00000000ull) >> 8) |
            (((x) & 0x00000000ff000000ull) << 8) |
            (((x) & 0x0000000000ff0000ull) << 24) |
            (((x) & 0x000000000000ff00ull) << 40) |
            (((x) & 0x00000000000000ffull) << 56));
}

is a byte swap and optimize appropriately. If this were being done to an entire array, it might even be possible to vectorize it efficiently. This would also mean that code to pull specific bits out of a pre or post swap value could be moved around and fiddled to get the value out of a different place if it made for more efficient register usage.

-- 
Summary: gcc needs byte swap builtins
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: eric-bugs at omnifarious dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40210