On Fri, 13 May 2016, Bernd Schmidt wrote:

> On 05/13/2016 03:07 PM, Richard Biener wrote:
> > On Fri, May 13, 2016 at 3:05 PM, Bernd Schmidt <bschm...@redhat.com> wrote:
> > > Huh? Can you elaborate?
> >
> > When you have a builtin taking a size in bytes then a byte is 8 bits,
> > not BITS_PER_UNIT bits.
>
> That makes no sense to me. I think the definition of a byte depends on the
> machine (hence the term "octet" was coined to be unambiguous). Also, such a
> definition would seem to imply that machines with 10-bit bytes cannot
> implement memcpy or memcmp.
>
> Joseph, can you clarify the standard's meaning here?

* In C: a byte is the minimal addressable unit; an N-byte object is made
  up of N "unsigned char" objects, with successive addresses in terms of
  incrementing an "unsigned char *" pointer. A byte is at least 8 bits.

* In GCC, at the level of GNU C APIs on the target, which generally
  includes built-in functions: a byte (on the target) is made of
  CHAR_TYPE_SIZE bits. In theory this could be more than BITS_PER_UNIT, or
  that could be more than 8, though support for either of those cases would
  be very bit-rotten (and I'm not sure there ever have been targets with
  CHAR_TYPE_SIZE > BITS_PER_UNIT). Sizes passed to memcpy and memcmp are
  sizes in units of CHAR_TYPE_SIZE bits.

* In GCC, at the RTL level: a byte (on the target) is a QImode object,
  which is made of BITS_PER_UNIT bits. (HImode is always two bytes, SImode
  four, etc., if those modes exist.) Support for BITS_PER_UNIT being more
  than 8 is very bit-rotten.

* In GCC, on the host: GCC only supports hosts (and $build) where bytes
  are 8-bit (though writing it as CHAR_BIT makes it clear that this 8 means
  the number of bits in a host byte).

Internal interfaces e.g. representing the contents of strings or other
memory on the target may not currently be well-defined except when
BITS_PER_UNIT is 8. Cf. e.g.
<https://gcc.gnu.org/ml/gcc/2003-06/msg01159.html>. But the above should
at least give guidance as to whether BITS_PER_UNIT, CHAR_TYPE_SIZE (or
TYPE_PRECISION (char_type_node), preferred where possible to minimize
usage of target macros) or CHAR_BIT is logically right in a particular
place.

-- 
Joseph S. Myers
jos...@codesourcery.com
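
As an illustration of the C-level definition in the first bullet, here is a minimal sketch in standard C11. The helper names object_bits and copy_bytes are hypothetical and exist only for this example. It checks that a byte is at least 8 bits, that sizeof counts in units of "unsigned char", and that an N-byte object can be copied as N successive "unsigned char" objects. Note that when built natively it reports the byte width of the machine it is compiled for; it does not reach GCC-internal macros such as BITS_PER_UNIT or CHAR_TYPE_SIZE.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* C only promises that a byte has at least 8 bits; it may have more. */
static_assert(CHAR_BIT >= 8, "a C byte has at least 8 bits");

/* sizeof counts in bytes, and unsigned char is exactly one byte. */
static_assert(sizeof(unsigned char) == 1, "unsigned char is one byte");

/* Hypothetical helper: number of bits occupied by an nbytes-byte object.
   The factor is CHAR_BIT, not a hard-coded 8. */
size_t object_bits(size_t nbytes)
{
    return nbytes * (size_t)CHAR_BIT;
}

/* Hypothetical helper: copy n bytes by stepping an unsigned char pointer,
   which is how C defines the layout of an n-byte object. For
   non-overlapping objects this has the same effect as memcpy(dst, src, n),
   whose size argument is likewise a count of bytes, not of octets. */
void copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
}
```

Calling copy_bytes(&a, &b, sizeof b) and memcpy(&a, &b, sizeof b) on the same non-overlapping objects should leave a with the same contents, whatever CHAR_BIT happens to be.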