Nope, byte order never applies to UTF-8. UTF-8 is defined as a sequence of single bytes in a fixed order, so the é character always appears as C3 A9 in the data, regardless of the byte order. Also note that the dfdl:byteOrder property does not apply to encodings like UTF-16BE or UTF-32LE either. For those, the byte order is defined by the character encoding itself, so dfdl:byteOrder is ignored.
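If it helps, here is a quick way to see this outside of DFDL. It is only a sketch using Python's built-in codecs, and hex_bytes is a helper name made up for this example, not anything from a DFDL implementation. It encodes é (U+00E9) under several encodings and prints the bytes you would see in a hex editor.

```python
def hex_bytes(text, encoding):
    """Return the encoded bytes of text as space-separated uppercase hex."""
    return " ".join(f"{b:02X}" for b in text.encode(encoding))


E_ACUTE = "\u00e9"  # the character é

# UTF-8 has a single, fixed byte sequence for each character.
# The UTF-16 and UTF-32 variants carry their byte order in the encoding name.
for enc in ("utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le"):
    print(f"{enc:10} {hex_bytes(E_ACUTE, enc)}")

# utf-8      C3 A9
# utf-16-be  00 E9
# utf-16-le  E9 00
# utf-32-be  00 00 00 E9
# utf-32-le  E9 00 00 00
```

There is no "little-endian UTF-8" to choose from, so A9 C3 never shows up for é. For UTF-16 and UTF-32 the order changes only when the encoding name changes.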
- Steve

On 2/4/19 2:50 PM, Costello, Roger L. wrote:
> Hello DFDL community,
>
> As Steve explained a while back, endian-ness applies to multi-byte words.
>
> Endian-ness does not apply to ASCII characters because each character is a
> single byte.
>
> Endian-ness does apply to UTF-16BE (Big-Endian), UTF-16LE (Little-Endian),
> UTF-32BE and UTF32-LE because each character uses multiple bytes.
>
> Clearly endian-ness does not apply to single-byte UTF-8 characters. But what
> about UTF-8 characters that use multiple bytes, such as the character é,
> which uses two bytes C3 and A9; does endian-ness apply? For example, if a
> file is in Little Endian would the character é appear in a hex editor as A9
> C3 whereas if the file is in Big Endian the character é would appear in a hex
> editor as C3 A9?
>
> /Roger
