On Tue, Aug 8, 2017 at 3:20 AM, Cameron Simpson <c...@cskk.id.au> wrote:
>
> As you note, the 16 and 32 forms are (6 + 1) times 2 or 4 respectively. This
> is because each encoding has a leading byte order marker to indicate the big
> endianness or little endianness. For big endian data that is \xfe\xff; for
> little endian data it would be \xff\xfe.

To avoid encoding a byte order mark (BOM), use an "le" or "be" suffix, e.g.

    >>> 'Hello!'.encode('utf-16le')
    b'H\x00e\x00l\x00l\x00o\x00!\x00'

Sometimes a data format includes the byte order, which makes using a BOM
redundant. For example, strings in the Windows registry use UTF-16LE,
without a BOM.
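As a rough sketch of how the BOM and the "le"/"be" suffixes interact, the following compares the encodings of the same string. It relies only on the standard `codecs` module; the variable names are illustrative, and the plain 'utf-16' codec is assumed to write its BOM in the machine's native byte order, so the check uses `codecs.BOM_UTF16` rather than a hard-coded value.

```python
import codecs

text = 'Hello!'

# An explicit byte order suffix writes no BOM: 6 characters * 2 bytes.
le = text.encode('utf-16le')
be = text.encode('utf-16be')

# Plain 'utf-16' prepends a BOM in the native byte order: (6 + 1) * 2 bytes.
with_bom = text.encode('utf-16')

# The BOM is U+FEFF encoded in the chosen byte order.
assert codecs.BOM_UTF16_LE == b'\xff\xfe'
assert codecs.BOM_UTF16_BE == b'\xfe\xff'

print(len(le), len(be), len(with_bom))         # 12 12 14
print(with_bom.startswith(codecs.BOM_UTF16))   # True

# Decoding with 'utf-16' reads the BOM to pick the byte order, then drops it.
print((codecs.BOM_UTF16_BE + be).decode('utf-16'))   # Hello!

# Decoding with an explicit byte order keeps a leading BOM as U+FEFF.
decoded = (codecs.BOM_UTF16_LE + le).decode('utf-16le')
print(repr(decoded))   # '\ufeffHello!'
```

The last case is the usual reason to pick the codec that matches the data: if the format already fixes the byte order and carries no BOM, decode with 'utf-16le' or 'utf-16be'; if the data may start with a BOM, decode with 'utf-16'.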