On 2011-01-17, carlo <syseng...@gmail.com> wrote:
> Is it true UTF-8 does not have any "big-endian/little-endian" issue
> because of its encoding method? And if it is true, why Mark (and
> everyone does) writes about UTF-8 with and without BOM some chapters
> later? What would be the BOM purpose then?

Yes, it is true. The BOM simply identifies the encoding as UTF-8:

http://unicode.org/faq/utf_bom.html#bom5

> 2- If that were true, can you point me to some documentation about the
> math that, as Mark says, demonstrates this?

It is true because UTF-8 is essentially an 8-bit encoding: its code unit
is a single byte, and a character that does not fit in the addressable
space of one byte simply continues into the next one. Since the bytes
are accessed and assessed sequentially, in an order fixed by the
encoding itself (the lead byte carries the most significant bits, so it
is effectively big-endian), there is no multi-byte unit whose byte order
could differ between machines.
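
If you want to check this yourself, here is a rough sketch for Python 3.
The sample character and the variable names are just my own picks for
illustration. It encodes one character as UTF-8 and as both flavours of
UTF-16, then shows that the UTF-8 "BOM" is only a fixed three-byte
signature stuck in front of the same bytes.

```python
import codecs

text = "\u20ac"  # EURO SIGN, an arbitrary sample character

# UTF-8: the code unit is one byte, so only one byte sequence is possible.
utf8 = text.encode("utf-8")

# UTF-16: the code unit is two bytes, so the same character can be
# serialized in two different byte orders.
utf16_be = text.encode("utf-16-be")
utf16_le = text.encode("utf-16-le")

# The "utf-8-sig" codec writes the UTF-8 signature (BOM) before the data
# when encoding and strips it again when decoding.
utf8_bom = text.encode("utf-8-sig")

print("utf-8     :", utf8.hex())
print("utf-16-be :", utf16_be.hex())
print("utf-16-le :", utf16_le.hex())
print("utf-8-sig :", utf8_bom.hex())
print("signature only prepended:", utf8_bom == codecs.BOM_UTF8 + utf8)
```

The two UTF-16 results are byte-swapped versions of each other, which is
why UTF-16 needs a byte order mark. The UTF-8 result has no such
alternative, so there the mark can only serve as an encoding signature.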