Marc-Andre Lemburg added the comment:

On 08.10.2013 11:03, Antoine Pitrou wrote:
>
>>> utf-16 isn't that widely used, so it's probably fine if it becomes
>>> a bit slower.
>>
>> It's the default encoding for Unicode text files and APIs on Windows,
>> so I'd say it *is* widely used :-)
>
> I've never seen any UTF-16 text files. Do you have other data?

See the link I posted. MS Notepad and MS Office save Unicode text files
in UTF-16-LE unless you explicitly specify UTF-8, just like many other
Windows applications that support Unicode text files:

http://msdn.microsoft.com/en-us/library/windows/desktop/dd374101%28v=vs.85%29.aspx
http://superuser.com/questions/294219/what-are-the-differences-between-linux-and-windows-txt-files-unicode-encoding

This is simply because MS introduced Unicode plain text files as
UTF-16-LE files and only later added the option to use UTF-8 with a
BOM as well.

> APIs are irrelevant. You only pass very small strings to them (e.g.
> file paths).

You are forgetting that wchar_t holds UTF-16 on Windows, so UTF-16 is
all around you when working on Windows, not only in the OS APIs, but
also in most other Unicode APIs you find on Windows:

http://msdn.microsoft.com/en-us/library/windows/desktop/dd374089%28v=vs.85%29.aspx
http://msdn.microsoft.com/en-us/library/windows/desktop/dd374061%28v=vs.85%29.aspx

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12892>
_______________________________________
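
To illustrate the point about Windows "Unicode" text files, here is a
minimal sketch of how such a file can be recognized by its BOM and
decoded in Python. The helper name sniff_bom and the sample bytes are
made up for this example; they are not taken from the issue or from any
patch attached to it, and UTF-32 BOMs are deliberately left out.

```python
import codecs
from typing import Optional


def sniff_bom(data: bytes) -> Optional[str]:
    """Return a codec name if data starts with a UTF-8 or UTF-16 BOM.

    Hypothetical helper for illustration only. UTF-32 is not handled,
    so a UTF-32-LE file (whose BOM begins with the UTF-16-LE BOM) would
    be misreported as UTF-16 here.
    """
    if data.startswith(codecs.BOM_UTF8):
        # "utf-8-sig" strips the signature while decoding
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The plain "utf-16" codec reads the BOM to pick the byte order
        return "utf-16"
    return None


# Assumed sample: what a UTF-16-LE text file with BOM looks like on disk
sample = codecs.BOM_UTF16_LE + "caf\u00e9".encode("utf-16-le")

codec = sniff_bom(sample)
text = sample.decode(codec) if codec else sample.decode("ascii")
```

Data without a BOM makes sniff_bom return None, in which case the
caller has to choose a fallback encoding itself.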