> On 12 May 2015, at 15:45, Philippe Verdy <[email protected]> wrote:
>
> 2015-05-11 23:53 GMT+02:00 Hans Aberg <[email protected]>:
>> It is perfectly fine considering the Unicode code points as abstract
>> integers, with UTF-32 and UTF-8 encodings that translate them into byte
>> sequences in a computer. The code points that conflict with UTF-16 might
>> have been merely declared not in use until UTF-16 has fallen out of
>> use, replaced by UTF-8 and UTF-32.
>>
> The deprecation of UTF-16 and UTF-32 as encoding *schemes* ("charsets" in
> MIME) is already very advanced.

UTF-32 is usable for internal use in programs.

> But they are certainly unlikely to disappear as encoding *forms* for internal
> use in binary APIs and in several very popular programming languages: Java,
> JavaScript, even C++ on Windows platforms (where the 8-bit interface,
> based on legacy "code pages" and with poor support of the UTF-8 encoding
> scheme as a Windows "code page", is the one that is now being phased out),
> C#, J#…

That is legacy, which may remain for a long time. For example, C/C++ trigraphs are only being removed now, having long been nothing but a bother for compiler implementers. Java is very old, designed around 32-bit programming with limits on function code size, which was a limitation in pre-PPC CPUs that went out of use in the early 1990s.

> UTF-8 will also remain for long as the preferred internal encoding for Python,
> PHP (even if Python also introduced a 16-bit native datatype).
>
> In all cases, programming languages are not based on any Unicode encoding
> forms but on more or less opaque streams of code units using datatypes that
> are not constrained by Unicode (because their "character" or "byte" datatype
> is also used for binary I/O and for supporting the conversion of various
> binary structures, including executable code, and also because even this
> datatype is not necessarily 8-bit but may be larger and not even an even
> multiple of 8 bits)

Indeed, that is why UTF-8 was invented for use in Unix-like environments.
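
To make the point about code points as abstract integers concrete, here is a small Python sketch. It is only an illustration, not anything from the standard's text: the sample code point U+1F600 is an arbitrary choice and the helper name `encodable` is made up for this example. It shows one integer mapped to bytes by three encodings, and that the range reserved for UTF-16 surrogates is rejected by Python's strict encoders.

```python
# One abstract integer (code point), three byte-level encodings.
cp = 0x1F600          # arbitrary example outside the Basic Multilingual Plane
ch = chr(cp)

utf8 = ch.encode("utf-8")        # four 8-bit code units
utf16 = ch.encode("utf-16-be")   # surrogate pair: two 16-bit code units
utf32 = ch.encode("utf-32-be")   # one 32-bit code unit

print(utf8.hex(), utf16.hex(), utf32.hex())


def encodable(code_point, encoding):
    """Hypothetical helper: True if Python's strict encoder accepts it."""
    try:
        chr(code_point).encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# U+D800..U+DFFF is reserved for UTF-16's pairing mechanism, so a lone
# surrogate cannot be encoded, even in UTF-8 or UTF-32.
print(encodable(0xD800, "utf-8"), encodable(0xD800, "utf-32-be"))
print(encodable(0xE000, "utf-8"))  # first code point after the surrogates
```

The surrogate check is where the "conflict with UTF-16" shows up in practice: those integers are excluded from every encoding, not just from UTF-16.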

