> On 12 May 2015, at 15:45, Philippe Verdy <[email protected]> wrote:
> 
> 
> 
> 2015-05-11 23:53 GMT+02:00 Hans Aberg <[email protected]>:
>> It is perfectly fine to consider the Unicode code points as abstract 
>> integers, with UTF-32 and UTF-8 encodings that translate them into byte 
>> sequences in a computer. The code points that conflict with UTF-16 might 
>> have been merely declared not in use until UTF-16 has fallen out of 
>> use, replaced by UTF-8 and UTF-32.
>> 
> The deprecation of UTF-16 and UTF-32 as encoding *schemes* ("charsets" in 
> MIME) is already very advanced. 

UTF-32 is usable for internal use in programs.
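
As a rough illustration of the "abstract integer" view, here is a minimal 
Python sketch (the code point U+1F600 and the helper names `is_surrogate` 
and `encodable` are just chosen for the example). It encodes one code point 
in three encoding forms and checks that the range reserved for UTF-16 
surrogates is rejected by the UTF-8 codec:

```python
# Sketch: one abstract code point, three encoding forms.
cp = 0x1F600          # an integer outside the Basic Multilingual Plane
ch = chr(cp)

utf8 = ch.encode("utf-8")       # four bytes for this code point
utf16 = ch.encode("utf-16-be")  # a surrogate pair: two 16-bit code units
utf32 = ch.encode("utf-32-be")  # one 32-bit code unit

# In UTF-32 the code unit is simply the code point as an integer.
assert int.from_bytes(utf32, "big") == cp


def is_surrogate(code_point: int) -> bool:
    """True for U+D800..U+DFFF, the range set aside for UTF-16 surrogates."""
    return 0xD800 <= code_point <= 0xDFFF


def encodable(code_point: int) -> bool:
    """True if Python's strict UTF-8 codec accepts this code point."""
    try:
        chr(code_point).encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False
```

With these definitions, `encodable(0xD800)` is false while `encodable(0xD7FF)` 
and `encodable(0xE000)` are true, which is the "conflict with UTF-16" 
mentioned above.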

> But they will certainly not disappear as encoding *forms* for internal 
> use in binary APIs and in several very popular programming languages: Java, 
> JavaScript, even C++ on Windows platforms (where it is the 8-bit interface, 
> based on legacy "code pages" and with poor support of the UTF-8 encoding 
> scheme as a Windows "code page", that is now being phased out), 
> C#, J#…

That is legacy, which may remain for a long time. For example, C/C++ trigraphs 
are only being removed now, having long been just a bother for compiler 
implementers. Java is very old, designed around 32-bit programming with limits 
on function code size, a limitation of pre-PPC CPUs that went out of use in the 
early 1990s.

> UTF-8 will also remain for a long time as the preferred internal encoding for 
> Python and PHP (even if Python also introduced a 16-bit native datatype).
> 
> In all cases, programming languages are not based on any Unicode encoding 
> form but on more or less opaque streams of code units, using datatypes that 
> are not constrained by Unicode. Their "character" or "byte" datatype 
> is also used for binary I/O and for converting various 
> binary structures, including executable code. This 
> datatype is not necessarily 8-bit either: it may be larger, and not even a 
> multiple of 8 bits.

Indeed, that is why UTF-8 was invented for use in Unix-like environments.
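
One property that makes UTF-8 fit byte-oriented interfaces can be checked 
with a small Python sketch (the function name `utf8_is_ascii_safe` and the 
sample path are made up for the example): ASCII characters encode to 
themselves, and every byte of a multi-byte sequence is 0x80 or above, so 
bytes such as NUL or '/' never show up inside another character.

```python
def utf8_is_ascii_safe(text: str) -> bool:
    """Check that ASCII encodes to itself and that no byte of a
    non-ASCII character's UTF-8 form falls in the ASCII range."""
    for ch in text:
        encoded = ch.encode("utf-8")
        if ord(ch) < 0x80:
            # ASCII characters must be a single, identical byte.
            if encoded != bytes([ord(ch)]):
                return False
        elif any(b < 0x80 for b in encoded):
            # Lead and continuation bytes are all >= 0x80.
            return False
    return True


# Hypothetical path mixing ASCII and non-ASCII characters.
path = "/home/user/日本語.txt"
raw = path.encode("utf-8")

# Byte-oriented code that splits on b"/" or stops at NUL still works.
slash_count_matches = raw.count(b"/") == path.count("/")
has_no_nul = b"\x00" not in raw
```

A program that treats the name as an opaque byte string therefore needs no 
changes to carry UTF-8 text.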


