> On Oct 6, 2015, at 6:04 , Philippe Verdy <verd...@wanadoo.fr> wrote:
> 
> In those conditions, normalizing the Java string will leave those lone 
> surrogates (and non-characters) as is, or will throw an exception, depending 
> on the API used. Java strings do not have any implied encoding (their "char" 
> members are also unrestricted 16-bit code units, they have some basic 
> properties but only in BMP, defined in the builtin Character class API: 
> properties for non-BMP characters require using a library to provide them, 
> such as ICU4J).

The Java Character class was enhanced in J2SE 5.0 to support supplementary 
characters. The String class was specified to be based on UTF-16, and string 
processing throughout the platform was updated to support supplementary 
characters based on UTF-16. These changes have been available to the public 
since 2004. For a summary, see
http://www.oracle.com/technetwork/articles/java/supplementary-142654.html
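
As a rough illustration of the points above, here is a minimal sketch using only java.lang APIs that have taken int code points since J2SE 5.0. The class name SupplementaryDemo, its helper methods, and the two sample characters (U+1D11E and U+10400) are my own choices for the example; they do not come from the article.

```java
public class SupplementaryDemo {
    // U+1D11E MUSICAL SYMBOL G CLEF, written as its UTF-16 surrogate pair.
    static final String CLEF = "\uD834\uDD1E";
    // U+10400 DESERET CAPITAL LETTER LONG I, a cased letter outside the BMP.
    static final int DESERET_CAPITAL = 0x10400;

    // Number of 16-bit char code units in the string.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points; a valid surrogate pair counts once.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Code point starting at index 0, combining a surrogate pair if present.
    static int firstCodePoint(String s) {
        return s.codePointAt(0);
    }

    public static void main(String[] args) {
        System.out.println("code units:  " + codeUnits(CLEF));
        System.out.println("code points: " + codePoints(CLEF));
        System.out.printf("first code point: U+%X%n", firstCodePoint(CLEF));

        // Character properties are available for supplementary code points
        // through the int overloads, without an external library.
        System.out.println("supplementary: "
                + Character.isSupplementaryCodePoint(firstCodePoint(CLEF)));
        System.out.println("is letter:     " + Character.isLetter(DESERET_CAPITAL));
        System.out.println("is upper case: " + Character.isUpperCase(DESERET_CAPITAL));
        System.out.printf("lower case:    U+%X%n",
                Character.toLowerCase(DESERET_CAPITAL));
    }
}
```

The char-based methods remain for BMP code units; the int overloads are the ones that cover the full code point range.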

Norbert
