On Thursday, 2 June 2016 at 21:00:17 UTC, tsbockman wrote:
However, this document is very old - from Unicode 3.0 and the year 2000:

"While there are no surrogate characters in Unicode 3.0 (outside of private use characters), future versions of Unicode will contain them..."

Perhaps level 1 has since been redefined?

I found the latest (unofficial) draft version:
    http://www.unicode.org/reports/tr18/tr18-18.html

Relevant changes:

* Level 1 is to be redefined as working on code points, not code units:

"A fundamental requirement is that Unicode text be interpreted semantically by code point, not code units."

* Level 2 (graphemes) is explicitly described as a "default level":

"This is still a default level—independent of country or language—but provides much better support for end-user expectations than the raw level 1..."

* All mention of level 2 being slow has been removed. The only reason given for making it toggle-able is for compatibility with level 1 algorithms:

"Level 2 support matches much more what user expectations are for sequences of Unicode characters. It is still locale-independent and easily implementable. However, for compatibility with Level 1, it is useful to have some sort of syntax that will turn Level 2 support on and off."
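
To make the difference between the levels concrete, here is a rough sketch in Python (not taken from the report) that counts the same string three ways: as UTF-16 code units, as code points (what level 1 is to work on), and as approximate graphemes (level 2). The helper rough_clusters is hypothetical and deliberately simplified: it only glues combining marks onto the preceding character, which is an assumption made for illustration and not the full grapheme cluster algorithm from UAX #29.

```python
import unicodedata


def rough_clusters(text):
    """Group each base character with the combining marks that follow it.

    Hypothetical, simplified stand-in for grapheme segmentation: it ignores
    Hangul jamo, ZWJ sequences, regional indicators and other UAX #29 rules.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch) != 0:
            # Non-zero combining class: attach to the previous cluster.
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# "e" + COMBINING ACUTE ACCENT, followed by MUSICAL SYMBOL G CLEF (U+1D11E),
# which lies outside the BMP and needs a surrogate pair in UTF-16.
sample = "e\u0301\U0001D11E"

utf16_units = len(sample.encode("utf-16-le")) // 2  # code units
code_points = len(sample)                           # level 1 view
clusters = rough_clusters(sample)                   # rough level 2 view

print(utf16_units, code_points, len(clusters))
```

The string is four UTF-16 code units, three code points, and two user-perceived characters, so a pattern such as "." matches a different amount of text depending on which of these units the engine works in.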
