Re: [Python-Dev] Unicode 8.0 and 3.5

MRAB Thu, 18 Jun 2015 17:57:07 -0700

On 2015-06-19 00:56, Steven D'Aprano wrote:

On Thu, Jun 18, 2015 at 08:34:14PM +0100, MRAB wrote:

On 2015-06-18 19:33, Larry Hastings wrote:
>On 06/18/2015 11:27 AM, Terry Reedy wrote:
>>Unicode 8.0 was just released.  Can we have unicodedata updated to
>>match in 3.5?
>>
>
>What does this entail?  Data changes, code changes, both?
>
It looks like just data changes.


At the very least, there is a change to the casefolding algorithm.
Cherokee was classified as unicameral but is now considered bicameral
(two cases, like English). Unusually, case-folding Cherokee maps to
uppercase rather than lowercase.

Doesn't the case-folding just depend on the data and the algorithm
remains the same?
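
For reference, the point about data versus algorithm can be checked from the interpreter. The following is a minimal sketch, not part of the original exchange: the helper names `unicode_version_tuple` and `cherokee_casefold_demo` are hypothetical, and it assumes the interpreter's `unicodedata` tables are at Unicode 8.0 or later for the Cherokee mappings to be present. It calls the same `str.casefold()` in every case; only the underlying tables determine the result.

```python
import unicodedata


def unicode_version_tuple():
    """Return the Unicode database version as a tuple of ints, e.g. (8, 0, 0)."""
    return tuple(int(part) for part in unicodedata.unidata_version.split("."))


def cherokee_casefold_demo():
    """Show lowercasing and case-folding for one Cherokee letter pair.

    U+13A0 is CHEROKEE LETTER A (the uppercase form as of Unicode 8.0).
    U+AB70 is CHEROKEE SMALL LETTER A (added in Unicode 8.0).
    """
    upper_a = "\u13a0"
    small_a = "\uab70"
    return {
        "lower": upper_a.lower(),            # lowercase mapping from the data
        "folded_upper": upper_a.casefold(),  # folding of the uppercase letter
        "folded_small": small_a.casefold(),  # folding of the lowercase letter
    }


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    for label, text in cherokee_casefold_demo().items():
        # Print code points rather than glyphs, since fonts may lack Cherokee.
        print(label, [hex(ord(ch)) for ch in text])
```

With tables at 8.0 or later, the small letter should fold to the uppercase code point, which is the "maps to uppercase" behaviour described above; with older tables the small letter is unassigned and is left unchanged.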

The full set of changes is listed here:

http://unicode.org/versions/Unicode8.0.0/

Apart from the addition of 7716 characters and changes to
str.casefold(), I don't think any of the changes will make a big
difference to Python's implementation. But it would be good to support
Unicode 8 (to the degree that Python actually does support Unicode,
rather than just the character set part of it).

There are additional codepoints and a renamed property (which the
standard library doesn't support anyway).


Which one are you referring to, Indic_Matra_Category renamed to
Indic_Positional_Category?

Yes.
