[issue4610] Unicode case mappings are incorrect

Jeff Senn Tue, 13 Oct 2009 13:04:18 -0700

Jeff Senn <[email protected]> added the comment:

Has there been any action on this? A PEP?


I disagree that using ICU is a good way to simply get proper
Unicode casing. (A heavy hammer for a small task...)

I agree locales are a different issue. I would prefer
optional arguments to the unicode object casing methods, which
any future sort of locale object could then use to handle
correct casing -- but the casing methods themselves should not
rely on such an object.

Most of the special casing rules can be accomplished by 
a decomposition (or recursive decomposition) on the character
followed by casing the result -- so NO new table is necessary
-- only marking up the characters so implicated (there are
extra unused bits in the char type table that could be used 
for this purpose -- so no additional space needed there either).  

What remains are a tiny handful of cases that need to be handled
in code.
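
A minimal sketch of the decomposition idea, written against
present-day Python 3 and its unicodedata module. It is not the
half-finished implementation mentioned below. The names
NEEDS_DECOMPOSITION, simple_upper, decompose and
upper_via_decomposition are hypothetical, and the small set stands
in for the spare flag bit in the char type table:

```python
import unicodedata

# Hypothetical stand-in for the "marked" bit in the char type table:
# characters whose uppercase form is found by decomposing first.
NEEDS_DECOMPOSITION = {"\u01f0", "\u0149", "\ufb01"}


def simple_upper(ch):
    # Assumption: approximate a one-to-one case mapping. Python 3's
    # str.upper() applies full mappings, so keep only results that
    # are a single character and otherwise leave ch unchanged.
    up = ch.upper()
    return up if len(up) == 1 else ch


def decompose(ch):
    # Recursively expand the decomposition mapping of one character.
    fields = unicodedata.decomposition(ch).split()
    if fields and fields[0].startswith("<"):
        fields = fields[1:]  # drop a tag such as <compat>
    if not fields:
        return ch  # no decomposition: the character stands for itself
    return "".join(decompose(chr(int(f, 16))) for f in fields)


def upper_via_decomposition(ch):
    # Marked characters: decompose, then case each resulting piece.
    if ch in NEEDS_DECOMPOSITION:
        return "".join(simple_upper(c) for c in decompose(ch))
    # Everything else: plain one-to-one mapping, no extra table.
    return simple_upper(ch)
```

With these definitions, U+01F0 (j with caron) comes out as "J"
followed by U+030C, and the U+FB01 ligature comes out as "FI".
U+00DF (sharp s) has no decomposition, so it passes through
unchanged here; it is an example of the kind of case that would
still need explicit code.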

I have a half-finished implementation of this, in case anyone
is interested.

----------
nosy: +senn

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue4610>
_______________________________________