Ezio Melotti <ezio.melo...@gmail.com> added the comment:

So the issue here is that when the string uses combining chars, str.title() 
fails to titlecase it properly.

The algorithm implemented by str.title() [0] is quite simple: it loops through 
the code units, and uppercases every char that follows a char that is not 
lower/upper/titlecase.
This means that if 'Déme' doesn't use combining accents, the char before the 
'm' is 'é'; 'é' is a lowercase char, so the 'm' is not capitalized.
If the 'é' is represented as 'e' + '´', the char before the 'm' is '´'; '´' is 
not a lower/upper/titlecase char, so the 'm' is capitalized.
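
To make the difference concrete, here is a minimal pure-Python sketch of the 
loop as described above. It is not the C code from unicodeobject.c: the name 
naive_title and the CASED set are made up for illustration, and treating the 
categories Lu/Ll/Lt as "lower/upper/titlecase" is an assumption.

```python
import unicodedata

# Assumption: "lower/upper/titlecase" means these general categories.
CASED = {"Lu", "Ll", "Lt"}

def naive_title(s):
    """Hypothetical restatement of the described loop, not CPython's code."""
    out = []
    prev_cased = False
    for ch in s:
        # Titlecase a char only when the previous char was not a cased letter;
        # otherwise lowercase it.
        out.append(ch.lower() if prev_cased else ch.title())
        prev_cased = unicodedata.category(ch) in CASED
    return "".join(out)

composed = "d\u00e9me"      # precomposed 'é' (U+00E9)
decomposed = "de\u0301me"   # 'e' + COMBINING ACUTE ACCENT (U+0301)

print(ascii(naive_title(composed)))    # 'D\xe9me'
print(ascii(naive_title(decomposed)))  # 'De\u0301Me'
```

The combining accent has category Mn, so it resets the "previous char was 
cased" state and the following 'm' gets capitalized.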

I guess we could normalize the string before doing the title casing, and then 
normalize it back.
Also the str methods don't claim to follow Unicode afaik, so unless we decide 
that they should, we could implement whatever algorithm we want.
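
The normalize / titlecase / normalize-back idea could look roughly like the 
sketch below. The function name title_via_nfc and its restore_form parameter 
are hypothetical, and the caller is assumed to know which form the input was 
in. It also only helps when a precomposed char exists for the sequence; NFC 
leaves other combining sequences decomposed.

```python
import unicodedata

def title_via_nfc(s, restore_form="NFD"):
    """Sketch: compose, titlecase, then convert back to the requested form."""
    # Compose first, so that 'e' + U+0301 becomes the single cased char U+00E9.
    composed = unicodedata.normalize("NFC", s)
    titled = composed.title()
    # Convert back to the form the caller asked for (assumed: the input's form).
    return unicodedata.normalize(restore_form, titled)

print(ascii(title_via_nfc("de\u0301me")))  # 'De\u0301me'
```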

[0]: Objects/unicodeobject.c:6752

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12737>
_______________________________________