Martin v. Löwis <mar...@v.loewis.de> added the comment:

> Martin, do you think that str.title() should follow the Unicode standard?

I don't think that "follow the Unicode standard" has any meaning in this
context: the Unicode standard doesn't specify (AFAIK) what a .title()
method in a programming language should do.

> Should string methods work with all the normalizations or just with NFC?

Once we know what .title() should do, it should do so correctly for all
strings. Let me try to propose a definition for .title():

"Split S into words. Change the first letter in a word to upper-case,
and all subsequent letters to lower case. A word is a sequence that
starts with a letter, followed by letter-related characters."

Letters are all characters with the "Alphabetic" property, i.e.
Lu+Ll+Lt+Lm+Lo+Nl + Other_Alphabetic.

"letter-related" characters are letters + marks (Mn, Mc, Me).
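
The definition above could be sketched roughly as follows. This is only an
illustration, not a proposed patch: the names proposed_title, is_letter and
is_letter_related are made up for this example, and since the unicodedata
module does not expose Other_Alphabetic, the sketch approximates "Alphabetic"
by the general categories alone.

```python
import unicodedata

# Assumption: Other_Alphabetic is not available through unicodedata, so
# "letter" is approximated here by these general categories only.
LETTER_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
MARK_CATEGORIES = {"Mn", "Mc", "Me"}


def is_letter(ch):
    """Approximate test for the Alphabetic property (hypothetical helper)."""
    return unicodedata.category(ch) in LETTER_CATEGORIES


def is_letter_related(ch):
    """Letters plus combining marks (hypothetical helper)."""
    return is_letter(ch) or unicodedata.category(ch) in MARK_CATEGORIES


def proposed_title(s):
    """Apply the proposed .title() definition to s (illustrative only)."""
    out = []
    in_word = False
    for ch in s:
        if in_word and is_letter_related(ch):
            # Inside a word: lower-case letters, pass marks through unchanged.
            out.append(ch.lower() if is_letter(ch) else ch)
        elif is_letter(ch):
            # A word starts with a letter; the definition says upper-case,
            # so this uses upper() rather than a title-case mapping.
            out.append(ch.upper())
            in_word = True
        else:
            # Anything else ends the current word and is copied as-is.
            out.append(ch)
            in_word = False
    return "".join(out)


# Combining marks no longer split a word, in NFD as well as NFC input.
print(proposed_title("de\u0301ja\u0300 vu") == "De\u0301ja\u0300 Vu")
```

Because marks count as letter-related, a decomposed string is treated as the
same words as its composed form, which is the point of the question about
normalizations.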

----------
title: str.title() is overzealous by upcasing combining marks inappropriately 
-> str.title() is overzealous by upcasing combining marks inappropriately

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12737>
_______________________________________