Alexander Belopolsky added the comment:

> Can a character or sequence have multiple aliases?

Yes. For example, most control characters have two aliases (and no name):

0000;NULL;control
0000;NUL;abbreviation
0001;START OF HEADING;control
0001;SOH;abbreviation
0002;START OF TEXT;control
0002;STX;abbreviation

(See <http://www.unicode.org/Public/UNIDATA/NameAliases.txt>)

> What will be a result type of unicodedata.name() with "abbreviation" keyword
> value?

Under my proposal:

>>> unicodedata.name('\N{ESCAPE}', type='abbreviation')
'ESC'

I would also like to consider changing the default slightly. I find the following behavior rather unhelpful:

>>> unicodedata.name('\N{ESC}')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: no such name

I think most users would expect 'ESCAPE' instead.

The following is more of a curiosity than a genuine problem, but it is a good illustration of a general point:

>>> unicodedata.name('\N{PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRACKET}')
'PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET'

(Note that "BRACKET" is misspelled as "BRAKCET" in the output.)

Since the "correction" alias is the official method of publishing corrections to Unicode names, I think unicodedata.name() should return the corrected name by default.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue18234>
_______________________________________
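
For illustration only, here is a minimal sketch of how the alias records quoted above could be grouped by code point and alias type. It is not the unicodedata implementation and not the proposed patch: parse_aliases() and alias_for() are hypothetical helper names, and the sample data is limited to the six records shown in this message.

```python
from collections import defaultdict

# Sample records copied from the message above, in the NameAliases.txt
# layout: code point;alias;type
SAMPLE = """\
0000;NULL;control
0000;NUL;abbreviation
0001;START OF HEADING;control
0001;SOH;abbreviation
0002;START OF TEXT;control
0002;STX;abbreviation
"""


def parse_aliases(text):
    """Group alias records by code point, then by alias type."""
    aliases = defaultdict(lambda: defaultdict(list))
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        code, alias, alias_type = (field.strip() for field in line.split(";"))
        aliases[int(code, 16)][alias_type].append(alias)
    return aliases


def alias_for(aliases, char, alias_type):
    """Hypothetical helper: return the first alias of the requested type.

    Raises ValueError when the character has no alias of that type.
    """
    found = aliases.get(ord(char), {}).get(alias_type, [])
    if not found:
        raise ValueError("no such alias")
    return found[0]


ALIASES = parse_aliases(SAMPLE)
print(alias_for(ALIASES, "\x01", "abbreviation"))  # SOH
print(sorted(ALIASES[0x0000]))  # ['abbreviation', 'control']
```

Keeping a list per type allows for a character carrying more than one alias of the same type; the sketch simply returns the first one.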