Out of curiosity, why do you prefer decimal values for referring to Unicode code points? Most references, such as http://unicode.org/charts/PDF/U0400.pdf (official) or https://en.wikibooks.org/wiki/Unicode/Character_reference/0000-0FFF, use hexadecimal, because the planes and ranges are delimited by hex values.
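To illustrate what I mean, here is a minimal sketch (not a proposal, just an illustration). It assumes the Cyrillic block boundaries U+0400..U+04FF as given in the chart linked above; the helper name describe() is made up for this example. It shows that the block boundaries are round numbers in hex but not in decimal, and that range(1040, 1072) from your example corresponds to U+0410..U+042F in chart notation.

```python
# Cyrillic block boundaries as listed in the Unicode chart U0400.pdf.
CYRILLIC_START, CYRILLIC_END = 0x0400, 0x04FF


def describe(code_point):
    """Hypothetical helper: return (decimal, 'U+XXXX' label, character)."""
    return code_point, "U+{:04X}".format(code_point), chr(code_point)


# The same block boundaries, written in decimal and in chart (hex) notation.
block_dec = (CYRILLIC_START, CYRILLIC_END)
block_hex = ("U+{:04X}".format(CYRILLIC_START), "U+{:04X}".format(CYRILLIC_END))

# First and last code points of range(1040, 1072) from the quoted message.
first = describe(1040)
last = describe(1071)

# Decimal, hex, and name-based spellings all denote the same character.
same = chr(1040) == "\u0410" == "\N{CYRILLIC CAPITAL LETTER A}"

print(block_dec, block_hex)
print(first, last, same)
```

Either notation selects the same characters; the difference is only in how easily the numbers can be matched against the published charts.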
On Wed, Dec 7, 2016 at 5:52 PM, Mikhail V <mikhail...@gmail.com> wrote:
> In a past discussion about inputting and printing characters,
> I was proposing decimal notation instead of hex.
> Since the discussion was lost in off-topic talks, I'll try to
> summarise my idea better.
>
> I use ASCII only for code input (there are good reasons for that).
> Here I'll use Python 3.6 and Windows 7, so I can use print() with Unicode
> directly and it now works in the system console.
>
> Suppose I am just starting to program and want to do some character
> manipulation.
> The very first thing I would probably start with is a simple output of
> Latin and Cyrillic capital letters:
>
> caps_lat = ""
> for o in range(65, 91):
>     caps_lat = caps_lat + chr(o)
> print(caps_lat)
>
> caps_cyr = ""
> for o in range(1040, 1072):
>     caps_cyr = caps_cyr + chr(o)
> print(caps_cyr)
>
>
> Which prints:
> ABCDEFGHIJKLMNOPQRSTUVWXYZ
> АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ
>
>
> Say I now want to input something directly in code:
>
> s = "first cyrillic letters: " + chr(1040) + chr(1041) + chr(1042)
>
> Which works fine and looks clean. However, it is not very convenient
> because it takes a lot of typing, and if I generate such strings it
> adds a bit more complexity. But in general it is fine, and I use this
> method currently.
>
> =========
> Proposal: I would like to be able to input it *by decimals*:
>
> s = "first cyrillic letters: \{1040}\{1041}\{1042}"
> or:
> s = "first cyrillic letters: \(1040)\(1041)\(1042)"
>
> =========
>
> This is more compact and does not seem to conflict much with the
> current Python escape characters in string literals,
> where a backslash starts an escape in most cases.
>
> For me the most important point is that this way I would avoid
> any hex numbers in strings, which I find very good
> for readability, and it is very convenient for me since I use decimals
> for processing everywhere (and encourage everyone to do so).
>
> So this is my proposal; any comments on it are appreciated.
>
>
> PS:
>
> Currently Python 3 supports these in addition to \x:
> (from https://docs.python.org/3/howto/unicode.html)
> """
> If you can’t enter a particular character in your editor or want to keep
> the source code ASCII-only for some reason, you can also use escape
> sequences in string literals.
>
> >>> "\N{GREEK CAPITAL LETTER DELTA}"  # Using the character name
> >>> "\u0394"                          # Using a 16-bit hex value
> >>> "\U00000394"                      # Using a 32-bit hex value
>
> """
> So I have many possibilities, and all of them strangely contradict
> my image of intuitive and readable. Using a character name is readable,
> but it is not really a practical solution for input, though it could be
> very useful for printing a description of a character.
>
>
> Mikhail
> _______________________________________________
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/