Hi,

On Friday 19 November 2010 17:53:58 Alexander Belopolsky wrote:
> I was recently surprised to learn that chr(i) can produce a string of
> length 2 in python 3.x.
Yes, but only on a narrow build. E.g. Debian and Ubuntu compile Python 3.1 in
wide mode (sys.maxunicode == 1114111).

> I suspect that I am not alone finding this behavior non-obvious
> given that a mistake in Python manual stating the contrary survived
> several releases. [1]

It was a documentation bug and you fixed it. Non-BMP characters are rare, so
few people (maybe only you?) noticed the documentation bug. I consider the
behaviour an improvement in Python3's non-BMP support.

Python is unclear about non-BMP characters: the narrow build was called "ucs2"
for a long time, even though it is UTF-16 (each character is encoded as one or
two UTF-16 words).

Python2 accepts non-BMP characters with the \U syntax, but not with chr(). This
is inconsistent and I see it as a bug. But I don't want to touch Python2's
handling of non-BMP characters, and the "bug" is already fixed in Python3!

> I do believe, however that a change like
> this [2] and its consequences should be better publicized.

The change was made before the release of Python 3.0. Do you want to patch the
"What's new in Python 3.0?" document?

> I have not
> found any discussion of this change in PEPs or "What's new" documents.
> The closest find was a mentioning of a related issue #3280 in the 3.0
> NEWS file. [3] Since this feature will be first documented in the
> Library Reference in 3.2, I wonder if it will be appropriate to
> mention it in "What's new in 3.2"?

In my opinion, the question is rather why it was not fixed in Python2. I
suppose that the answer is something ugly like "backward compatibility" or
"historical reasons" :-)

Victor
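
PS: to illustrate the "one or two UTF-16 words" point, here is a minimal sketch. The helper names utf16_code_units and surrogate_pair are made up for this example and are not part of Python. The sketch counts UTF-16 code units explicitly instead of relying on len(chr(i)), so it should give the same answer whatever the build; the length-2 result itself would only show up on a narrow build.

```python
import sys


def utf16_code_units(ch):
    """Return how many 16-bit UTF-16 code units are needed to encode ch."""
    # utf-16-le writes no byte order mark, so every code unit is 2 bytes.
    return len(ch.encode("utf-16-le")) // 2


def surrogate_pair(code_point):
    """Split a non-BMP code point into its (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


# 65535 on a narrow build, 1114111 on a wide build.
print(sys.maxunicode)

# A BMP character fits in one code unit, a non-BMP character needs two.
print(utf16_code_units("A"))
print(utf16_code_units("\U00010000"))

# The two code units that a narrow build stores for U+10000.
print([hex(unit) for unit in surrogate_pair(0x10000)])
```

On a narrow build, the two surrogates returned by surrogate_pair() are what you would see as the two elements of the length-2 string.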