Xah Lee <xah...@gmail.com> writes:

> Perl's exceedingly lousy unicode support hack is well known. In fact
> it is the primary reason i “switched” to python for my scripting needs
> in 2005. (See: Unicode in Perl and Python)
I think your assessment is antiquated.  I've been doing Unicode
programming with Perl for about three years, and it's generally quite
wonderfully transparent.  On the programmers' web site
stackoverflow.com, I follow questions with the "unicode" tag, and of
the questions that mention a specific language, Python and C++ seem to
come up the most often.

> I'll have to say, as far as text processing goes, the most beautiful
> lang with respect to unicode is emacs lisp. In elisp code (e.g.
> Generate a Web Links Report with Emacs Lisp), i don't have to declare
> none of the unicode or encoding stuff. I simply write code to process
> string or buffer text, without even having to know what encoding it
> is. Emacs the environment takes care of all that.

It's not quite perfect, though.  I recently discovered that if I enter
a Chinese character using my Mac's Chinese input method, and then
enter the same character using a Japanese input method, Emacs regards
them as different characters, even though they have the same Unicode
code point.  For example, from describe-char:

    character: 一 (43323, #o124473, #xa93b, U+4E00)
    character: 一 (55404, #o154154, #xd86c, U+4E00)

On saving and reverting a file containing such text, the characters
are "normalized" to the Japanese version.  I suppose this might
conceivably be the correct behavior, but it sure was a surprise that
(equal "一" "一") can be nil.
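
It seems possible to get the same "normalization" from Lisp without
the save-and-revert round trip.  This is only a sketch, and it assumes
that re-decoding through utf-8 unifies the two internal characters the
same way saving the file does; my-utf8-normalize is just a name I made
up for it:

    ;; Sketch: unify visually identical CJK characters by encoding a
    ;; string to utf-8 and decoding the result back.  This mirrors
    ;; what saving and reverting the file appears to do.
    (defun my-utf8-normalize (s)
      "Return S after a round trip through the utf-8 coding system."
      (decode-coding-string (encode-coding-string s 'utf-8) 'utf-8))

    ;; With the two characters from describe-char above:
    (equal "一" "一")                        ; => nil, surprisingly
    (equal (my-utf8-normalize "一")
           (my-utf8-normalize "一"))         ; => t, at least on my setup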