On 7/14/2017 5:51 PM, Marko Rauhamaa wrote:

Yes, in Python2, Go, C and GNU textutils, when you print a text string
containing a mixture of languages, you see characters.

Why?

Because that's what the terminal emulator chooses to do upon receiving
those bytes.

>>> s = u'\u1171\u2222\u3333\u4444\u5555'
>>> s
u'\u1171\u2222\u3333\u4444\u5555'
>>> print(s)
ᅱ∢㌳䑄啕
>>> b = s.encode('utf-8')
>>> b
'\xe1\x85\xb1\xe2\x88\xa2\xe3\x8c\xb3\xe4\x91\x84\xe5\x95\x95'
>>> print(b)
ᅱ∢㌳䑄啕

I prefer the accurate 5-character print of the text string to the print of the bytes.
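
To make the text/bytes distinction concrete, here is a minimal sketch. It assumes Python 3 (the session above is Python 2), and the names text_len, byte_len and round_trip are mine, chosen only for illustration. It shows that the text string holds 5 code points, while its UTF-8 encoding holds 15 bytes that only look like characters when something decodes them.

```python
# Assumes Python 3, where str is text and bytes is a separate type.
s = '\u1171\u2222\u3333\u4444\u5555'   # same five code points as above
b = s.encode('utf-8')                  # each of these encodes to 3 bytes

text_len = len(s)    # counts code points in the text string
byte_len = len(b)    # counts bytes in the encoded form
round_trip = b.decode('utf-8') == s    # decoding recovers the text

print(text_len, byte_len, round_trip)
print(repr(b))       # an escaped bytes literal, not rendered characters
```

In Python 3, printing the bytes object shows the escaped form, so the terminal is never handed the raw bytes to interpret on its own.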

--
Terry Jan Reedy


