Re: unicode by default

harrismh777 Wed, 11 May 2011 23:51:17 -0700

Ben Finney wrote:

I'd phrase that as:

* Text is a sequence of characters. Most inputs to the program,
   including files, sockets, etc., contain a sequence of bytes.

* Always know whether you're dealing with text or with bytes. No object
   can be both.

* In Python 2, ‘str’ is the type for a sequence of bytes. ‘unicode’ is
   the type for text.

* In Python 3, ‘str’ is the type for text. ‘bytes’ is the type for a
   sequence of bytes.



That is very helpful...   thanks
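
For the archive, here is roughly how I understand those last two points in Python 3. This is only a minimal sketch: the sample string is my own, and the names `text` and `data` are just for illustration.

```python
# Python 3: str holds text (characters), bytes holds raw bytes.
text = "na\u00efve caf\u00e9"     # str: 10 characters ("naïve café")
data = text.encode("utf-8")       # bytes: what would go to a file or socket

print(type(text).__name__, len(text))   # str 10
print(type(data).__name__, len(data))   # bytes 12 (two chars need 2 bytes each)

# Decoding turns the bytes back into text; the encoding has to be known.
round_trip = data.decode("utf-8")
print(round_trip == text)

# No object can be both: mixing the two types raises TypeError.
try:
    text + data
    mixed_ok = True
except TypeError as exc:
    mixed_ok = False
    print("cannot mix str and bytes:", exc)
```

In Python 2 the same roles would be played by `unicode` (text) and `str` (bytes), as the list above says.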


MRAB, Steve, John, Terry, Ben F, Ben K, Ian...

...thank you guys so much, I think I've got a better picture now of what is going on... this is also one place where I don't think the books are as clear as they need to be, at least for me... (Lutz, Summerfield).

So, the UTF-16 / UTF-32 is INTERNAL only, for Python... and text in/out is based on locale... in my case UTF-8 ...that is enormously helpful for me... understanding locale on this system is as mystifying as unicode is in the first place.

Well, after reading about unicode tonight (about four hours) I realize that it's not really that hard... there's just a lot of details that have to come together. Straightening out that whole tower-of-babel thing is sure a pain in the butt.

I also was not aware that UTF-8 chars could be up to six (6) bytes long from left to right (in the original design, anyway; RFC 3629 now caps it at four). I see now that the little-endianness I was ascribing to python is just a function of hexdump... and I was a little disappointed to find that hexdump does not support UTF-8, just ascii... doh.
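
To check my own understanding, something like the following Python 3 sketch seems to show the pieces. The helper name `utf8_len` and the sample characters are mine, purely for illustration, and what the last line prints depends on the locale of the machine it runs on.

```python
import locale

def utf8_len(ch):
    """Return how many bytes UTF-8 uses for a single character."""
    return len(ch.encode("utf-8"))

# One character can take 1 to 4 bytes in UTF-8 as currently standardized.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]
lengths = [utf8_len(ch) for ch in samples]
for ch, n in zip(samples, lengths):
    print("U+%04X -> %d byte(s): %s" % (ord(ch), n, ch.encode("utf-8").hex()))

# UTF-8 is a plain byte sequence, so it has no byte-order question.
# Endianness only appears with multi-byte code units such as UTF-16.
euro = "\u20ac"
utf16_le = euro.encode("utf-16-le")   # bytes AC 20
utf16_be = euro.encode("utf-16-be")   # bytes 20 AC

# The encoding Python picks by default for text files comes from the
# locale (or from UTF-8 mode, if that is enabled).
print(locale.getpreferredencoding(False))
```

As far as I can tell, the byte swapping I saw came from hexdump's default format, which prints two-byte words in host order; `hexdump -C` shows the bytes one at a time in file order.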

Anyway, thanks again... I've got enough now to play around a bit...

PS thanks Steve for that link, informative and entertaining too... Joel says, "If you are a programmer . . . and you don't know the basics of characters, character sets, encodings, and Unicode, and I catch you, I'm going to punish you by making you peel onions for 6 months in a submarine. I swear I will". :)









kind regards,
m harris





--
http://mail.python.org/mailman/listinfo/python-list
