On Feb 22, 2006, at 6:35 AM, Greg Ewing wrote:

> I'm thinking of convenience, too. Keep in mind that in Py3k,
> 'unicode' will be called 'str' (or something equally neutral
> like 'text') and you will rarely have to deal explicitly with
> unicode codings, this being done mostly for you by the I/O
> objects. So most of the time, using base64 will be just as
> convenient as it is today: base64_encode(my_bytes) and write
> the result out somewhere.
>
> The reason I say it's *correct* is that if you go straight
> from bytes to bytes, you're *assuming* the eventual encoding
> is going to be an ascii superset. The programmer is going to
> have to know about this assumption and understand all its
> consequences and decide whether it's right, and if not, do
> something to change it.
>
> Whereas if the result is text, the right thing happens
> automatically whatever the ultimate encoding turns out to
> be. You can take the text from your base64 encoding, combine
> it with other text from any other source to form a complete
> mail message or xml document or whatever, and write it out
> through a file object that's using any unicode encoding
> at all, and the result will be correct.
This makes little sense for mail. You combine *bytes*, in various and
possibly different encodings, to form a mail message. Some MIME sections
might have a base64 Content-Transfer-Encoding, others might be 8bit
encoded, others 7bit encoded, and others quoted-printable encoded.
Before the C-T-E encoding, you will have had to do the Content-Type
encoding, converting your text into bytes with the desired character
encoding: utf-8, iso-8859-1, etc.

Having the final mail message be made up of "characters" right before
transmission to the socket would be crazy.

James
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
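
To make the two-step encoding above concrete, here is a rough sketch in
present-day Python 3 (an assumption; the thread predates it). The helper
name build_message, the boundary string, and the sample data are all
hypothetical, and the header set is the bare minimum rather than a
complete, standards-checked message. It only illustrates that each part
is first turned into bytes by a charset and then made transport-safe by
a Content-Transfer-Encoding, so the assembled message is bytes throughout.

```python
import base64
import quopri


def build_message(text, charset, attachment, boundary=b"example-boundary"):
    """Assemble a two-part MIME message as bytes (illustrative sketch only)."""
    # Step 1: Content-Type encoding, characters -> bytes in the chosen charset.
    text_bytes = text.encode(charset)

    # Step 2: Content-Transfer-Encoding, bytes -> transport-safe bytes.
    # Each part may use a different C-T-E.
    qp_part = quopri.encodestring(text_bytes)
    # Note: real messages wrap base64 at 76 characters per line;
    # this sketch skips wrapping because the sample data is short.
    b64_part = base64.b64encode(attachment)

    lines = [
        b"MIME-Version: 1.0",
        b'Content-Type: multipart/mixed; boundary="' + boundary + b'"',
        b"",
        b"--" + boundary,
        b"Content-Type: text/plain; charset=" + charset.encode("ascii"),
        b"Content-Transfer-Encoding: quoted-printable",
        b"",
        qp_part,
        b"--" + boundary,
        b"Content-Type: application/octet-stream",
        b"Content-Transfer-Encoding: base64",
        b"",
        b64_part,
        b"--" + boundary + b"--",
        b"",
    ]
    # The result is bytes, ready to be written to a socket as-is.
    return b"\r\n".join(lines)


message = build_message("Caf\u00e9", "iso-8859-1", b"\x00\x01\x02")
```

Nothing in the result depends on a "final" unicode encoding being
applied on output: the text part has already been committed to
iso-8859-1, and the attachment never was text in the first place.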