My response is further down. Thank you, Wayne.

On 07/19/2012 12:52 PM, Wayne Werner wrote:
> I'll preface my response by saying that I know/understand fairly
> little about it, but since I've recently been smacked by this same
> issue when converting stuff to Python3, I'll see if I can explain it
> in a way that makes sense.
>
> On Wed, 18 Jul 2012, Jordan wrote:
>
>> OK so I have been trying for a couple days now and I am throwing in
>> the towel, Python 3 wins this one.
>> I want to convert a string to binary and back again like in this
>> question: Stack Overflow: Convert Binary to ASCII and vice versa
>> (Python)
>> <http://stackoverflow.com/questions/7396849/convert-binary-to-ascii-and-vice-versa-python>
>>
>> But in Python 3 I consistently get some sort of error relating to the
>> fact that nothing but bytes and bytearrays support the buffer
>> interface or I get an overflow error because something is too large
>> to be converted to bytes.
>> Please help me and then explain what I am not getting that is new in
>> Python 3. I would like to point out I realize that binary, hex, and
>> encodings are all a very complex subject and so I do not expect to
>> master it but I do hope that I can gain a deeper insight. Thank you
>> all.
>
> The way I've read it - stop thinking about strings as if they are
> text. The biggest reason that all this has changed is because Python
> has grown up and entered the world where Unicode actually matters. To
> us poor shmucks in the English speaking countries of the world it's
> all very confusing because it's nothing we have to deal with. 26
> letters is perfectly fine for us - and if we want uppercase we'll just
> throw another 26. Add a few dozen punctuation marks and 256 is a
> perfectly fine amount of characters.
>
> To make a slightly relevant side trip, when you were a kid did you
> ever send "secret" messages to a friend with a code like this?
>
> A = 1
> B = 2
> .
> .
> .
> Z = 26
>
> Well, that's basically what is going on when it comes to
> bytes/text/whatever. When you input some text, Python3 believes that
> whatever you wrote was encoded with Unicode. The nice thing for us
> 26-letter folks is that the ASCII alphabet we're so used to just so
> happens to map quite well to Unicode encodings - so 'A' in ASCII is
> the same number as 'A' in utf-8.
>
> Now, here's the part that I had to (and still need to) wrap my mind
> around - if the string is "just bytes" then it doesn't really matter
> what the string is supposed to represent. It could represent the
> LATIN-1 character set. Or UTF-8, -16, or some other weird encoding.
> And all the operations that are supposed to modify these strings of
> bytes (e.g. removing spaces, splitting on a certain "character", etc.)
> still work. Because if I have this string:
>
> 9 45 12 9 13 19 18 9 12 99 102
>
> and I tell you to split on the 9's, it doesn't matter if that's some
> weird ASCII character, or some equally weird UTF character, or
> something else entirely. And I don't have to worry about things
> getting munged up when I try to stick Unicode and ASCII values
> together - because they're converted to bytes first.
>
> So the question is, of course, if it's all bytes, then why does it
> look like text when I print it out? Well, that's because Python
> converts that byte stream to Unicode text when it's printed. Or ASCII,
> if you tell it to.
>
> But Python3 has converted all(?) of those functions that used to
> operate on text and made them operate on byte streams instead. Except
> for the ones that operate on text ;)
>
> Well, I hope that's of some use and isn't too much of a lie - like I
> said, I'm still trying to wrap my head around things and I've found
> that explaining (or trying to explain) to someone else is often the
> best way to work out the idea in your own head.
> If I've gone too far astray I'm sure the other helpful folks here
> will correct me :)
>

Thank you for the very informative post, every bit helps. It has
certainly been a challenge for me with the new "everything is bytes"
scheme, especially how everything has to be converted to bytes prior to
going on a buffer.

> HTH,
> Wayne
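
For reference, here is a minimal sketch of the string-to-binary round trip
asked about in this thread. It assumes UTF-8 as the encoding, and the helper
names text_to_bits and bits_to_text are made up for illustration. It relies
on Python 3 keeping text (str) and raw data (bytes) as separate types, so the
text is encoded to bytes first. Errors about the buffer interface typically
show up when a str is passed where bytes are expected.

```python
# Sketch only: text_to_bits and bits_to_text are hypothetical helper names,
# and UTF-8 is an assumed encoding choice.

def text_to_bits(text, encoding="utf-8"):
    """Return a string of 0s and 1s, eight per encoded byte."""
    data = text.encode(encoding)  # str -> bytes
    # Iterating over bytes yields ints in range(256); pad each to 8 bits.
    return "".join(format(byte, "08b") for byte in data)


def bits_to_text(bits, encoding="utf-8"):
    """Invert text_to_bits: group the bits into bytes, then decode."""
    if len(bits) % 8:
        raise ValueError("bit string length must be a multiple of 8")
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode(encoding)  # bytes -> str


bits = text_to_bits("hello")
print(bits)                # 0110100001100101011011000110110001101111
print(bits_to_text(bits))  # hello
```

Working one byte at a time sidesteps the overflow error mentioned above,
since no single integer ever has to hold the whole message.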