On 7/29/07, Greg Ewing <[EMAIL PROTECTED]> wrote:
> Martin v. Löwis wrote:
> > The point that proponents of "base64 encoding should
> > yield strings" miss is that US-ASCII is *both* a character set,
> > and an encoding.
>
> Last time we discussed this, I went and looked at the
> RFC where base64 is defined. According to my reading of
> it, nowhere does it say that base64 output must be
> encoded as US-ASCII, nor any other particular encoding.
>
> It *does* say that the characters used were chosen because
> they are present in a number of different character sets
> in use at the time, and explicitly mentions EBCDIC as one
> of those character sets.
>
> To me this quite clearly says that base64 is defined at
> the level of characters, not encodings.

I think it's all beside the point. We should look at the use cases. I
recall finding out once that a Java base64 implementation was much
slower than Python's. It turned out that the Java version was
converting everything to Strings, and then we needed to convert back
to bytes in order to output them.

My suspicion is that in the end using bytes is more efficient *and*
more convenient; it might take some looking through the email package
to confirm or refute this. (The email package hasn't been converted to
work in the struni branch; that should happen first. Whoever does that
might well be the one who tells us how they want their base64 APIs.)

An alternative might be to provide both string- and bytes-based APIs,
although that doesn't help with deciding what the default one (the one
that uses the same names as 2.x) should do.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
_______________________________________________
Python-3000 mailing list
http://mail.python.org/mailman/listinfo/python-3000
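
For illustration only, here is a minimal sketch of the two API shapes
discussed above. It is written against the base64 module as it exists in
current Python 3, where b64encode accepts and returns bytes. The helper
b64encode_str is hypothetical (it is not part of the standard library) and
only stands in for the string-returning alternative.

```python
import base64
import io


def b64encode_str(data: bytes) -> str:
    """Hypothetical string-returning variant: encode, then decode as ASCII."""
    return base64.b64encode(data).decode("ascii")


payload = b"\x00\xffhello"

as_bytes = base64.b64encode(payload)  # bytes in, bytes out
as_text = b64encode_str(payload)      # bytes in, str out

# Writing to a binary stream: the bytes result goes out directly, while
# the str result has to be converted back to bytes first. This is the
# extra round trip described for the Java implementation.
out = io.BytesIO()
out.write(as_bytes)
out.write(as_text.encode("ascii"))
```

Both results carry the same base64 alphabet characters; the difference is
only whether the caller or the library performs the final conversion to
bytes before output.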
