Tres Seaver writes:

 > On 04/23/2013 09:29 AM, Stephen J. Turnbull wrote:
 > > By RFC specification, BASE64 is a *textual* representation of
 > > arbitrary binary data.
 >
 > It isn't "text" in the sense Py3k means:

RFC 4648 repeatedly refers to *characters*, without specifying an encoding for them. In fact, if you copy accurately, you can write BASE64 on a napkin and that napkin will accurately transmit the data (assuming it doesn't run into sleet or gloom of night). What else is that but "text in the sense of Py3k"?

My point is not that Python's base64 codec *should* be bytes-to-str and back. My point is that, both in the formal spec and in historical evolution, that is a plausible interpretation of ".encode('base64')" which happens to be the reverse of the normal codec convention, where ".encode(codec)" is a *string* method, and ".decode(codec)" is a *bytes* method.

This is not harder for people to learn (for BASE64 encoding or for coded character sets), because in each case there's a natural sense of direction for *en*coding vs. *de*coding. But it does break duck-typing, as does the web developer bytes-to-bytes usage of BASE64.

What I'm groping toward is an idea of a "variable method", so that we could use .encode and .decode where they are TOOWTDI for people even though a purely formal interpretation of duck-typing would say "but why is that blue whale quacking, waddling, and flying?"

In other words (although I have no idea how best to implement it), I would like "somestring.encode('base64')" to fail with "I don't know how to do that" (an attribute lookup error?), the same way that "somebytes.encode('utf-8')" does in Python 3 today.
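
To make the directions concrete, here is a minimal sketch of how these operations behave in a current Python 3 interpreter. It is an illustration only, not a proposal: the variable names are made up for the example, and the exact exception raised by "hello".encode("base64") is assumed to be LookupError or TypeError depending on the Python 3 version.

```python
import base64
import codecs

payload = b"hello"

# The base64 module is bytes-to-bytes (the "web developer" usage).
encoded_bytes = base64.b64encode(payload)

# The "napkin" view: the BASE64 alphabet is ASCII, so the result can be
# turned into a str explicitly, and decoded back from that str.
encoded_text = encoded_bytes.decode("ascii")
roundtrip = base64.b64decode(encoded_text)

# bytes objects have no .encode method at all, so
# payload.encode("utf-8") would raise AttributeError.
bytes_has_encode = hasattr(payload, "encode")

# str.encode() only accepts text encodings, so 'base64' is rejected.
# Assumption: the exception type varies by version, so both are caught.
try:
    "hello".encode("base64")
    str_encode_rejected = False
except (LookupError, TypeError):
    str_encode_rejected = True

# The generic codecs.encode() function exposes base64 as a
# bytes-to-bytes transform (its output ends with a newline).
codec_result = codecs.encode(payload, "base64")
```

Note that none of these spellings gives the bytes-to-str reading of ".encode('base64')" directly; the str form is only reachable through an explicit ASCII decode of the bytes result.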