Ron Adam <[EMAIL PROTECTED]> wrote:
> Josiah Carlson wrote:
> > Bengt Richter had a good idea with bytes.recode() for strictly bytes
> > transformations (and the equivalent for text), though it is ambiguous as
> > to the direction; are we encoding or decoding with bytes.recode()?  In
> > my opinion, this is why .encode() and .decode() makes sense to keep on
> > both bytes and text, the direction is unambiguous, and if one has even a
> > remote idea of what the heck the codec is, they know their result.
> >
> > - Josiah
>
> I like the bytes.recode() idea a lot. +1
>
> It seems to me it's a far more useful idea than encoding and decoding by
> overloading and could do both and more.  It has a lot of potential to be
> an intermediate step for encoding as well as being used for many other
> translations to byte data.

Indeed it does.

> I think I would prefer that encode and decode be just functions with
> well defined names and arguments instead of being methods or arguments
> to string and Unicode types.

Attaching it to string and unicode objects is a useful convenience.
Just like x.replace(y, z) is a convenience for string.replace(x, y, z).
Tossing the encode/decode somewhere else, like encodings, or even
string, I see as a backwards step.

> I'm not sure on exactly how this would work. Maybe it would need two
> sets of encodings, ie.. decoders, and encoders.  An exception would be
> given if it wasn't found for the direction one was going in.
>
> Roughly... something or other like:
>
>      import encodings
>
>      encodings.tostr(obj, encoding):
>         if encoding not in encoders:
>             raise LookupError 'encoding not found in encoders'
>         # check if obj works with encoding to string
>         # ...
>         b = bytes(obj).recode(encoding)
>         return str(b)
>
>      encodings.tounicode(obj, decoding):
>         if decoding not in decoders:
>             raise LookupError 'decoding not found in decoders'
>         # check if obj works with decoding to unicode
>         # ...
>         b = bytes(obj).recode(decoding)
>         return unicode(b)
>
> Anyway... food for thought.

Again, the problem is ambiguity; what does bytes.recode(something) mean?
Are we encoding _to_ something, or are we decoding _from_ something?
Are we going to need to embed the direction in the encoding/decoding
name (to_base64, from_base64, etc.)?  That doesn't seem any better than
binascii.b2a_base64.

What about .reencode and .redecode?  It seems as though the 're' added
as a prefix to .encode and .decode makes it clearer that you get the
same type back as you put in, and it is also unambiguous as to direction.

The question remains: is str.decode() returning a string or unicode
depending on the argument passed, when the argument quite literally
names the codec involved, difficult to understand?  I don't believe so;
am I the only one?
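
To make the direction argument concrete, here is a rough sketch of what
.reencode and .redecode could look like as plain functions.  The names
reencode() and redecode() are hypothetical stand-ins for the proposed
methods, not an existing API.  The sketch assumes a present-day Python 3
interpreter, where codecs.encode() and codecs.decode() accept the
bytes-to-bytes "base64" and "hex" codecs; that is an assumption of the
example, not something settled in this thread.

```python
import binascii
import codecs


def reencode(data: bytes, codec: str) -> bytes:
    """Hypothetical stand-in for bytes.reencode(): bytes in, bytes out."""
    result = codecs.encode(data, codec)
    if not isinstance(result, bytes):
        raise TypeError("%r is not a bytes-to-bytes codec" % codec)
    return result


def redecode(data: bytes, codec: str) -> bytes:
    """Hypothetical stand-in for bytes.redecode(): bytes in, bytes out."""
    result = codecs.decode(data, codec)
    if not isinstance(result, bytes):
        raise TypeError("%r is not a bytes-to-bytes codec" % codec)
    return result


payload = b"hello world"

# The function name carries the direction; the codec name only says
# which transformation is involved.
encoded = reencode(payload, "base64")
decoded = redecode(encoded, "base64")

# Same bytes as the binascii spelling, which embeds the direction
# in the function name instead (b2a_... versus a2b_...).
assert encoded == binascii.b2a_base64(payload)
assert decoded == payload
```

With a single recode(data, "base64") there would be no way to tell the
two calls above apart without a separate direction argument or a
direction-bearing codec name.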

 - Josiah
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev