On 2/13/06, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> Guido van Rossum wrote:
> >>In py3k, when the str object is eliminated, then what do you have?
> >>Perhaps
> >>- bytes("\x80"), you get an error, encoding is required. There is no
> >>such thing as "default encoding" anymore, as there's no str object.
> >>- bytes("\x80", encoding="latin-1"), you get a bytestring with a
> >>single byte of value 0x80.
> >
> > Yes to both again.
>
> Please reconsider, and don't give bytes() an encoding= argument.
> It doesn't need one. In Python 3, people should write
>
>    "\x80".encode("latin-1")
>
> if they absolutely want to, although they better write
>
>    bytes([0x80])
>
> Now, the first form isn't valid in 2.5, but
>
>    bytes(u"\x80".encode("latin-1"))
>
> could work in all versions.

In 3.0, I agree that .encode() should return a bytes object.

I'd almost be convinced that in 2.x bytes() doesn't need an encoding
argument, except that it would require excessive copying.
bytes(u.encode("utf8")) will certainly use 2*len(u) bytes of space
(plus a constant); bytes(u, "utf8") only needs len(u) bytes. In 3.0,
bytes(s.encode(xxx)) would also create an extra copy, since the bytes
type is mutable (we all agree on that, don't we?).

I think that's a good enough argument for 2.x. We could keep the
extended API as an alternative form in 3.x, or automatically translate
calls to bytes(x, y) into x.encode(y).

BTW I think we'll need a new PEP instead of PEP 332. The latter has
almost no details relevant to this discussion, and it seems to treat
bytes as a near-synonym for str in 2.x. That's not the way this
discussion is going, it seems.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
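
For anyone who wants to try the spellings discussed in this thread, here is a minimal sketch against Python 3 as it was eventually released. It is not the design being debated above: in released Python 3 the bytes type ended up immutable, so the mutable case is approximated here with bytearray. The variable names are made up for illustration, and the sketch only shows that the two-step form goes through a separate temporary object; it does not measure memory.

```python
# Sketch only: the spellings from the thread, checked against released Python 3.
text = "\x80"

# There is no default encoding: bytes() refuses a str unless one is given.
try:
    bytes(text)
    needs_encoding = False
except TypeError:
    needs_encoding = True

# The three spellings of "a single byte of value 0x80".
via_constructor = bytes(text, encoding="latin-1")
via_encode = text.encode("latin-1")
via_ints = bytes([0x80])

# The copying argument, using bytearray as the mutable type (an assumption of
# this sketch; the thread itself talks about a mutable bytes type).
u = "h\u00e9llo"
direct = bytearray(u, "utf8")   # encodes straight into the mutable buffer
temp = u.encode("utf8")         # intermediate immutable bytes object
copied = bytearray(temp)        # second buffer, copied from the temporary

print(needs_encoding, via_constructor, via_encode, via_ints)
print(direct == copied, copied is not temp)
```

Both routes give the same contents; the difference is only whether a temporary encoded object exists along the way, which is the extra copy the message above is concerned about.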