>>>>> "Greg" == Greg Ewing <[EMAIL PROTECTED]> writes:

    Greg> Stephen J. Turnbull wrote:

    >> the kind of "text" for which Unicode was designed is normally
    >> produced and consumed by people, who wll pt up w/ ll knds f
    >> nnsns.  Base64 decoders will not put up with the same kinds of
    >> nonsense that people will.

    Greg> The Python compiler won't put up with that sort of nonsense
    Greg> either. Would you consider that makes Python source code
    Greg> binary data rather than text, and that it's inappropriate to
    Greg> represent it using a unicode string?

The reason that Python source code is text is that the primary
producers/consumers of Python source code are human beings, not
compilers.  There are no such human producers/consumers of base64.

Unless you prefer that I expressed that last sentence as
"VGhlIHJlYXNvbiB0aG
F0IFB5dGhvbiBzb3VyY2UgY29kZSBpcyB0ZXh0IGlzIGJlY2F1c2UgdGhlIHByaW1
hcnkKcHJvZHVjZXJzL2NvbnN1bWVycyBvZiBQeXRob24gc291cmNlIGNvZGUgYXJl
IGh1bWFuIGJlaW5ncywgbm90CmNvbXBpbGVycy4="?

    >> You're basically assuming that the person who implements the
    >> code that processes a Unicode string is the same person who
    >> implemented the code that converts a binary object into base64
    >> and inserts it into a string.

    Greg> No, I'm assuming the user of base64 knows the
    Greg> characteristics of the channel he's using.

Yes, which implies that you assume he has control of the data all the
way to the channel that actually requires base64.

Use case: the Gnus MUA supports the RFC that allows non-ASCII names in
MIME headers that take file names.  The interface was written for
message-at-a-time use, which makes sense for composition.  Somebody
else added "save and strip part" editing capability, but this only
works one MIME part at a time.  So if you have a message with four
MIME parts and you save and strip all of them, the first one gets
encoded four times.

The reason for *this* bug, and scores like it over the years, is that
somebody made it convenient to put wire protocols into a text
document.  Shouldn't Python do better than that?
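
To make the repeated-encoding failure concrete, here is a minimal
sketch in Python 3 terms.  It assumes a hypothetical helper,
encode_part, standing in for the "save and strip part" step; it is
not Gnus's actual code.  The point is only that an already-encoded
payload is itself perfectly valid input to the encoder, so nothing
stops a later pass from encoding it again.

```python
import base64


def encode_part(part: bytes) -> bytes:
    """Hypothetical stand-in for a 'save and strip' step that
    base64-encodes one MIME part body."""
    return base64.b64encode(part)


original = b"attachment body"

# First pass: the part is encoded once, as intended.
once = encode_part(original)

# A tool that cannot tell an encoded part from a raw one re-encodes
# the first part on each of the three later passes.
four_times = once
for _ in range(3):
    four_times = encode_part(four_times)

# The over-encoded payload still passes strict base64 validation,
# so a single decode silently yields the wrong bytes.
assert base64.b64decode(four_times, validate=True) != original

# Only the matching number of decodes recovers the data.
data = four_times
for _ in range(4):
    data = base64.b64decode(data, validate=True)
assert data == original
```

Nothing in the payload records how many times it has been encoded, so
the decoder has no way to notice the mistake on its own.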

Shouldn't Python text be for humans, rather than be whatever had the
tag "character" attached to it for convenience of definition of a
protocol for communication of data humans can't process without
mechanical assistance?

    >> I don't think it's a good idea to gratuitously introduce wire
    >> protocols as unicode codecs,

    Greg> I am *not* saying that base64 is a unicode codec! If that's
    Greg> what you thought I was saying, it's no wonder we're
    Greg> confusing each other.

I know you don't think that it's a duck, but it waddles and quacks.
Ie, the question is not what I think you're saying.  It's "what is the
Python compiler/interpreter going to think?"  AFAICS, it's going to
think that base64 is a unicode codec.

    Greg> The only time I need to use something like base64 is when I
    Greg> have something that will only accept text. In Py3k, "accepts
    Greg> text" is going to mean "takes a character string as input",

Characters are inherently abstract, as a class they can't be
instantiated as input or output---only derived (ie, encoded)
characters can.  I don't believe that "takes a character string as
input" has any intrinsic meaning.

    Greg> Does that make it clearer what I'm getting at?

No.<wink>  I already understood what you're getting at.  As I said,
I'm sympathetic in principle.

In practice, I think it's a loaded gun aimed at my foot.  And yours.

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.