Bill Janssen <[EMAIL PROTECTED]> added the comment:

Larry Masinter is off on vacation, but I did get a brief message saying that he will dig up similar discussions that he was involved in when he gets back.
Out of curiosity, I sent a note off to the www-international mailing list, and received this:

``For the authority (server name) portion of a URI, RFC 3986 is pretty clear that UTF-8 must be used for non-ASCII values (assuming, for a moment, that IDNA addresses are not Punycode encoded already). For the path portion of URIs, a large-ish proportion of them are, indeed, UTF-8 encoded because that has been the de facto standard in Web browsers for a number of years now. For the query and fragment parts, however, the encoding is determined by context and often depends on the encoding of some page that contains the form from which the data is taken. Thus, a large number of URIs contain non-UTF-8 percent-encoded octets.''

http://lists.w3.org/Archives/Public/www-international/2008JulSep/0041.html

_______________________________________
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3300>
_______________________________________
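
To make the point in the quoted reply concrete, here is a minimal sketch of how the same character produces different percent-encoded octets depending on the charset, and what happens when the decoder guesses wrong. It uses the `urllib.parse` functions found in current Python 3 releases; the sample string "café" and the variable names are illustrative assumptions, not something taken from the discussion above.

```python
from urllib.parse import quote, unquote, unquote_to_bytes

# The same text percent-encoded under two different charsets.
# "é" is two octets in UTF-8 but a single octet in Latin-1.
utf8_path = quote("café", encoding="utf-8")
latin1_query = quote("café", encoding="latin-1")

# Decoding with the charset that was actually used round-trips cleanly.
decoded_utf8 = unquote(utf8_path)  # UTF-8 is the default
decoded_latin1 = unquote(latin1_query, encoding="latin-1")

# Decoding Latin-1 octets as UTF-8 does not raise by default
# (errors="replace"); the invalid octet becomes U+FFFD instead.
decoded_wrong = unquote(latin1_query)

# When the charset is unknown, keeping the raw octets loses nothing.
raw_octets = unquote_to_bytes(latin1_query)

print(utf8_path, latin1_query)
print(repr(decoded_wrong), raw_octets)
```

The design question this illustrates is that a decoder cannot tell from the URI alone which charset produced the octets in the query or fragment, so an API either needs an explicit encoding argument or a way to return bytes.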