Bill Janssen <[EMAIL PROTECTED]> added the comment: My main fear with this patch is that "unquote" will become seen as unreliable, because naive software trying to parse URLs will encounter uses of percent-encoding where the encoded octets are not in fact UTF-8 bytes. They're just some set of bytes. A secondary concern is that it will invisibly produce invalid data, because it decodes some non-UTF-8-encoded string that happens to only use UTF-8-valid sequences as the wrong string value.
Now, I have to confess that I don't know how common these use cases are in actual URL usage. It would be nice if there was some organization that had a large collection of URLs, and could provide a test set we could run a scanner over :-). As a workaround, though, I've sent a message off to Larry Masinter to ask about this case. He's one of the authors of the URI spec. _______________________________________ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue3300> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com