This is a fairly specific question, but it gets at a more general issue I don't fully understand.
I recently updated httplib and urllib so that they work on the struni branch. A recurring problem with these libraries is that they call methods like strip() and split(). On a string object, calling these methods with no arguments means strip or split on whitespace. The bytes object has no corresponding default arguments; whitespace may not be well-defined for bytes. (Or is it?) In general, the approach was to read data as bytes off the socket and convert header lines to iso-8859-1 before processing them.

test_urllib2_localnet still fails. One of the problems is that BaseHTTPServer doesn't process HTTP responses correctly. Like httplib, it converts the HTTP status line to iso-8859-1. But it parses the rest of the headers by calling mimetools.Message, which is really rfc822.Message. The header lines of an RFC 822 message (really, RFC 2822) are ASCII, so it should be easy to do the conversion. However, rfc822.Message assumes it is reading from a text file and that readline() returns a string.

So the short question is: Should rfc822.Message require a text io object or a binary io object? Or should it accept either (via some new constructor or extra arguments to the existing constructor)?

I'm not sure how to design an API for bytes vs strings. The API used to be equally well suited for reading from a file or a socket, but they don't behave the same way anymore.

Jeremy
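
For concreteness, the "read bytes off the socket, decode header lines to iso-8859-1, then process them as text" approach described above might look roughly like the sketch below. It is written against present-day Python 3, not the struni branch, and it is not the code in httplib or rfc822. The helper name `read_headers`, the sample header data, and the use of `io.BytesIO` as a stand-in for a socket's binary stream are all assumptions made for illustration. Header continuation lines are deliberately not handled.

```python
import io


def read_headers(fp, encoding="iso-8859-1"):
    """Read header lines from a binary file-like object until a blank line.

    Hypothetical helper: every line is read as bytes, decoded once, and
    only then stripped and split using str methods.
    """
    headers = []
    while True:
        raw = fp.readline()  # bytes, e.g. b"Host: example.com\r\n"
        if raw in (b"\r\n", b"\n", b""):
            break  # a blank line (or EOF) ends the header block
        # iso-8859-1 maps every byte value to a code point, so this
        # decode step cannot fail on arbitrary input.
        line = raw.decode(encoding)
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return headers


# Simulated wire data (made up for this example); BytesIO plays the
# role of the binary stream that a socket would provide.
stream = io.BytesIO(
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
parsed = read_headers(stream)
body = stream.read()  # whatever follows the headers stays as bytes
```

In this sketch the parser requires a binary stream and does the decoding itself, which is only one of the options raised in the question; a parser that takes a text io object would push the decoding onto the caller instead.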
