[issue2275] urllib2 header capitalization
Rui Carmo rca...@gmail.com added the comment:

I'd like to add that when supplying custom headers for things like UPNP (which uses SOAPACTION as a header to talk to servers that are frequently very limited), the library shouldn't mangle the headers in any way whatsoever and should send them verbatim.

(I consider that mangling to be a bug, not a new feature. HTTP headers may be case-insensitive according to the standards, but embedded implementations require us to have a degree of control over the headers, and failing to preserve header case is a bug.)

Right now I've had to replace httplib and urllib2 with my own custom code because the SOAPACTION header is capitalized and sent to the server as Soapaction, which breaks the Intel embedded UPNP daemon.

--
nosy: +Rui.Carmo
type: feature request -> behavior
versions: +Python 2.7 -Python 3.3

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue2275
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
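The effect described in this comment can be reproduced with plain string methods. The following is only a minimal sketch: it assumes, as the original report further down this thread states, that urllib2 normalizes header names with str.capitalize(), and it does not touch urllib2 or the network at all.

```python
# Illustration only: how the two normalizations discussed in this thread
# treat an all-uppercase header name such as SOAPACTION.
name = "SOAPACTION"

capitalized = name.capitalize()  # first character upper, the rest lower
titled = name.title()            # each word capitalized, the rest lower

print(capitalized)
print(titled)

# Neither normalization preserves the spelling the caller supplied.
preserved = capitalized == name or titled == name
print(preserved)
```

Whichever of the two normalizations is applied, a server that matches the header name byte-for-byte would no longer see the spelling the caller asked for.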
[issue2275] urllib2 header capitalization
R. David Murray rdmur...@bitdance.com added the comment:

No idea if this is even still valid (I skimmed the issue, I did not try to understand it in detail), but I agree that a change like this is more feature than bug fix, so I'm updating the issue settings accordingly.

--
nosy: +r.david.murray
type: behavior -> feature request
versions: +Python 3.3 -Python 2.6, Python 3.0
[issue2275] urllib2 header capitalization
Senthil Kumaran orsent...@gmail.com added the comment:

I think we need to move forward with this. It is one of the earliest issues I worked on. I shall look at John's patch and see to its inclusion in the code.

--
assignee: facundobatista -> orsenthil
[issue2275] urllib2 header capitalization
John J Lee [EMAIL PROTECTED] added the comment:

I have attached a patch that just:

 * Improves doctests a bit

 * Changes .get_header() and .has_header() to be case-insensitive

 * Documents .get_header() and .header_items(), fixes some incorrectly-documented argument names, and notes the case-sensitivity change

 * Note that the headers passed to httplib (the original issue for which this bug was raised) were already correctly Title-Cased, and that is unchanged by this patch

Options:

 * Apply my patch. I'd be happy with this.

 * Apply my patch and begin the process of deprecating the public interfaces .unredirected_hdrs and .headers. Perhaps provide another interface to tell whether a header will be redirected (only if there's a use case for that). I'd be happy with this too.

 * Bring back Senthil's case-normalized .headers and .unredirected_hdrs and document those interfaces. This is a bad idea, because it would result in a very confusing set of interfaces for dealing with headers (see my previous comments on this -- Date: 2008-07-11 19:44).

(For what it's worth, I have also attached a doctest to show some examples of the broken invariants issue with Senthil's patch. The doctest also comments on the fact that making .headers case-insensitive in this way is quite confusing in the case where multiple items of different case are present, but __getitem__ returns only a single item -- this is a relatively minor issue, but still worth avoiding. The variation of choosing to discard items that normalize to the same string would avoid this problem, though it might break working code that relies on sending multiple headers with differing case, so I think this would be no better overall.)

Added file: http://bugs.python.org/file11886/issue2775.patch
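To make the "broken invariants" concern concrete, here is a small hypothetical illustration. The class below is NOT the CaseInsensitiveDict from the patch under review (that code does not appear in this thread); it is only a naive dict subclass of the kind that commonly runs into this problem, and its name is invented for the example.

```python
class NaiveCaseInsensitiveDict(dict):
    """Hypothetical example, not the class from the patch under review."""

    def __setitem__(self, key, value):
        dict.__setitem__(self, key.title(), value)

    def __getitem__(self, key):
        return dict.__getitem__(self, key.title())

    def __contains__(self, key):
        return dict.__contains__(self, key.title())


# The dict constructor does not go through the overridden __setitem__,
# so a key can enter the mapping without being normalized.
d = NaiveCaseInsensitiveDict({"foo-BAR": "baz"})

stored_keys = list(d)
print(stored_keys)

# An invariant users expect from any mapping: every key it yields is "in" it.
every_key_is_member = all(key in d for key in d)
print(every_key_is_member)
```

The mapping reports a key when iterated that it then denies containing, which is the kind of surprise a header container should avoid.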
[issue2275] urllib2 header capitalization
Changes by John J Lee [EMAIL PROTECTED]:

Added file: http://bugs.python.org/file11887/issue2775-problems.patch
[issue2275] urllib2 header capitalization
John J Lee [EMAIL PROTECTED] added the comment:

The CaseInsensitive dict class fails to preserve its invariants (implied invariants, since there are no tests for it). There are also problems with the documentation in the patch. I will submit a modified patch, I hope later this week.
[issue2275] urllib2 header capitalization
John J Lee [EMAIL PROTECTED] added the comment:

By the way, this is a feature addition, not a bug fix. The first beta releases for 2.6 and 3.0 came out some time ago, so according to PEP 361, this change should not be committed to trunk until after the 2.6 / 3.0 maintenance branches have been created.
[issue2275] urllib2 header capitalization
Senthil [EMAIL PROTECTED] added the comment:

Facundo, shall we go ahead with committing these changes?
[issue2275] urllib2 header capitalization
Facundo Batista [EMAIL PROTECTED] added the comment:

I'm ok with these patches, Senthil. John, what do you think about this?
[issue2275] urllib2 header capitalization
Changes by Senthil [EMAIL PROTECTED]:

Added file: http://bugs.python.org/file11024/issue2275-py3k.diff
[issue2275] urllib2 header capitalization
Senthil [EMAIL PROTECTED] added the comment:

I am submitting a revised patch for this issue. I did some analysis of the history of this issue and found that the .capitalize() vs .title() change had come up earlier too (issue1542948), and the decision was taken:

- To retain the header format in .capitalize() form to maintain backward compatibility.
- However, when the headers are passed to httplib, they are converted to .title() format (see the AbstractHTTPHandler method).
- Users are encouraged to use the .add_header(), .has_header() and .get_header() methods to check for headers instead of using the .headers dict directly (which will remain an undocumented interface).

Note to Hans-Peter: changing the headers to .title() tends to make .header_items() retrieval backward incompatible, so the headers will still be stored in .capitalize() format.

I have made the following changes to the patch:

1) Support for case-insensitive dict lookup, which works for .has_header() and .get_header(). So .has_header("User-Agent") will return True even when .headers gives {"User-agent": "blah"}.
2) Added tests for the behavior.
3) Changes to the docs to reflect this issue.

Btw, the undocumented .headers interface will also support case-insensitive lookup, so I have added tests for that too.

Let me know if you have any comments. Let's plan to close this issue.

Thanks,
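The lookup behaviour described in point 1) can be sketched without any dict subclass. The two functions below are hypothetical stand-ins written for this illustration (they take the header dict as an argument rather than being Request methods), and they are not taken from the attached patch.

```python
def has_header(headers, name):
    """Return True if `name` matches a stored key, ignoring case."""
    wanted = name.lower()
    return any(key.lower() == wanted for key in headers)


def get_header(headers, name, default=None):
    """Return the value stored for `name`, ignoring case, or `default`."""
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value
    return default


stored = {"User-agent": "blah"}  # key spelled as .capitalize() would store it

print(has_header(stored, "User-Agent"))
print(get_header(stored, "USER-AGENT"))
print(get_header(stored, "Accept", "n/a"))
```

Because the stored keys are left untouched, anything that iterates over the dict still sees the .capitalize() spelling.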
[issue2275] urllib2 header capitalization
Changes by Senthil [EMAIL PROTECTED]:

Removed file: http://bugs.python.org/file10862/issue2275-py26.diff
[issue2275] urllib2 header capitalization
Changes by Senthil [EMAIL PROTECTED]:

Removed file: http://bugs.python.org/file10863/issue2275-py3k.diff
[issue2275] urllib2 header capitalization
Changes by Senthil [EMAIL PROTECTED]:

Added file: http://bugs.python.org/file11023/issue2275-py26.diff
[issue2275] urllib2 header capitalization
John J Lee [EMAIL PROTECTED] added the comment:

> With respect to point 1), I assume that we all agree that headers
> should be stored in Titled-Format instead of Capitalized-format.

I would probably choose to store the headers in Capitalized-form, because that makes implementing .headers trivial.

[...]

> Now, if we go for case normalization at a much later stage, will the
> headers still be stored in capitalize() format? (In that case, this
> bug requests that they be stored in .titled() format, conforming to
> many practices.) Would you like to explain a bit more on that?

Implement .get_header() and friends using .headers, along the lines of:

    def get_header(self, header_name, default=None):
        return self.headers.get(
            header_name,
            self.unredirected_hdrs.get(header_name, default)).title()

And then ensure that the headers actually passed to httplib also get .title()-cased.

This also has the benefit, compared with your patch, of leaving the behaviour of non-HTTP URL schemes unchanged.
[issue2275] urllib2 header capitalization
John J Lee [EMAIL PROTECTED] added the comment:

Of course, that "along the lines of" suggestion isn't quite right: None does not have a .title() method.

(And, to spell it out, I'm assuming in that suggestion that .headers is the dict of headers with .capitalize()d keys, i.e. unchanged from Python 2.5.)
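One possible reading of that suggestion, with the None problem avoided, is sketched below. Everything here is an assumption made for illustration: SketchRequest is a simplified stand-in rather than urllib2.Request, wire_headers() is an invented name for "the headers actually passed to httplib", and letting .headers win over .unredirected_hdrs on a clash is a choice made for the sketch.

```python
class SketchRequest:
    """Hypothetical, simplified stand-in for urllib2.Request."""

    def __init__(self):
        self.headers = {}            # keys stored .capitalize()d, as in 2.5
        self.unredirected_hdrs = {}

    def add_header(self, key, val):
        self.headers[key.capitalize()] = val

    def add_unredirected_header(self, key, val):
        self.unredirected_hdrs[key.capitalize()] = val

    def get_header(self, header_name, default=None):
        # Normalize the lookup key rather than the result, so a missing
        # header simply yields `default` and nothing is called on None.
        key = header_name.capitalize()
        return self.headers.get(key, self.unredirected_hdrs.get(key, default))

    def wire_headers(self):
        # Title-case the names only at the point where they would be
        # handed to the HTTP layer; the stored dicts stay unchanged.
        merged = dict(self.unredirected_hdrs)
        merged.update(self.headers)
        return {name.title(): val for name, val in merged.items()}


req = SketchRequest()
req.add_header("user-agent", "demo")

print(req.get_header("User-Agent"))
print(req.get_header("X-Missing"))
print(req.headers)
print(req.wire_headers())
```

The stored keys keep the Python 2.5 spelling, so code that reads .headers or compares names from it is unaffected, while the names that would go out on the wire are title-cased.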
[issue2275] urllib2 header capitalization
Senthil [EMAIL PROTECTED] added the comment:

Sorry for the delay and my lapse in further communication on this issue.

I would like to take this issue on two fronts for its closure:

1) Issue with headers: .capitalize() vs .title()
2) Documenting the interface

With respect to point 1), I assume that we all agree that headers should be stored in Titled-Format instead of Capitalized-format. So I went ahead with the implementation of the Titled format with a case-insensitive lookup, so that previous code using the Capitalize format would also return values from the headers dict.

John: I agree with your point that these changes would break .header_items(), which would return a list of Titled key-value pairs, whereas previously existing code would be expecting Capitalized key-value pairs. Case-insensitive dict lookup would not solve that. I had assumed that new code would conform to it, and changed the tests. Even though I thought about it, I did not bring up the backward compatibility of the header_items() method for discussion.

- I don't have a solution for how to make header_items() backward compatible if we go for the headers title() change. I shall try to come up with one by today.

Now, if we go for case normalization at a much later stage, will the headers still be stored in capitalize() format? (In that case, this bug requests that they be stored in .titled() format, conforming to many practices.) Would you like to explain a bit more on that?

We can address the documentation of the interface later, after coming to a conclusion on the first one.
[issue2275] urllib2 header capitalization
Hans-Peter Jansen [EMAIL PROTECTED] added the comment:

Facundo, first of all: *really* nice work, thanks a lot.

While I don't fully understand the issues raised lately here, especially what broke (user code), I can see that it completely solves my original problem, which is great.

While reading the patch, I noticed that the modifications to Doc/library/urllib2.rst contain two typos ("retrive" instead of "retrieve").

The only remaining issue in this area is using some form of ordered dict for headers in order to control - how it comes - the order of headers ;-), but that's left for another issue to raise some day.
[issue2275] urllib2 header capitalization
John J Lee [EMAIL PROTECTED] added the comment:

There already is a test for the breakage, but the patch changes the expected output to match the new return value of .header_items():

-[('Foo-bar', 'baz'), ('Spam-eggs', 'blah')]
+[('Foo-Bar', 'baz'), ('Spam-Eggs', 'blah')]

Code that previously worked fine, for example code that iterates through .header_items() and does string comparisons on the header names, or does "in" tests on the keys of dict(header_items), etc., will break, the existence of .has_header() notwithstanding.

What is the purpose of this change? Can you explain how that justifies breaking working code?

The alternative to this change is to leave the return value of .header_items() unchanged. That could be done by performing case normalisation at a later stage.
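The kind of breakage described here can be shown with the two lists from the test diff. The helper functions are hypothetical caller code invented for this sketch, not anything from urllib2.

```python
# Return values of .header_items() before and after the patch, copied
# from the expected-output change quoted in the comment above.
old_items = [("Foo-bar", "baz"), ("Spam-eggs", "blah")]
new_items = [("Foo-Bar", "baz"), ("Spam-Eggs", "blah")]


def values_for(items, name):
    """Exact, case-sensitive match, as existing caller code might do."""
    return [value for key, value in items if key == name]


def values_for_any_case(items, name):
    """A comparison that does not depend on the stored spelling."""
    wanted = name.lower()
    return [value for key, value in items if key.lower() == wanted]


print(values_for(old_items, "Foo-bar"))
print(values_for(new_items, "Foo-bar"))
print(values_for_any_case(new_items, "Foo-bar"))
```

Caller code written like values_for() silently stops finding the header once the spelling changes, without raising any error.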
[issue2275] urllib2 header capitalization
John J Lee [EMAIL PROTECTED] added the comment:

>> * The patch to the docs seems to muddy the waters even further (than
>> the current slightly murky state) about whether and why .headers is
>> to be preferred over the methods, or vice-versa. I think .headers
>> should remain undocumented, for the reason stated by the doctest
>> that failed with Hans-Peter's original patch.
>
> IIRC, Hans-Peter's comment was on the reference to .headers
> undocumented interface mentioned in the test_urllib2.

I made no reference to Hans-Peter's comment -- only to his patch, so I don't know what you're getting at here. Could you respond to my comment, please?

>> The best course of action is debatable, but the patch is a
>> regression in this respect, so should not be committed as-is.
>
> My understanding in this case was to address 1) Title()-ize the
> headers and 2) provide a case insensitive lookup.

Can you explain why you think providing case-insensitive lookup requires documenting .headers?

> Is this sufficient now to expose the headers method? If not, what
> else? If headers method should not be exposed, then will the 2 cases
> addressed above still do, as this issue request was opened for that?

There is no method named headers. I think that the .headers attribute should never be documented, because it does not contain the unredirected headers. Changing that behaviour would break code, further confuse matters and complicate writing code that works with multiple versions of Python. A case *could* be made for changing it (by including the unredirected headers), because other code will have been broken by the introduction of .unredirected_hdrs; I prefer instead to stick with old breakage rather than swapping it for new breakage, but as I say, the best course of action is debatable. Another choice would be to start the process of deprecating .headers, and introduce a new attribute with a similar function.

As long as your chosen solution isn't actually a step backwards or sharply sideways, I probably won't object very strongly. What isn't really debatable is that the patch makes things worse here: it adds a newly-documented interface which is subtly and surprisingly different from the other one (an unacceptable change in itself, IMO) without even explaining the difference between the two, why they are different, and why one would want to use or avoid one or other interface.

There are other problems with the documentation patch, but there's no point in discussing those until the changes are agreed on.
[issue2275] urllib2 header capitalization
John J Lee [EMAIL PROTECTED] added the comment:

Just to quickly note that by "providing case-insensitive lookup" I don't necessarily mean via .headers. But it's you who's providing the patch, so I'll wait for your next suggestion rather than discuss how I might change the code.
[issue2275] urllib2 header capitalization
John J Lee [EMAIL PROTECTED] added the comment:

* The patch looks like it will break code that uses .header_items(), which is not acceptable.

* The patch to the docs seems to muddy the waters even further (than the current slightly murky state) about whether and why .headers is to be preferred over the methods, or vice-versa. I think .headers should remain undocumented, for the reason stated by the doctest that failed with Hans-Peter's original patch. The best course of action is debatable, but the patch is a regression in this respect, so should not be committed as-is.
[issue2275] urllib2 header capitalization
Facundo Batista [EMAIL PROTECTED] added the comment:

John: Are you saying that it will break code because it changes the capitalization policy, or for some other reason? Do you think that there's a way to fix this issue without breaking the code?

If you really think that this breaks code, please provide a test case.

Regarding .headers, leaving a public attribute undocumented is never a better solution than documenting it, with its benefits and its shortcomings.

Thanks.
[issue2275] urllib2 header capitalization
Senthil [EMAIL PROTECTED] added the comment:

> John J Lee [EMAIL PROTECTED] added the comment:
>
> * The patch looks like it will break code that uses .header_items(),
> which is not acceptable.

Nope, it does not break. If the concern was that subclassing dict may have an adverse effect on the .copy and .update methods, that's taken care of, because the subclass just passes them to the original dict method, which would behave the same way.

>>> r.header_items()
[('Spam-Eggs', 'blah')]
>>> r.add_header("Foo-Bar", "baz")
>>> items = r.header_items()
>>> items.sort()
>>> items
[('Foo-Bar', 'baz'), ('Spam-Eggs', 'blah')]

> * The patch to the docs seems to muddy the waters even further (than
> the current slightly murky state) about whether and why .headers is
> to be preferred over the methods, or vice-versa. I think .headers
> should remain undocumented, for the reason stated by the doctest
> that failed with Hans-Peter's original patch.

IIRC, Hans-Peter's comment was on the reference to .headers undocumented interface mentioned in the test_urllib2.

> The best course of action is debatable, but the patch is a
> regression in this respect, so should not be committed as-is.

My understanding in this case was to address 1) Title()-ize the headers and 2) provide a case insensitive lookup.

Is this sufficient now to expose the headers method? If not, what else? If headers method should not be exposed, then will the 2 cases addressed above still do, as this issue request was opened for that?
[issue2275] urllib2 header capitalization
Facundo Batista [EMAIL PROTECTED] added the comment:

Senthil: patch is fine. Remember to provide not only a modification for docs, but also to the Misc/NEWS file.

Thank you!!
[issue2275] urllib2 header capitalization
Changes by Senthil [EMAIL PROTECTED]:

Removed file: http://bugs.python.org/file10849/issue2275-py26.diff
[issue2275] urllib2 header capitalization
Senthil [EMAIL PROTECTED] added the comment:

Here is the final patch for Py26 and Py3k, including the docs and Misc/NEWS.

Thank you,
Senthil

Added file: http://bugs.python.org/file10862/issue2275-py26.diff
[issue2275] urllib2 header capitalization
Changes by Senthil [EMAIL PROTECTED]:

--
versions: -Python 2.5

Added file: http://bugs.python.org/file10863/issue2275-py3k.diff
[issue2275] urllib2 header capitalization
Senthil [EMAIL PROTECTED] added the comment:

I also removed Python 2.5 from the versions, as I don't think these changes will be backported. After the patch is applied, this issue can be closed.
[issue2275] urllib2 header capitalization
Senthil [EMAIL PROTECTED] added the comment:

Please have a look at this patch.

- Included a CaseInsensitiveDict lookup for the headers interface.
- Headers will now be .title()-ed instead of .capitalize()-ed.
- Included tests for the changes made.

In test_urllib2, I have not removed this line (yet): "The Request.headers dictionary is not a documented interface."

- I shall attach the patch to the documentation next. Will this suffice to remove the declaration of "not a documented interface"?

Please provide your comments on the attached patch. If this is fine, I shall do the same modifications for py3k and patch the docs as well.

Thanks!

Added file: http://bugs.python.org/file10849/issue2275-py26.diff
[issue2275] urllib2 header capitalization
Facundo Batista [EMAIL PROTECTED] added the comment:

Senthil: We would need some tests to ensure this will keep working ok in the future.

Also, as this is (somehow) new functionality, we'd need to modify the NEWS file and maybe even the docs (a comment about this case insensitivity?).

Could you please send a patch for these? Thank you!

--
assignee: -> facundobatista
nosy: +facundobatista
[issue2275] urllib2 header capitalization
Changes by Senthil [EMAIL PROTECTED]:

Removed file: http://bugs.python.org/file9907/issue2275.patch
[issue2275] urllib2 header capitalization
Senthil [EMAIL PROTECTED] added the comment:

Issue applicable to Py2.6 and Py3K. Previous patch attached was wrong. Removed it.

--
versions: +Python 2.6, Python 3.0
[issue2275] urllib2 header capitalization
Martin McNickle [EMAIL PROTECTED] added the comment:

This looks good. I would suggest that unredirected_hdrs use the CaseInsensitiveDict too.

There is still the problem (from the test documentation) that accessing .headers directly will only show a subset of headers, i.e. it won't show any headers from .unredirected_hdrs. Have you any suggestions as to how they both can be accessed from the same interface?

The test documentation also says:

"Note the case normalization of header names here, to .capitalize()-case. This should be preserved for backwards-compatibility. (In the HTTP case, normalization to .title()-case is done by urllib2 before sending headers to httplib)."

It suggests that capitalize() should be kept for backwards compatibility. I have tested, and the headers actually sent to the server are in title()-case even though they are stored in the Request object in capitalize()-case. This would initially suggest that the case normalization should be removed from the patch. However, as .headers would now be using a case-insensitive dictionary, this would still ensure backwards compatibility.

--
nosy: +BitTorment
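As one way to picture "both accessed from the same interface", the sketch below layers the two dicts with collections.ChainMap. This is purely illustrative and makes several assumptions: ChainMap exists only in Python 3.3 and later (so it was not an option for urllib2 at the time), the sample header values are invented, and giving .headers priority over .unredirected_hdrs on a clash is a choice made for the sketch.

```python
from collections import ChainMap

# Two separate stores, as on a Request object (sample values invented).
headers = {"User-agent": "demo"}
unredirected_hdrs = {"Host": "example.com", "User-agent": "ignored"}

# A read-through view: lookups try `headers` first, then fall back to
# `unredirected_hdrs`. Neither underlying dict is copied or modified.
combined = ChainMap(headers, unredirected_hdrs)

print(combined["Host"])
print(combined["User-agent"])
print(sorted(combined))
```

Such a view would show every header through one mapping, though it does nothing about the case-sensitivity question discussed elsewhere in this thread.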
[issue2275] urllib2 header capitalization
Hans-Peter Jansen [EMAIL PROTECTED] added the comment:

> But should not this patch be handled in a way wherein
> key.capitalize() is just replaced by key.upper()?

Hmm, are you sure?

>>> "hello".upper()
'HELLO'

but the issue is with values containing dashes:

>>> 'accept-charset'.capitalize()
'Accept-charset'

whereas the rest of the world would expect:

'Accept-Charset'
        ^

This is especially useful if you try to emulate the behavior of a typical browser as closely as possible.
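The three string methods that come up in this exchange can be compared side by side. This is a plain-Python illustration using only built-in str methods; the variable names are chosen for the example.

```python
name = "accept-charset"

variants = {
    "upper": name.upper(),            # every letter upper-cased
    "capitalize": name.capitalize(),  # only the very first letter
    "title": name.title(),            # first letter of each word
}

for label, value in variants.items():
    print(label, value)
```

Only the title-style result matches the spelling that browsers conventionally send.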
[issue2275] urllib2 header capitalization
John J Lee [EMAIL PROTECTED] added the comment:

Specifically, these improvements could be made:

 * the headers actually sent to httplib could be normalized to Standard-Http-Case by urllib2

 * the urllib2.Request.headers interface could support case-insensitive key lookup
[issue2275] urllib2 header capitalization
John J Lee [EMAIL PROTECTED] added the comment:

urllib2.Request.headers is, in practice, an undocumented public interface. Did you run the tests? There is room for improvement here, but not in the way you suggest.

python[1]$ python2.6 iPython 2.6a1+ (trunk:62045M, Mar 30 2008, 03:07:23)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import test.test_urllib2
>>> print test.test_urllib2.test_request_headers_dict.__doc__

    The Request.headers dictionary is not a documented interface. It should
    stay that way, because the complete set of headers are only accessible
    through the .get_header(), .has_header(), .header_items() interface.
    However, .headers pre-dates those methods, and so real code will be using
    the dictionary.

    The introduction in 2.4 of those methods was a mistake for the same
    reason: code that previously saw all (urllib2 user)-provided headers in
    .headers now sees only a subset (and the function interface is ugly and
    incomplete). A better change would have been to replace .headers dict
    with a dict subclass (or UserDict.DictMixin instance?) that preserved the
    .headers interface and also provided access to the unredirected headers.
    It's probably too late to fix that, though.

    Check .capitalize() case normalization:

    >>> url = "http://example.com"
    >>> Request(url, headers={"Spam-eggs": "blah"}).headers["Spam-eggs"]
    'blah'
    >>> Request(url, headers={"spam-EggS": "blah"}).headers["Spam-eggs"]
    'blah'

    Currently, Request(url, "Spam-eggs").headers["Spam-Eggs"] raises KeyError,
    but that could be changed in future.

--
nosy: +jjlee
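The behaviour that docstring describes can be modelled in a few lines. MiniRequest below is a hypothetical stand-in written for this illustration, not urllib2.Request; it assumes only what the docstring and the original report state, namely that header names are stored via .capitalize().

```python
class MiniRequest:
    """Hypothetical stand-in for the header storage of urllib2.Request."""

    def __init__(self, headers=None):
        self.headers = {}
        for key, value in (headers or {}).items():
            self.add_header(key, value)

    def add_header(self, key, value):
        # Mirrors the .capitalize() normalization the docstring checks.
        self.headers[key.capitalize()] = value


req = MiniRequest({"spam-EggS": "blah"})
print(req.headers)

# A plain dict is case-sensitive, so the Title-Case spelling is not found.
try:
    req.headers["Spam-Eggs"]
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)
```

This is the KeyError the last sentence of the docstring refers to.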
[issue2275] urllib2 header capitalization
Senthil [EMAIL PROTECTED] added the comment:

Hi John,

Greetings! I agree with both of your suggestions. Attached is the patch which aims to implement both in one go. Please provide your comments on that. If this method is okay, I shall go ahead with patches for the tests and attach them also.

Thanks,
Senthil

Added file: http://bugs.python.org/file9906/issue2275.patch
[issue2275] urllib2 header capitalization
Changes by Senthil [EMAIL PROTECTED]:

Removed file: http://bugs.python.org/file9906/issue2275.patch
[issue2275] urllib2 header capitalization
Changes by Senthil [EMAIL PROTECTED]:

Added file: http://bugs.python.org/file9907/issue2275.patch
[issue2275] urllib2 header capitalization
Hans-Peter Jansen [EMAIL PROTECTED] added the comment:

Hi Senthil,

that looks promising, and the title() trick is nice, as it fixes my issue.

Thanks,
Pete
[issue2275] urllib2 header capitalization
Senthil [EMAIL PROTECTED] added the comment:

I understand your implementation of the _cap_header_key function. But should not this patch be handled in a way wherein key.capitalize() is just replaced by key.upper()? Should that not suffice? What is the difference between _cap_header_key and key.upper()?

Thank you,
Senthil

--
nosy: +orsenthil
[issue2275] urllib2 header capitalization
New submission from Hans-Peter Jansen [EMAIL PROTECTED]:

The urllib2 behavior related to headers is - hmm - improvable. It simply capitalize()s the key, which leads to funny results like:

Accept-charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7

While this is seemingly conforming to the specs, it's simply different from every other implementation of such things. And we can do better. How about:

--- /usr/lib/python/urllib2.py 2008-01-10 19:03:55.0 +0100
+++ urllib2.py 2008-03-11 21:25:33.523890670 +0100
@@ -261,13 +261,16 @@ class Request:
     def is_unverifiable(self):
         return self.unverifiable
 
+    def _cap_header_key(self, key):
+        return '-'.join((ck.capitalize() for ck in key.split('-')))
+
     def add_header(self, key, val):
         # useful for something like authentication
-        self.headers[key.capitalize()] = val
+        self.headers[self._cap_header_key(key)] = val
 
     def add_unredirected_header(self, key, val):
         # will not be added to a redirected request
-        self.unredirected_hdrs[key.capitalize()] = val
+        self.unredirected_hdrs[self._cap_header_key(key)] = val
 
     def has_header(self, header_name):
         return (header_name in self.headers or

I'm not happy with the _cap_header_key name, but you get the idea. The patch is optimized to operate with maximum locality. It's also attached.

I would be very grateful if something similar could be applied.

Opinions?

--
components: Library (Lib)
files: urllib2-cap-headers.diff
keywords: patch
messages: 63466
nosy: frispete
severity: minor
status: open
title: urllib2 header capitalization
type: behavior
versions: Python 2.5

Added file: http://bugs.python.org/file9658/urllib2-cap-headers.diff
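For anyone who wants to try the idea outside urllib2, here is a standalone sketch of the same per-segment capitalization, compared with str.title(). The function name cap_header_key is a stand-in for the method in the patch, and "x-3d-mode" is a made-up header name used only to show where the two approaches differ.

```python
def cap_header_key(key):
    """Capitalize each dash-separated segment of a header name."""
    return "-".join(part.capitalize() for part in key.split("-"))


for raw in ("accept-charset", "USER-AGENT", "x-3d-mode"):
    print(raw.capitalize(), cap_header_key(raw), raw.title())
```

For ordinary header names the per-segment helper and str.title() agree; they differ only when a segment starts with a non-letter, because title() upper-cases the first letter that follows any non-letter character.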