Martin Panter added the comment:

There is an inconsistency when parsing with headersonly=True. According to the 
documentation, get_payload() with message/rfc822 should return a list of 
Message objects, not a string. But using headersonly=True produces a 
non-multipart Message object:

>>> from email.parser import Parser
>>> m = Parser().parsestr("Content-Type: message/rfc822\r\n\r\n",
...     headersonly=True)
>>> m.get_content_type()
'message/rfc822'
>>> m.is_multipart()  # Doc says True
False
>>> m.get_payload()  # Doc says list of Message objects
''
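
For comparison, here is a small sketch that parses the same input with and without headersonly=True. The variable names are only illustrative, and the default-parse result is what I would expect from the documented behaviour rather than a guarantee for every Python version:

```python
from email.parser import Parser

raw = "Content-Type: message/rfc822\r\n\r\n"

# Default parsing: the body of a message/* type is parsed as a nested message,
# so the payload should be a list of Message objects.
full = Parser().parsestr(raw)

# headersonly=True: everything after the header section is stored as a string.
head = Parser().parsestr(raw, headersonly=True)

for label, msg in (("default", full), ("headersonly", head)):
    print(label, type(msg.get_payload()).__name__, msg.is_multipart())
```

The two results differ in both the payload type and is_multipart(), which is the inconsistency described above.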

Related to this, setting headersonly=True can also cause an internal 
inconsistency. Maybe this is why it was called a “hack”:

>>> Parser().parsestr("Content-Type: message/delivery-status\r\nInvalid line\r\n\r\n",
...     headersonly=True).as_string()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.5/email/message.py", line 159, in as_string
    g.flatten(self, unixfrom=unixfrom)
  File "/usr/lib/python3.5/email/generator.py", line 115, in flatten
    self._write(msg)
  File "/usr/lib/python3.5/email/generator.py", line 181, in _write
    self._dispatch(msg)
  File "/usr/lib/python3.5/email/generator.py", line 214, in _dispatch
    meth(msg)
  File "/usr/lib/python3.5/email/generator.py", line 331, in _handle_message_delivery_status
    g.flatten(part, unixfrom=False, linesep=self._NL)
  File "/usr/lib/python3.5/email/generator.py", line 106, in flatten
    old_msg_policy = msg.policy
AttributeError: 'str' object has no attribute 'policy'

I think it may be best to change get_payload() to return a string only in the 
next Python version (3.7), with appropriate documentation updates. For existing 
Python versions, perhaps urllib3 could check whether the list returned by 
get_payload() contains only trivial empty Message objects (no header fields and 
only empty payloads themselves).
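
A rough sketch of what such a check might look like follows. The helper name is_trivially_empty is hypothetical; it is not part of urllib3 or the email package, and it assumes the default (compat32) policy:

```python
from email.message import Message
from email.parser import Parser


def is_trivially_empty(msg):
    """Return True if msg carries no real body data.

    Hypothetical helper, not part of urllib3 or the email package.
    """
    payload = msg.get_payload()
    if payload is None or payload == "":
        return True
    if isinstance(payload, list):
        # Every nested part must have no header fields and itself be empty.
        return all(
            not part.keys() and is_trivially_empty(part) for part in payload
        )
    return False


# Header section followed only by the blank line.
headers_only = Parser().parsestr("Content-Type: message/rfc822\r\n\r\n")

# Real data after the blank line, which the parser treats as a nested message.
with_body = Parser().parsestr(
    "Content-Type: message/rfc822\r\n\r\nSubject: x\r\n\r\nbody"
)

print(is_trivially_empty(headers_only), is_trivially_empty(with_body))
```

The recursion is there because a nested message could itself be a message/* type with another nested part.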

If we agree that only a feature change for 3.7 is appropriate, there are other 
problems with the current parsing of HTTP headers that could also be looked at:

* Only a blank line should end a header section (Issue 24363, Issue 26686)
* “From” line should be a defect
* Use “email” package’s HTTP parsing policy
* Don’t assume Latin-1 encoding (Issue 27716)
* Avoid double-handling (header lines are parsed in http.client, then joined 
together and parsed again in email.feedparser)
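
To illustrate the policy item in the list above, here is a minimal sketch of parsing a header block with email.policy.HTTP. The header values are made up for the example, and this is not what http.client currently does:

```python
from email import policy
from email.parser import Parser

# Made-up header block, already decoded to str.
raw = "Content-Type: text/html; charset=utf-8\r\nContent-Length: 0\r\n\r\n"

# policy.HTTP uses "\r\n" line endings and no line-length limit.
msg = Parser(policy=policy.HTTP).parsestr(raw, headersonly=True)

print(msg.get_content_type())
print(msg["Content-Length"])
print(msg.defects)
```

Any parsing problems are recorded in msg.defects rather than raised, so a caller would have to inspect that list explicitly.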

----------
components: +email
nosy: +barry, martin.panter
versions:  -Python 3.3, Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29353>
_______________________________________