Ezio Melotti <ezio.melo...@gmail.com> added the comment:

If we can't fix the behavior, it should at least be documented.

Currently the docs say "This function returns a list of (decoded_string, 
charset) pairs containing each of the decoded parts of the header.".  One could 
assume that this means that a Unicode string is returned, but as far as I 
can tell, "decoded_string" means decoded from the format used by the header, 
not from bytes -- in fact the example below shows a byte string.
#24797 suggests an alternative solution, but there is no indication of it in 
the docs except an easy-to-miss note about the new API at the top.

Coincidentally, as I was reporting this issue, I also found the recently opened 
#37139.  There are also a few other related reports: #24797, #37139, #32975, 
#6302, #4661.

If this method is not actually deprecated, I would document the current 
behavior (i.e. sometimes it returns bytes, sometimes unicode -- bonus points if 
there's a simple rule to predict which one), explain that it exists for 
legacy/backward-compatibility reasons, and point to the alternatives.
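
As an illustration of what "point to the alternatives" could look like, here 
is a minimal sketch that parses a message with email.policy.default, under 
which header values come back as str.  The sample message is made up for 
illustration, and this is only one possible alternative, not necessarily the 
one the docs should recommend.

```python
from email import message_from_string
from email.policy import default

# Made-up sample message with an RFC 2047 encoded-word in the Subject.
raw = "Subject: =?utf-8?b?SOKCrGxsbw==?=\n\nbody\n"

# With policy=default the newer header API is used, so indexing the
# message yields an already-decoded str (a str subclass), not bytes.
msg = message_from_string(raw, policy=default)
subject_text = str(msg["Subject"])
print(subject_text)
```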


FWIW here are 3 more samples that show the inconsistency.

>>> from email.header import decode_header
>>> # str + None
>>> h = '\x80SOKCrGxsbw===== <he...@example.com>'; decode_header(h)
[('\x80SOKCrGxsbw===== <he...@example.com>', None)]
>>> # bytes + '', bytes + None
>>> h = '=??b?SOKCrGxsbw=====?= <he...@example.com>'; decode_header(h)
[(b'H\xe2\x82\xacllo', ''), (b' <he...@example.com>', None)]
>>> # bytes + 'utf8', bytes + None
>>> h = '=?utf8?b?SOKCrGxsbw==?= <he...@example.com>'; decode_header(h)
[(b'H\xe2\x82\xacllo', 'utf8'), (b' <he...@example.com>', None)]
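
Until the behavior is fixed or documented, callers have to handle both 
return types themselves.  A minimal sketch of such a workaround follows; 
decode_header_to_str is a hypothetical helper (not part of the email package), 
the address is a made-up example, and falling back to ASCII with 
errors="replace" is my assumption rather than anything the docs prescribe.

```python
from email.header import decode_header


def decode_header_to_str(value, fallback="ascii"):
    """Hypothetical helper: join decode_header() parts into a single str."""
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            # charset can be None or '' (see the samples above), so fall
            # back to an assumed default in those cases.
            try:
                parts.append(chunk.decode(charset or fallback, errors="replace"))
            except LookupError:
                # Unknown charset name: use the fallback codec instead.
                parts.append(chunk.decode(fallback, errors="replace"))
        else:
            # Headers without encoded words come back as str already.
            parts.append(chunk)
    return "".join(parts)


print(decode_header_to_str("=?utf-8?b?SOKCrGxsbw==?= <user@example.com>"))
```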

----------
assignee:  -> docs@python
components: +Documentation
nosy: +docs@python, ezio.melotti, louis.abra...@yahoo.fr
resolution: duplicate -> 
stage: resolved -> needs patch
status: closed -> open
type: behavior -> enhancement

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue21492>
_______________________________________