New submission from Christoph Schneeberger <[EMAIL PROTECTED]>:

email.Header.decode_header() does not correctly handle multiline header lines. Revision 54371 of header.py (1) changed the behaviour: previously, multiline headers were parsed correctly, but 54371 introduced a new regex part that renders such headers invalid, so they are no longer parsed as expected.

Given the following header line (it doesn't matter whether it is parsed from a mail or read from a string), which IMHO represents a valid RFC 2047 header line:

    from email.Header import decode_header
    decode_header('=?windows-1252?Q?=22M=FCller_T=22?=\r\n <[EMAIL PROTECTED]>')

this will result in:

header.py (54371):
    [('=?windows-1252?Q?=22M=FCller_T=22?=\r\n <[EMAIL PROTECTED]>', None)]

and with header.py (54370):
    [('"M\xfcller T"', 'windows-1252'), (' <[EMAIL PROTECTED]>', None)]

Actually, both seem to be parsed wrongly, but with 54370 the result looks saner (IMO the leading space should be removed).

Once the CRLF sequence is removed from the header, it works fine and everything looks as expected:

    >>> decode_header('=?windows-1252?Q?=22M=FCller_T=22?= <[EMAIL PROTECTED]>')
    [('"M\xfcller T"', 'windows-1252'), ('<[EMAIL PROTECTED]>', None)]

This problem might or might not be related to:
- issue 1372770
- issue 1467619

(1) http://svn.python.org/view?rev=54371&view=rev

----------
components: Library (Lib)
messages: 65630
nosy: cschnee
severity: normal
status: open
title: decode_header() fails on multiline headers
type: behavior
versions: Python 2.4, Python 2.5

__________________________________
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2658>
__________________________________
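
Since the report notes that decoding works once the CRLF is removed, a possible stopgap is to unfold the header before handing it to decode_header(). The following is only a minimal sketch, not a fix for header.py: unfold_header() is a hypothetical helper introduced here, and the code assumes the lowercase email.header module name rather than the email.Header spelling used above. It drops each line break that is directly followed by a space or tab, which is how RFC 5322 describes unfolding.

```python
import re
from email.header import decode_header

# Hypothetical helper (not part of the email package): undo header folding
# by dropping any line break that is directly followed by a space or tab.
_FOLD = re.compile(r'\r?\n(?=[ \t])')

def unfold_header(value):
    """Return value with folded line breaks removed; whitespace is kept."""
    return _FOLD.sub('', value)

# The folded header line from the report, with the redacted address kept as-is.
raw = '=?windows-1252?Q?=22M=FCller_T=22?=\r\n <[EMAIL PROTECTED]>'

unfolded = unfold_header(raw)
parts = decode_header(unfolded)

# Each entry is a (decoded_chunk, charset) pair; charset is None for
# chunks that were not RFC 2047 encoded words.
for chunk, charset in parts:
    print(repr(chunk), charset)
```

The exact shape of the decoded chunks (str or bytes, and whether the space before the address is kept) differs between Python versions, so the sketch prints them rather than assuming one form.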