Bugs item #640110, was opened at 2002-11-18 15:33
Message generated for change (Comment added) made by kalinda
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=640110&group_id=5470

Category: Python Library
Group: Python 2.2.2
Status: Closed
Resolution: Fixed
Priority: 5
Submitted By: Anders Hammarquist (iko)
Assigned to: Barry A. Warsaw (bwarsaw)
Summary: email.Header misparses mixed headers

Initial Comment:
email.Header.decode_header() misparses headers that contain both
encoded and unencoded words. This example from RFC2047

    =?ISO-8859-1?Q?Andr=E9?= Pirard <[EMAIL PROTECTED]>

gets parsed as

    AndréPirard <[EMAIL PROTECTED]>

where there should obviously be a space between André and Pirard.
RFC2047 says to ignore spaces between encoded words (but not between
encoded and unencoded words, though it doesn't explicitly say so from
what I could find, and obviously not between unencoded words).

Also, I see it's trying to handle continuation lines, but it only
does so if there are encoded words in the continuation line. It
barfs badly on this test case:

    'Re: =?mac-iceland?q?r=8Aksm=9Arg=8Cs?= baz\n foo bar =?mac-iceland?q?r=8Aksm=9Arg=8Cs?='

I think I'll just do a patch...

/Anders

P.S. It seems at least remotely related to Bug#552957

----------------------------------------------------------------------

Comment By: jonny reichwald (kalinda)
Date: 2005-04-27 14:23

Message:
Logged In: YES
user_id=661399

I am using Python 2.4 and still have this problem. To be more exact,
line 73 in Header.py still strips the parts.

Is there a reason for this not being fixed?

----------------------------------------------------------------------

Comment By: Anders Hammarquist (iko)
Date: 2003-03-06 17:43

Message:
Logged In: YES
user_id=14

Looks OK.

----------------------------------------------------------------------

Comment By: Barry A. Warsaw (bwarsaw)
Date: 2003-03-06 17:21

Message:
Logged In: YES
user_id=12800

Try current cvs.

----------------------------------------------------------------------

Comment By: Anders Hammarquist (iko)
Date: 2003-03-06 15:15

Message:
Logged In: YES
user_id=14

The first bug is still there...
With version 1.19 from CVS I get this with my example:

    >>> print unicode(Header.make_header(Header.decode_header('=?ISO-8859-1?Q?Andr=E9?= Pirard <[EMAIL PROTECTED]>'))).encode('latin-1')
    AndréPirard <[EMAIL PROTECTED]>

The problem is that whitespace gets stripped off on line 91:

    unenc = parts.pop(0).strip()

before we know whether it is significant or not.

The continuation line bug seems to be fixed, however.

/Anders

----------------------------------------------------------------------

Comment By: Barry A. Warsaw (bwarsaw)
Date: 2003-03-06 07:50

Message:
Logged In: YES
user_id=12800

The first bug above has already been fixed in email 2.5 (python 2.3
cvs). The second pointed to a real bug, now fixed I believe.

----------------------------------------------------------------------
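
For anyone re-checking the behaviour discussed in this thread on a current interpreter, the following is a minimal sketch, assuming Python 3 and its standard email.header module (not the Python 2 email.Header code the thread refers to). The helper name decode_mixed_header and the address user@example.invalid are hypothetical, made up for this illustration; the address stands in for the redacted one in the report. The sketch joins the chunks returned by decode_header() without adding or removing any whitespace, so a missing space between "André" and "Pirard" would show up directly in the result.

```python
from email.header import decode_header


def decode_mixed_header(raw):
    """Join the (chunk, charset) pairs from decode_header() into one str.

    Hypothetical helper for illustration. No whitespace is added or
    removed here, so the result reflects what decode_header() kept.
    """
    pieces = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            # Unencoded chunks come back with charset None; treat as ASCII.
            pieces.append(chunk.decode(charset or "ascii", errors="replace"))
        else:
            # A header with no encoded words is returned as a plain str.
            pieces.append(chunk)
    return "".join(pieces)


# Mixed header from the report (placeholder address, an assumption).
mixed = "=?ISO-8859-1?Q?Andr=E9?= Pirard <user@example.invalid>"
print(decode_mixed_header(mixed))

# Two adjacent encoded words: RFC 2047 says the space between them is ignored.
adjacent = "=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?="
print(decode_mixed_header(adjacent))
```

If the space between the encoded and the unencoded word is preserved, the first line printed reads "André Pirard", while the second shows the two adjacent encoded words run together as RFC 2047 requires.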