Bugs item #1588217, was opened at 2006-10-31 21:06
Message generated for change (Comment added) made by gbrandl
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1588217&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: Python 2.4
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Wai Yip Tung (tungwaiyip)
>Assigned to: Georg Brandl (gbrandl)
Summary: quoted printable parse the sequence '= ' incorrectly

Initial Comment:
>>> import quopri
>>> s = 'I say= a secret message\r\nThank you'
>>> quopri.a2b_qp
<built-in function a2b_qp>
>>> quopri.decodestring(s)  # use the C version binascii.a2b_qp() to decode
'I sayThank you'
>>> quopri.a2b_qp=None
>>> quopri.decodestring(s)  # use the Python version quopri.decode() to decode
'I say= a secret message\nThank you'

Note that the sequence '= ' is invalid according to RFC 2045 section 6.7:

-------------------------------------------------------
An "=" followed by a character that is neither a
hexadecimal digit (including "abcdef") nor the CR
character of a CRLF pair is illegal ... A reasonable
approach by a robust implementation might be to include
the "=" character and the following character in the
decoded data without any transformation
-------------------------------------------------------

The lenient interpretation is used by the Python version of the parser,
quopri.decode(), to produce the second string. Most email clients use a
similar lenient interpretation.

The C version of the parser, binascii.a2b_qp(), which is used in preference
to the Python version, produces a surprising result: the string
'a secret message' is omitted. This may create an opportunity for spammers
to insert a secret message after '= ' so that it is not visible to a
Python-based spam filter but would display in a non-Python-based email
client.

----------------------------------------------------------------------

>Comment By: Georg Brandl (gbrandl)
Date: 2006-11-16 17:09

Message:
Logged In: YES
user_id=849994
Originator: NO

Thanks for the report, this is now fixed in rev. 52765, 52766 (2.5).

----------------------------------------------------------------------

Comment By: Wai Yip Tung (tungwaiyip)
Date: 2006-10-31 21:18

Message:
Logged In: YES
user_id=561546

The problem may come from binascii_a2b_qp() in binascii.c. It treats the
'= ' or '=\t' sequence as a soft line break. Such an interpretation appears
to have no basis. It could be a misinterpretation of RFC 2045:

-------------------------------------------------------------------
In particular, an "=" at the end of an encoded line, indicating a
soft line break (see rule #5) may follow one or more TAB (HT) or
SPACE characters.
-------------------------------------------------------------------

This passage reminds readers that they might find TAB or SPACE before an
"=", but not after it. "= " is plain illegal as far as I know.

----------------------------------------------------------------------
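
For illustration, the lenient handling that the RFC 2045 passage quoted in this report suggests (keep an illegal "=" and the character after it unchanged) might be sketched as follows. This is only a minimal sketch, not the code in quopri.py or binascii.c: the name lenient_qp_decode is hypothetical, it works on bytes, and it ignores details such as stripping trailing whitespace from encoded lines.

```python
import string

# Byte values of the characters accepted as hexadecimal digits
# (both cases, matching the "including abcdef" leniency in the RFC text).
HEX_DIGITS = frozenset(string.hexdigits.encode("ascii"))


def lenient_qp_decode(data: bytes) -> bytes:
    """Hypothetical helper: decode quoted-printable data, passing any
    illegal '=' sequence through to the output untouched."""
    out = bytearray()
    i = 0
    n = len(data)
    while i < n:
        byte = data[i]
        if byte != ord("="):
            # Ordinary byte: copy as-is.
            out.append(byte)
            i += 1
        elif data[i + 1:i + 3] == b"\r\n":
            # Soft line break: drop the '=' and the CRLF.
            i += 3
        elif data[i + 1:i + 2] == b"\n":
            # Bare LF after '=' is tolerated as a soft line break too
            # (an assumption of this sketch, not an RFC requirement).
            i += 2
        elif i + 2 < n and data[i + 1] in HEX_DIGITS and data[i + 2] in HEX_DIGITS:
            # '=XX' escape: emit the byte with that hexadecimal value.
            out.append(int(data[i + 1:i + 3], 16))
            i += 3
        else:
            # Illegal sequence such as '= ': keep the '=' literally.
            # The following character is copied on the next iteration.
            out.append(byte)
            i += 1
    return bytes(out)


if __name__ == "__main__":
    s = b"I say= a secret message\r\nThank you"
    print(lenient_qp_decode(s))
```

With this approach the text after '= ' stays in the decoded output, so a filter sees the same content that a lenient mail client would display.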