[issue46011] Python 3.10 email returns invalid Date: header unchanged.
Mark Sapiro added the comment: Upon further research I realized this is related to https://bugs.python.org/issue30681 and that while there are no message.defects the Date: header does have the InvalidDateDefect and its datetime attribute is None so I consider this resolved. -- stage: -> resolved ___ Python tracker <https://bugs.python.org/issue46011> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
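For anyone reading the archive later, the resolution above can be checked with a small sketch. It assumes a cut-down, hypothetical message containing only the offending Date: header; `header_datetime` is an illustrative helper, not part of the email package. It tolerates both behaviours described in this issue: the ValueError reported for Python 3.9, and the header with a `datetime` attribute of None reported for Python 3.10.

```python
from email import message_from_bytes, policy

# Hypothetical minimal message; only the Date: header matters here.
RAW = (b"From: user@example.com\n"
       b"Date: Tue, 30 Nov 1999 23:56:33 -3000 (CST)\n"
       b"\n"
       b"msg1\n")


def header_datetime(msg, name="date"):
    """Return the parsed datetime of a date header, or None if unusable."""
    try:
        header = msg[name]
    except ValueError:
        # Behaviour reported for Python 3.9: parsing the offset raises.
        return None
    # Behaviour reported for Python 3.10: header returned, datetime is None.
    return getattr(header, "datetime", None)


message = message_from_bytes(RAW, policy=policy.default)
print(header_datetime(message))
```

A caller that only needs a usable timestamp can treat None as "no valid date" without caring which Python version produced it.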
[issue46011] Python 3.10 email returns invalid Date: header unchanged.
New submission from Mark Sapiro:

Here is an interactive Python session:

```
Python 3.10.1 (main, Dec 7 2021, 15:44:39) [GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from email import message_from_bytes, policy
>>> msg_raw = b"""Return-Path:
... Delivered-To: mailman-us...@dinsdale.python.org
... From: u...@example.com
... Message-Id:
... Date: Tue, 30 Nov 1999 23:56:33 -3000 (CST)
... To: mailman-us...@python.org
...
... msg1
... """
>>> message = message_from_bytes(msg_raw, policy=policy.default)
>>> message.get('date')
'Tue, 30 Nov 1999 23:56:33 -3000 (CST)'
>>> message.defects
[]
>>>
```

The same session in Python 3.9 throws

ValueError: offset must be a timedelta strictly between -timedelta(hours=24) and timedelta(hours=24), not datetime.timedelta(days=-2, seconds=64800).

At first I thought this was related to https://bugs.python.org/issue30681, but that seems not to be the case, as utils.parsedate_to_datetime('Tue, 30 Nov 1999 23:56:33 -3000 (CST)') throws the same exception in Python 3.10.1.

I think getting the Date: header which has an invalid timezone should either throw the exception as before or return None, but not return the invalid date header.

--
components: email
keywords: 3.10regression
messages: 407997
nosy: barry, msapiro, r.david.murray
priority: normal
severity: normal
status: open
title: Python 3.10 email returns invalid Date: header unchanged.
versions: Python 3.10

Python tracker <https://bugs.python.org/issue46011>
[issue45921] codecs module doesn't support iso-8859-6-i, iso-8859-6-e, iso-8859-8-i or iso-8859-8-i
Mark Sapiro added the comment: The mailman-us...@python.org list received a post with the From: header containing a Hebrew display name RFC 2047 encoded with the iso-8859-8-i charset which threw a LookupError: unknown encoding: iso-8859-8-i exception in processing and shunted the message. The message body also had the charset declared as iso-8859-8-i although it contained only ascii. Unfortunately, I don't have the original message so I can't say what MUA created it or how common this usage is. I do think that just adding these as aliases for the non-annotated encodings is an appropriate response. -- ___ Python tracker <https://bugs.python.org/issue45921> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45921] codecs module doesn't support iso-8859-6-i, iso-8859-6-e, iso-8859-8-i or iso-8859-8-i
New submission from Mark Sapiro:

iso-8859-6-i, iso-8859-6-e, iso-8859-8-e and iso-8859-8-i are all IANA recognized character sets per https://www.iana.org/assignments/character-sets/character-sets.xhtml. These are all unrecognized by codecs.lookup().

--
components: Library (Lib)
messages: 407240
nosy: msapiro
priority: normal
severity: normal
status: open
title: codecs module doesn't support iso-8859-6-i, iso-8859-6-e, iso-8859-8-i or iso-8859-8-i
type: behavior
versions: Python 3.10, Python 3.6, Python 3.7, Python 3.8, Python 3.9

Python tracker <https://bugs.python.org/issue45921>
[issue44560] Unrecognized charset "eucgb2312_cn" in email header for many MUA
Change by Mark Sapiro : -- versions: +Python 3.7, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue44560> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue43996] Doc for mutable sequence pop() method implies argument is a slice or sequence.
Mark Sapiro added the comment: Thank you for the explanation which I understand and accept. I also fully (or maybe not quite fully) understand the use of square brackets to indicate optional arguments. It's just that in the context of the table at https://docs.python.org/3/library/stdtypes.html#mutable-sequence-types every other use of square brackets indicates a list or a slice and that's what confused me. Granted, all the other square bracket usage was not around a method argument, and I accept that the doc is correct, but I still found it confusing. -- ___ Python tracker <https://bugs.python.org/issue43996> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
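To make the distinction concrete, here is a short illustration of what the bracket notation means in practice. The list contents are arbitrary; the point is that the index is optional and must be an integer, not a list.

```python
s = ["a", "b", "c"]

last = s.pop()     # no argument: removes and returns the last item
first = s.pop(0)   # integer argument: removes and returns that index

try:
    s.pop([0])     # a literal list is not a valid index
except TypeError as exc:
    error = exc

print(last, first, s, type(error).__name__)
```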
[issue43996] Doc for mutable sequence pop() method implies argument is a slice or sequence.
New submission from Mark Sapiro:

In several places in the documentation including:

```
grep -rn 'pop.\[i\]'
Lib/pydoc_data/topics.py:13184: '| "s.pop([i])" | retrieves the item at *i* '
Lib/pydoc_data/topics.py:13647: '| "s.pop([i])" | retrieves the item at '
Doc/tutorial/datastructures.rst:47:.. method:: list.pop([i])
Doc/library/array.rst:193:.. method:: array.pop([i])
Doc/library/stdtypes.rst:1116:| ``s.pop([i])`` | retrieves the item at *i* and | \(2)|
```

the mutable sequence and array `pop()` method is documented as shown above in a way that implies the argument to `pop()` is a slice or sequence when it is actually just an integer. All those references should be `pop(i)` rather than `pop([i])`.

--
assignee: docs@python
components: Documentation
messages: 392551
nosy: docs@python, msapiro
priority: normal
severity: normal
status: open
title: Doc for mutable sequence pop() method implies argument is a slice or sequence.
type: behavior
versions: Python 3.10, Python 3.11, Python 3.8, Python 3.9

Python tracker <https://bugs.python.org/issue43996>
[issue42054] email message get_content throws KeyError for content main types font and model
New submission from Mark Sapiro:

With Policy = email.policy.default, there are handlers for get_content() only for content types 'text', 'audio', 'image', 'video', 'application', 'message/rfc822', 'message/external-body' and 'message'. While these are the only main types listed in RFC 6838, RFC 8081 adds 'font' and RFC 2077 defines 'model', and there are several registered 'font' and 'model' types at https://www.iana.org/assignments/media-types/media-types.xhtml.

It would be good if get_content() returned content, even if only raw bytes, for those types.

--
messages: 378738
nosy: msapiro
priority: normal
severity: normal
status: open
title: email message get_content throws KeyError for content main types font and model
versions: Python 3.10, Python 3.6, Python 3.7, Python 3.8, Python 3.9

Python tracker <https://bugs.python.org/issue42054>
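In the meantime, a caller can fall back to the decoded payload when no handler exists. This is a sketch only: `get_content_or_bytes` is a hypothetical helper, and the font/woff part below is a made-up four-byte example rather than a real font.

```python
from email import message_from_bytes, policy


def get_content_or_bytes(part):
    """Return get_content() if a handler exists, else the raw decoded bytes."""
    try:
        return part.get_content()
    except KeyError:
        # No handler registered for this main type (e.g. font, model).
        return part.get_payload(decode=True)


# Hypothetical part: the base64 body decodes to the four bytes b"wOFF".
RAW = (b"Content-Type: font/woff\n"
       b"Content-Transfer-Encoding: base64\n"
       b"\n"
       b"d09GRg==\n")

part = message_from_bytes(RAW, policy=policy.default)
data = get_content_or_bytes(part)
print(data)
```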
[issue27321] Email parser creates a message object that can't be flattened
Mark Sapiro added the comment:

I work around it with

```
import email.message
import email.utils


class Message(email.message.Message):
    def as_string(self):
        # Work around for https://bugs.python.org/issue27321 and
        # https://bugs.python.org/issue32330.
        try:
            value = email.message.Message.as_string(self)
        except (KeyError, LookupError, UnicodeEncodeError):
            value = email.message.Message.as_bytes(self).decode(
                'ascii', 'replace')
        # Also ensure no unicode surrogates in the returned string.
        return email.utils._sanitize(value)
```

This is easy for me because it's Mailman which already subclasses email.message.Message for other reasons. It is perhaps more difficult if you aren't already subclassing email.message.Message for other purposes.

--

Python tracker <https://bugs.python.org/issue27321>
[issue40597] generated email message exceeds RFC-mandated limit of 998 characters
Change by Mark Sapiro : -- pull_requests: +19786 stage: resolved -> patch review pull_request: https://github.com/python/cpython/pull/20542 ___ Python tracker <https://bugs.python.org/issue40597> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue40597] generated email message exceeds RFC-mandated limit of 998 characters
Mark Sapiro added the comment:

With the fix in PR 20038, committed at https://github.com/python/cpython/commit/6f2f475d5a2cd7675dce844f3af436ba919ef92b it is no longer possible to set_content(''). Attempts to do so produce the following

```
  File "/var/MM/3/hk_39/hyperkitty/.tox/py39-django30/lib/python3.9/site-packages/django_mailman3/lib/scrub.py", line 95, in _get_all_attachments
    part.set_content('')
  File "/usr/local/lib/python3.9/email/message.py", line 1171, in set_content
    super().set_content(*args, **kw)
  File "/usr/local/lib/python3.9/email/message.py", line 1101, in set_content
    content_manager.set_content(self, *args, **kw)
  File "/usr/local/lib/python3.9/email/contentmanager.py", line 37, in set_content
    handler(msg, obj, *args, **kw)
  File "/usr/local/lib/python3.9/email/contentmanager.py", line 185, in set_text_content
    cte, payload = _encode_text(string, charset, cte, msg.policy)
  File "/usr/local/lib/python3.9/email/contentmanager.py", line 149, in _encode_text
    if max(len(x) for x in lines) <= policy.max_line_length:
ValueError: max() arg is an empty sequence
```

--
nosy: +msapiro

Python tracker <https://bugs.python.org/issue40597>
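The last frame of the traceback is the whole story: an empty string splits into zero lines, and max() over an empty sequence raises. One way to guard such a check, shown here as a standalone sketch and not necessarily the fix that was adopted upstream, is the `default` argument of max(). `longest_line` is a hypothetical helper name.

```python
def longest_line(data: bytes) -> int:
    """Length of the longest line in data, or 0 when there are no lines."""
    lines = data.splitlines()
    # default=0 avoids ValueError when data is empty and lines == [].
    return max((len(x) for x in lines), default=0)


print(longest_line(b""))
print(longest_line(b"ab\ncdef\n"))
```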
[issue39384] Email parser creates a message object that can't be flattened as bytes.
Mark Sapiro added the comment:

I've researched this further, and I know how this happens. The original message contains a text/html part (in my case, the only part) which contains a base64 or quoted-printable body which when decoded contains non-ascii. It is parsed correctly by email.message_from_bytes. It is then processed by Mailman's content filtering which retrieves the html payload via

    part.get_payload(decode=True).decode(ctype, errors='replace')

where part is the text/html part and ctype is 'utf-8' in this case. It then uses elinks, lynx or some other configured command to convert the html payload to plain text, and that plain text still contains non-ascii. It then replaces the payload and sets the content type via

    del part['content-transfer-encoding']
    part.set_payload(plain_text)
    part.set_type('text/plain')

and this results in a message which can't be flattened as_bytes. The issue is set_payload() should encode the payload appropriately and in fact, it does if an appropriate charset is given, so this is our error in not providing a charset= argument to set_payload. Closing this and the corresponding PR.

--
stage: patch review -> resolved
status: open -> closed

Python tracker <https://bugs.python.org/issue39384>
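For the record, the corrected sequence looks roughly like the sketch below. The message is a hypothetical one-part text/html example, and the string replacement stands in for the elinks/lynx conversion; the only change from the steps described above is the charset argument to set_payload().

```python
import email

# Hypothetical text/html part whose decoded body contains non-ascii.
RAW = (b'Content-Type: text/html; charset="utf-8"\n'
       b"Content-Transfer-Encoding: quoted-printable\n"
       b"\n"
       b"<p>caf=C3=A9</p>\n")

part = email.message_from_bytes(RAW)
html = part.get_payload(decode=True).decode("utf-8", errors="replace")

# Stand-in for the external html-to-text command.
plain_text = html.replace("<p>", "").replace("</p>", "")

del part["content-transfer-encoding"]
# Passing charset lets set_payload() encode the text and set a matching
# Content-Transfer-Encoding, so the result can be flattened as bytes.
part.set_payload(plain_text, charset="utf-8")
part.set_type("text/plain")

flattened = part.as_bytes()
print(flattened.decode("ascii"))
```

With the charset supplied, the payload is stored in an encoded form and the flattened message is plain ASCII on the wire.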
[issue39384] Email parser creates a message object that can't be flattened as bytes.
Mark Sapiro added the comment: Other Mailman3 installations are also encountering this issue. See https://lists.mailman3.org/archives/list/mailman-us...@mailman3.org/message/VQZORIDL5PNQ4W33KIMVTFTANSGZD46S/ -- ___ Python tracker <https://bugs.python.org/issue39384> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39384] Email parser creates a message object that can't be flattened as bytes.
Mark Sapiro added the comment:

This came about because of an actual situation in a Mailman 3 installation. I can't say for sure what the actual original message looked like, but it was received by Mailman's LMTP server and parsed with email.message_from_bytes(), so it clearly wasn't exactly like the message excerpt I posted in the report above. However, all I had to go by was the message object from the shunted pickle file created as a result of the exception.

The message was processed by Mailman, but when Mailman's handler pipeline attempted to save it for the digest, it called an instance of mailbox.MMDF to add the message to the mailbox accumulating messages for the digest, and that in turn called the flatten method of an email.generator.BytesGenerator instance, and that's where the exception was thrown.

Perhaps the suggested patch in https://github.com/python/cpython/pull/18056 doesn't address every possible case, and it can result in a slightly garbled message due to replacing 'invalid' characters, but in my case at least, it is much preferable to the alternative.

--

Python tracker <https://bugs.python.org/issue39384>
[issue27321] Email parser creates a message object that can't be flattened
Change by Mark Sapiro : -- pull_requests: +17467 pull_request: https://github.com/python/cpython/pull/18074 ___ Python tracker <https://bugs.python.org/issue27321> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue32330] Email parser creates a message object that can't be flattened
Change by Mark Sapiro : -- keywords: +patch pull_requests: +17453 stage: -> patch review pull_request: https://github.com/python/cpython/pull/18059 ___ Python tracker <https://bugs.python.org/issue32330> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue32330] Email parser creates a message object that can't be flattened
Change by Mark Sapiro : -- versions: +Python 3.7, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue32330> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39384] Email parser creates a message object that can't be flattened as bytes.
Change by Mark Sapiro : -- versions: +Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue39384> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39384] Email parser creates a message object that can't be flattened as bytes.
New submission from Mark Sapiro:

This is similar to https://bugs.python.org/issue32330 but is the opposite behavior. In that issue, the message couldn't be flattened as a string but could be flattened as bytes. Here, the message can be flattened as a string but can't be flattened as bytes.

The original message was created by an arguably defective email client that quoted a message containing a utf8 encoded RIGHT SINGLE QUOTATION MARK and utf-8 encoded separately the three bytes resulting in `â**` instead of `’`. That's not really relevant but is just to show how such a message can be generated.

The following interactive python session shows the issue.

```
>>> import email
>>> msg = email.message_from_string("""From u...@example.com Sat Jan 18 04:09:40 2020
... From: u...@example.com
... To: re...@example.com
... Subject: Century Dates for Insurance purposes
... Date: Fri, 17 Jan 2020 20:09:26 -0800
... Message-ID: <75ccdd72-d71c-407c-96bd-0ca95abcf...@email.android.com>
... MIME-Version: 1.0
... Content-Type: text/plain; charset="utf-8"
... Content-Transfer-Encoding: 8bit
...
... Thursday-Monday will cover both days of staging and then storing goods
... post-century. I think thatâ**s the way to go.
...
... """)
>>> msg.as_bytes()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/email/message.py", line 178, in as_bytes
    g.flatten(self, unixfrom=unixfrom)
  File "/usr/local/lib/python3.7/email/generator.py", line 116, in flatten
    self._write(msg)
  File "/usr/local/lib/python3.7/email/generator.py", line 181, in _write
    self._dispatch(msg)
  File "/usr/local/lib/python3.7/email/generator.py", line 214, in _dispatch
    meth(msg)
  File "/usr/local/lib/python3.7/email/generator.py", line 432, in _handle_text
    super(BytesGenerator,self)._handle_text(msg)
  File "/usr/local/lib/python3.7/email/generator.py", line 249, in _handle_text
    self._write_lines(payload)
  File "/usr/local/lib/python3.7/email/generator.py", line 155, in _write_lines
    self.write(line)
  File "/usr/local/lib/python3.7/email/generator.py", line 406, in write
    self._fp.write(s.encode('ascii', 'surrogateescape'))
UnicodeEncodeError: 'ascii' codec can't encode character '\xe2' in position 33: ordinal not in range(128)
>>>
```

--
components: email
messages: 360249
nosy: barry, msapiro, r.david.murray
priority: normal
severity: normal
status: open
title: Email parser creates a message object that can't be flattened as bytes.
versions: Python 3.5, Python 3.6, Python 3.7

Python tracker <https://bugs.python.org/issue39384>
[issue37919] nntplib throws spurious NNTPProtocolError
New submission from Mark Sapiro:

This is really due to an nntp server bug, but here's the scenario. A connection is opened to the server. An article is posted via the connection's post() method. The server responds to the article data with 240 Article posted but due to the server bug, if the message-id is long, this response comes on two lines as 240 Article posted

The post() method reads only the first line and returns it. Then the connection's quit() method (or some other method) is called, and it sees the second line of the prior response as the server's response rather than the actual response, and raises NNTPProtocolError.

Arguably, NNTPProtocolError is appropriate in this scenario, but if so, it should be raised by the post() method and not by a subsequent method.

--
components: Library (Lib)
messages: 350214
nosy: msapiro
priority: normal
severity: normal
status: open
title: nntplib throws spurious NNTPProtocolError
versions: Python 2.7, Python 3.5, Python 3.6, Python 3.7, Python 3.8, Python 3.9

Python tracker <https://bugs.python.org/issue37919>
[issue36910] Certain Malformed email causes email.parser to throw AttributeError
Mark Sapiro added the comment: I do intend to submit a PR. I haven't yet worked it out though. -- ___ Python tracker <https://bugs.python.org/issue36910> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue36910] Certain Malformed email causes email.parser to throw AttributeError
New submission from Mark Sapiro:

The code in the attached parse_bug.py file when run with Python 3.5, 3.6 or 3.7 throws AttributeError with this traceback:

```
Traceback (most recent call last):
  File "parse_bug.py", line 9, in <module>
    """)
  File "/usr/local/lib/python3.7/email/parser.py", line 124, in parsebytes
    return self.parser.parsestr(text, headersonly)
  File "/usr/local/lib/python3.7/email/parser.py", line 68, in parsestr
    return self.parse(StringIO(text), headersonly=headersonly)
  File "/usr/local/lib/python3.7/email/parser.py", line 58, in parse
    return feedparser.close()
  File "/usr/local/lib/python3.7/email/feedparser.py", line 187, in close
    self._call_parse()
  File "/usr/local/lib/python3.7/email/feedparser.py", line 180, in _call_parse
    self._parse()
  File "/usr/local/lib/python3.7/email/feedparser.py", line 323, in _parsegen
    if (self._cur.get('content-transfer-encoding', '8bit').lower()
AttributeError: 'Header' object has no attribute 'lower'
```

The triggering condition appears to be the Content-Transfer-Encoding: header with a non-ascii character in the headers of a multipart part. The parser should probably throw email.errors.HeaderParseError with a MalformedHeaderDefect in this case rather than AttributeError.

While arguably code should defend against unanticipated exceptions, the fact that such an exception can be thrown while parsing an arbitrary message could be considered a security issue.

--
components: email
files: parse_bug.py
messages: 342415
nosy: barry, msapiro, r.david.murray
priority: normal
severity: normal
status: open
title: Certain Malformed email causes email.parser to throw AttributeError
type: behavior
versions: Python 3.5, Python 3.6, Python 3.7
Added file: https://bugs.python.org/file48330/parse_bug.py

Python tracker <https://bugs.python.org/issue36910>
[issue34155] email.utils.parseaddr mistakenly parse an email
Mark Sapiro added the comment: I agree that my example with an @ in the 'display name', although actually seen in the wild, is non-compliant, and that the behavior of parseaddr() in this case is not a bug. Sorry for the noise. -- ___ Python tracker <https://bugs.python.org/issue34155> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue34155] email.utils.parseaddr mistakenly parse an email
Mark Sapiro added the comment:

The issue is illustrated much more simply as follows:

    email.utils.parseaddr('John Doe j...@example.com <ot...@example.net>')

returns

    ('', 'John Doe j...@example.com')

whereas it should return

    ('John Doe j...@example.com', 'ot...@example.net')

I'll look at developing a patch.

--
nosy: +msapiro

Python tracker <https://bugs.python.org/issue34155>
[issue32330] Email parser creates a message object that can't be flattened
Mark Sapiro <m...@msapiro.net> added the comment: > I do wonder where you are using the string version of messages :) Probably some places where we could use bytes, but one of the problem areas is where we save the content of a message held for moderation. -- ___ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue32330> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue32330] Email parser creates a message object that can't be flattened
Mark Sapiro <m...@msapiro.net> added the comment:

Yes. I think errors=replace is a good solution.

In Mailman, we have our own mailman.email.message.Message class which is a subclass of email.message.Message and what we do to work around this and issue27321 is override as_string() with:

    def as_string(self):
        # Work around for https://bugs.python.org/issue27321 and
        # https://bugs.python.org/issue32330.
        try:
            value = email.message.Message.as_string(self)
        except (KeyError, UnicodeEncodeError):
            value = email.message.Message.as_bytes(self).decode(
                'ascii', 'replace')
        return value

--

Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue32330>
[issue32330] Email parser creates a message object that can't be flattened
New submission from Mark Sapiro <m...@msapiro.net>:

This is related to https://bugs.python.org/issue27321 but a different exception is thrown for a different reason.

This is caused by a defective spam message. I don't actually have the offending message from the wild, but the attached bad_email_2.eml illustrates the problem. The defect is that the message declares the content charset as us-ascii, but the body contains non-ascii. When the message is parsed into an email.message.Message object and the object's as_string() method is called, UnicodeEncodeError is thrown as follows:

    >>> import email
    >>> with open('bad_email_2.eml', 'rb') as fp:
    ...     msg = email.message_from_binary_file(fp)
    ...
    >>> msg.as_string()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3.5/email/message.py", line 159, in as_string
        g.flatten(self, unixfrom=unixfrom)
      File "/usr/lib/python3.5/email/generator.py", line 115, in flatten
        self._write(msg)
      File "/usr/lib/python3.5/email/generator.py", line 181, in _write
        self._dispatch(msg)
      File "/usr/lib/python3.5/email/generator.py", line 214, in _dispatch
        meth(msg)
      File "/usr/lib/python3.5/email/generator.py", line 243, in _handle_text
        msg.set_payload(payload, charset)
      File "/usr/lib/python3.5/email/message.py", line 316, in set_payload
        payload = payload.encode(charset.output_charset)
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 31-33: ordinal not in range(128)

--
components: email
files: bad_email_2.eml
messages: 308353
nosy: barry, msapiro, r.david.murray
priority: normal
severity: normal
status: open
title: Email parser creates a message object that can't be flattened
type: behavior
versions: Python 3.5, Python 3.6
Added file: https://bugs.python.org/file47333/bad_email_2.eml

Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue32330>
[issue32144] email.policy.SMTP and SMTPUTF8 doesn't honor linesep's value
Change by Mark Sapiro <m...@msapiro.net>: -- nosy: +msapiro ___ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue32144> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue27321] Email parser creates a message object that can't be flattened
Mark Sapiro added the comment: It looks like Johannes beat me to it. Thanks for that, but see my comments in the diff at https://github.com/kyrias/cpython/commit/a986a8274a522c73d87360da6930e632a3eb4ebb -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue27321> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue27321] Email parser creates a message object that can't be flattened
Mark Sapiro added the comment: I considered look before you leap, but I decided since we're munging the headers anyway, preserving their order is not that critical, but the patch is easy enough. I'll work on that and a test. -- ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue27321> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue27321] Email parser creates a message object that can't be flattened
Mark Sapiro added the comment: One additional observation. The original message contained no Content-Transfer-Encoding header even though the message body was raw koi8-r characters. Adding Content-Transfer-Encoding: 8bit to the message headers avoids the issue, but that is not a practical solution as the message was Russian spam received by a Mailman list and the resultant KeyError caused problems in Mailman. We can work on defending against this in Mailman, but I suggest that the munge_cte code in generator._write() avoid the documented possible KeyError raised by replace_header() by using __delitem__() and __setitem__() instead as in the attached generator.patch. -- keywords: +patch Added file: http://bugs.python.org/file43394/generator.patch ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue27321> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
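The difference between the two approaches is easy to see in isolation. This sketch is not the attached generator.patch; `set_single_header` is a hypothetical helper that only illustrates why deleting and then setting cannot raise KeyError when the header is absent.

```python
from email.message import Message


def set_single_header(msg, name, value):
    """Set a header whether or not it already exists."""
    del msg[name]      # does nothing if the header is missing
    msg[name] = value  # appends the header at the end


msg = Message()

# replace_header() requires the header to exist already.
try:
    msg.replace_header("Content-Transfer-Encoding", "8bit")
except KeyError:
    replaced = False
else:
    replaced = True

set_single_header(msg, "Content-Transfer-Encoding", "8bit")
print(replaced, msg["Content-Transfer-Encoding"])
```

The trade-off is header order: an existing header that is deleted and re-added moves to the end, whereas replace_header() keeps its position.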
[issue27321] Email parser creates a message object that can't be flattened
New submission from Mark Sapiro:

The attached file, bad_email, can be parsed via

    msg = email.message_from_binary_file(open('bad_email', 'rb'))

but then msg.as_string() produces the following:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3.5/email/message.py", line 159, in as_string
        g.flatten(self, unixfrom=unixfrom)
      File "/usr/lib/python3.5/email/generator.py", line 115, in flatten
        self._write(msg)
      File "/usr/lib/python3.5/email/generator.py", line 189, in _write
        msg.replace_header('content-transfer-encoding', munge_cte[0])
      File "/usr/lib/python3.5/email/message.py", line 559, in replace_header
        raise KeyError(_name)
    KeyError: 'content-transfer-encoding'

--
components: email
files: bad_email
messages: 268580
nosy: barry, msapiro, r.david.murray
priority: normal
severity: normal
status: open
title: Email parser creates a message object that can't be flattened
versions: Python 3.4, Python 3.5
Added file: http://bugs.python.org/file43391/bad_email

Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue27321>
[issue26686] email.parser stops parsing headers too soon.
Mark Sapiro added the comment: Added Python 2.7 to versions: -- versions: +Python 2.7 ___ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue26686> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue26686] email.parser stops parsing headers too soon.
New submission from Mark Sapiro:

Given an admittedly defective (the folded Content-Type: isn't indented) message part with the following headers/body

---
Content-Disposition: inline; filename="04EBD_._A546BB.zip"
Content-Type: application/x-rar-compressed; x-unix-mode=0600;
name="04EBD_._A546BB.zip"
Content-Transfer-Encoding: base64

UmFyIRoHAM+QcwAADQBKRXQgkC4ApAMAAEAHAAACJLrQXYFUfkgdMwkAIGEw
ZjEwZi5qcwDwrrI/DB2NDI0TzcGb3Gpb8HzsS0UlpwELvdyWnVaBQt7Sl2zbJpx1qqFCGGk6
...
---

email.parser parses the headers as

---
Content-Disposition: inline; filename="04EBD_._A546BB.zip"
Content-Type: application/x-rar-compressed; x-unix-mode=0600;
---

and the body as

---
name="04EBD_._A546BB.zip"
Content-Transfer-Encoding: base64

UmFyIRoHAM+QcwAADQBKRXQgkC4ApAMAAEAHAAACJLrQXYFUfkgdMwkAIGEw
ZjEwZi5qcwDwrrI/DB2NDI0TzcGb3Gpb8HzsS0UlpwELvdyWnVaBQt7Sl2zbJpx1qqFCGGk6
...
---

and shows no defects.

This is wrong. RFC5322 section 2.1 is clear that everything up to the first empty line is headers. Even the docstring in the email/parser.py module says "The header block is terminated either by the end of the string or by a blank line."

Since the message is defective, it isn't clear what the correct result should be, but I think

Headers:
Content-Disposition: inline; filename="04EBD_._A546BB.zip"
Content-Type: application/x-rar-compressed; x-unix-mode=0600;
Content-Transfer-Encoding: base64

Body:
UmFyIRoHAM+QcwAADQBKRXQgkC4ApAMAAEAHAAACJLrQXYFUfkgdMwkAIGEw
ZjEwZi5qcwDwrrI/DB2NDI0TzcGb3Gpb8HzsS0UlpwELvdyWnVaBQt7Sl2zbJpx1qqFCGGk6
...

Defects:
name="04EBD_._A546BB.zip"

would be more appropriate.

The problem is that the Content-Transfer-Encoding: base64 header is not in the headers so that get_payload(decode=True) doesn't decode the base64 encoded body making malware recognition difficult.

--
components: Library (Lib)
messages: 262750
nosy: msapiro
priority: normal
severity: normal
status: open
title: email.parser stops parsing headers too soon.
type: behavior
versions: Python 3.4
___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26686>
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
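The behavior described in this report can be checked with a short script. The following is only a sketch: the message text is a hypothetical, shortened stand-in for the one in the report (the file name and base64 payload are made up), and the defects list is printed without assuming its content, since that may differ between Python versions.

```python
from email import message_from_string

# Hypothetical sample modelled on the report: the 'name=...' line is not
# indented, so it is not a valid continuation of the Content-Type: header.
RAW = (
    'Content-Disposition: inline; filename="sample.zip"\n'
    'Content-Type: application/x-rar-compressed; x-unix-mode=0600;\n'
    'name="sample.zip"\n'
    'Content-Transfer-Encoding: base64\n'
    '\n'
    'aGVsbG8gd29ybGQ=\n'
)

msg = message_from_string(RAW)

header_names = msg.keys()                  # headers the parser accepted
cte = msg['Content-Transfer-Encoding']     # missing header yields None
body = msg.get_payload()                   # undecoded payload as a string

print(header_names)
print(cte)
print(body.splitlines()[0])
print(msg.defects)
```

Because Content-Transfer-Encoding: ends up in the payload rather than in the headers, code that relies on get_payload(decode=True) has no encoding to act on, which is the practical problem the report points out.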
[issue1409460] email.Utils.parseaddr() gives arcane result
Mark Sapiro m...@msapiro.net added the comment:

parsing 'merwok'
expected ('merwok', '')
got ('', 'merwok')

I think ('', 'merwok') is the correct result. I think most if not all MUAs/MTAs will interpret an address without an '@', albeit invalid, as a local-part in the local domain, thus parsing 'merwok' as the address 'merwok' with no real name is probably the right thing to do with this input. The alternative would be to return ('', '') indicating failure.

parsing 'merwok w...@rusty'
expected ('', 'w...@rusty')
got ('', 'merwok...@rusty')

Here, I think failure is a more appropriate return.

In any case, I think this is a new bug deserving of a new report. It is not really relevant to this issue which has to do with nested parentheses.

--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1409460
___
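For reference, the first case in this comment can be reproduced with the current spelling of the function, email.utils.parseaddr. This is a minimal sketch; the second case is left out because its input address is truncated in the archive.

```python
from email.utils import parseaddr

# A bare word with no '@' is returned as the address part, with an empty
# real name, which is the result the comment argues is correct.
result = parseaddr('merwok')
print(result)
```

parseaddr returns a (realname, address) pair and signals a failed parse with ('', ''), so a caller that wants to reject addresses without a domain has to check for the '@' itself.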
[issue5713] smtplib gets out of sync if server returns a 421 status
Mark Sapiro m...@msapiro.net added the comment:

I'm not completely sure about this, but here are my thoughts.

In the scenarios I've seen, the 421 reply/disconnect only occurs in response to a RCPT which has an invalid address and follows several prior refused RCPTs. In this case, I think the proper action is to close the connection and raise SMTPRecipientsRefused, returning a dictionary with the actual responses for the refused RCPTs prior to the 421 and the 421 response only for the RCPT that produced it.

If the 421 comes at another time, I think the current process does the right thing. It will raise the appropriate exception if it gets the chance. It just needs to be sure that if the response was 421, it does self.close() instead of self.rset().

I have attached a patch against the 2.6.1 smtplib.py which I think does the right thing. I haven't tested this at all, but I think it should work.

The documentation may need to be updated to emphasize that even though not all recipients are listed in the dictionary returned with the SMTPRecipientsRefused exception, no one got the mail.

--
keywords: +patch
Added file: http://bugs.python.org/file15183/smtplib.patch
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue5713
___
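The control flow proposed in this comment might look roughly like the sketch below. It is not the attached patch: rcpt_all and StubServer are hypothetical names introduced only for illustration, and the stub stands in for a server that answers 421 after a few rejects (two here, rather than Postfix's twenty) so that nothing touches the network.

```python
import smtplib


def rcpt_all(server, recipients):
    """Send RCPT for each recipient; on a 421 reply, close and raise.

    `server` is any object with smtplib.SMTP-like rcpt() and close()
    methods. Returns the dict of refused recipients if no 421 was seen.
    """
    refused = {}
    for addr in recipients:
        code, resp = server.rcpt(addr)
        if code not in (250, 251):
            refused[addr] = (code, resp)
        if code == 421:
            # The server is disconnecting: close instead of sending RSET,
            # and hand the caller every response collected so far.
            server.close()
            raise smtplib.SMTPRecipientsRefused(refused)
    return refused


class StubServer:
    """Hypothetical stand-in for a server that gives up after two rejects."""

    def __init__(self):
        self.rejects = 0
        self.closed = False

    def rcpt(self, addr):
        if addr.startswith("bad"):
            self.rejects += 1
            if self.rejects > 2:
                return 421, b"Too many errors"
            return 550, b"User unknown"
        return 250, b"OK"

    def close(self):
        self.closed = True


stub = StubServer()
try:
    rcpt_all(stub, ["ok@example.com", "bad1@example.com",
                    "bad2@example.com", "bad3@example.com",
                    "bad4@example.com"])
except smtplib.SMTPRecipientsRefused as exc:
    refused = exc.recipients
    print(sorted(refused))
```

The recipient after the 421 never gets a RCPT, so it is absent from the dictionary, which is why the comment suggests documenting that no one got the mail even though not every recipient is listed.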
[issue5713] smtplib gets out of sync if server returns a 421 status
New submission from Mark Sapiro m...@msapiro.net:

RFC821 upon which smtplib was originally based does not define a 421 status code and implies the server should only disconnect in response to a QUIT command. Subsequent extensions in RFC2821 (and now RFC5321) define situations under which the server may return a 421 status and disconnect. This leads to the following problem.

An smtplib.SMTP() instance is created and its sendmail() method is called with a list of recipients which contains several invalid, local addresses. sendmail() processes the recipient list, calling the rcpt() method for each. Some of these may be accepted with a 250 or 251 status and some may be rejected with a 550 or other status. The rejects are kept in a dictionary to be eventually returned as the sendmail() result.

However, with the Postfix server at least, after 20 rejects, the server sends a 421 Too many errors reply and disconnects, but sendmail continues to process and this results in raising SMTPServerDisconnected(Connection unexpectedly closed) and the response dictionary containing the invalid addresses and their responses is lost.

The caller may see the exception as retryable and may retry the send after some delay, but since the caller has received no information about the invalid addresses, it sends the same recipient list and the scenario repeats.

--
components: Library (Lib)
messages: 85666
nosy: msapiro
severity: normal
status: open
title: smtplib gets out of sync if server returns a 421 status
type: behavior
versions: Python 2.4, Python 2.5, Python 2.6
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue5713
___
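To illustrate why the lost dictionary matters, here is a sketch of what a caller could do if it did receive the per-recipient responses. recipients_for_retry is a hypothetical helper, and dropping only 5xx refusals is an assumed policy, not something the report prescribes.

```python
import smtplib


def recipients_for_retry(recipients, refused):
    """Return the recipients that were not permanently (5xx) refused.

    `refused` maps an address to a (code, response) pair, the shape
    carried by smtplib.SMTPRecipientsRefused.recipients.
    """
    keep = []
    for addr in recipients:
        code = refused.get(addr, (None, b""))[0]
        if code is not None and 500 <= code < 600:
            continue  # permanent failure: retrying would repeat the reject
        keep.append(addr)
    return keep


# Simulated exception, as a caller might catch it from sendmail().
exc = smtplib.SMTPRecipientsRefused({
    "bad1@example.com": (550, b"User unknown"),
    "bad2@example.com": (421, b"Too many errors"),
})
everyone = ["ok@example.com", "bad1@example.com", "bad2@example.com"]
retry_list = recipients_for_retry(everyone, exc.recipients)
print(retry_list)
```

With only SMTPServerDisconnected to go on, the caller has no such dictionary and can only resend the full list, which reproduces the loop described in the report.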
[issue5277] email message.get_params() and related methods sometimes fail.
New submission from Mark Sapiro m...@msapiro.net:

The message method get_params() and the related get_param() and get_filename() do not properly decode an RFC 2231 encoded parameter such as the following:

Content-Disposition: inline;
 filename*0="Re: [Mailman-Users] Messages shunted with \"TypeError: ";
 filename*1="decodingUnicode is not supported\".eml"

This is because the message helper function _parseparams() mistakenly thinks the second semicolon is inside a quoted string because it counts the quoted (escaped) quote and sees an odd number.

The attached patch will fix this.

--
components: Library (Lib)
files: message.patch
keywords: patch
messages: 82215
nosy: barry, msapiro
severity: normal
status: open
title: email message.get_params() and related methods sometimes fail.
type: behavior
versions: Python 2.4, Python 2.5, Python 2.6, Python 3.0, Python 3.1
Added file: http://bugs.python.org/file13105/message.patch
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue5277
___
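The quote-counting problem can be shown without the email package. The sketch below is not the attached patch and not the library's implementation: split_params is a hypothetical helper, and VALUE is a made-up, shortened header value in which each parameter contains one backslash-escaped quote.

```python
def split_params(value):
    """Split a header value on semicolons that are outside quoted strings.

    A backslash inside a quoted string escapes the next character, so an
    escaped quote does not open or close the string.
    """
    parts = []
    current = []
    in_quotes = False
    escaped = False
    for ch in value:
        if escaped:
            current.append(ch)
            escaped = False
        elif ch == "\\" and in_quotes:
            current.append(ch)
            escaped = True
        elif ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == ";" and not in_quotes:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


# Hypothetical value with one escaped quote in each parameter.
VALUE = r'inline; filename*0="a \"b: "; filename*1="c\".eml"'

# Naive check: count every quote before the second semicolon.
second_semicolon = VALUE.index(";", VALUE.index(";") + 1)
naive_count = VALUE[:second_semicolon].count('"')

params = split_params(VALUE)
print(naive_count)
print(params)
```

The naive count before the second semicolon is odd because the escaped quote is included, which is how a parser can wrongly conclude the semicolon sits inside a quoted string; skipping escaped characters yields the three expected pieces.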
[issue4279] Module 'parser' fails to build
Mark Sapiro m...@msapiro.net added the comment:

This problem also occurs when building the 2.6.1 parser module on Cygwin 1.5.25. It did not occur with Python 2.6 or 2.5.x.

The error from 'make' is

building 'parser' extension
gcc -shared -Wl,--enable-auto-image-base build/temp.cygwin-1.5.25-i686-2.6/cygdrive/c/Python_dist/Python-2.6.1/Modules/parsermodule.o -L/usr/local/lib -L. -lpython2.6 -o build/lib.cygwin-1.5.25-i686-2.6/parser.dll
build/temp.cygwin-1.5.25-i686-2.6/cygdrive/c/Python_dist/Python-2.6.1/Modules/parsermodule.o: In function `parser_expr':
/cygdrive/c/Python_dist/Python-2.6.1/Modules/parsermodule.c:552: undefined reference to `__PyParser_Grammar'
build/temp.cygwin-1.5.25-i686-2.6/cygdrive/c/Python_dist/Python-2.6.1/Modules/parsermodule.o: In function `parser_suite':
/cygdrive/c/Python_dist/Python-2.6.1/Modules/parsermodule.c:552: undefined reference to `__PyParser_Grammar'
collect2: ld returned 1 exit status

I was able to work around the error and build a parser module that passed unit test by manually running

gcc -shared -Wl,--enable-auto-image-base build/temp.cygwin-1.5.25-i686-2.6/cygdrive/c/Python_dist/Python-2.6.1/Modules/parsermodule.o Python/graminit.o -L/usr/local/lib -L. -lpython2.6 -o build/lib.cygwin-1.5.25-i686-2.6/parser.dll

i.e. by including Python/graminit.o in the explicit object files to load.

I have also confirmed that applying the parser-grammar.patch from #4288 will allow make to successfully build a parser module that passes unit tests.

--
nosy: +msapiro
versions: +Python 2.6
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4279
___
[issue4288] parsermodule and grammar variable
Changes by Mark Sapiro m...@msapiro.net:

--
nosy: +msapiro
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4288
___
[issue4789] Documentation changes break existing URIs
Mark Sapiro m...@msapiro.net added the comment:

Thank you for adding the redirects, and for getting them right in spite of my garbling some of them in the original report. I have updated the links for the next Mailman release.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4789
___
[issue4789] Documentation changes break existing URIs
New submission from Mark Sapiro m...@msapiro.net:

The Mailman GUI contains a few links to the python.org documentation which are now broken. These links and the current equivalents are:

http://www.python.org/doc/ works, but could map to http://docs.python.org/
http://www.python.org/doc/current/ works, but could map to http://docs.python.org/
http://www.python.org/doc/current/lib/ - http://docs.python.org/library/
http://www.python.org/doc/current/lib/module-re.htm - http://docs.python.org/library/re.html
http://www.python.org/doc/current/lib/re-syntax - http://docs.python.org/library/re.html#regular-expression-syntax
http://www.python.org/doc/current/lib/typesseq-strings.html - http://docs.python.org/library/stdtypes.html#string-formatting-operations

It would be really cool if these old URIs could redirect to the new ones.

--
assignee: georg.brandl
components: Documentation
messages: 78583
nosy: barry, georg.brandl, msapiro
severity: normal
status: open
title: Documentation changes break existing URIs
type: behavior
versions: Python 2.6
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4789
___