bpoaugust added the comment:
Sorry, I think '' is not valid, as spaces are not allowed between
words.
However I am not seeing the original unfolded source if there is an error,
unless I am misunderstanding the API.
For example:
--- cut here ---
import email.header
import email.utils
R. David Murray added the comment:
The general idea is that the string version of the header should contain all of
the original information, but the parsed elements (the things returned by
special header attributes) will contain the valid data, if any. So if the
string version of the
bpoaugust added the comment:
I think an id of the form
should be allowed, but it generates
obs-id-left => local-part => obs-local-part => word *("." word)
word => atom => [CFWS] 1*atext [CFWS]
'' should also be allowed but generates ' (A A)'
and '' gives ' '
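
To make the grammar quoted above concrete, here is a rough sketch of the obs-local-part production as a regular expression. It is a simplification under stated assumptions: CFWS is reduced to plain spaces and tabs (comments are not handled), quoted-strings are left out, and the names ATEXT, WORD and is_obs_local_part are hypothetical, not part of the email package.

```python
import re

# atext characters as listed in RFC 5322 (letters, digits and a fixed set of
# punctuation). Hypothetical helper names; not part of the email package.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"

# word => atom => [CFWS] 1*atext [CFWS]
# Simplification: CFWS is approximated by optional spaces/tabs only.
WORD = r"[ \t]*" + ATEXT + r"+[ \t]*"

# obs-local-part => word *("." word)
OBS_LOCAL_PART = re.compile(WORD + r"(?:\." + WORD + r")*")


def is_obs_local_part(text):
    """Return True if text matches the simplified obs-local-part sketch."""
    return OBS_LOCAL_PART.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ("a.b", "a . b", "a b"):
        print(repr(sample), is_obs_local_part(sample))
```

Under this sketch, whitespace is accepted around the dots but not between two words that have no dot between them.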
--
bpoaugust added the comment:
When the library is being used to parse existing emails, I think it needs to do
the minimum validation and canonicalisation.
It may be useful in some circumstances to report where the input is not
syntactically correct, but I'm not sure it is helpful to truncate
R. David Murray added the comment:
Note that the parser does attempt to accept obsolete syntax (registering
defects for it), so if there is a bug in the implementation of the obsolete
syntax handling it should be fixed. And yes, there have been other bugs with
whitespace handling in the
bpoaugust added the comment:
The easiest might be for me to provide some test cases, but I have not been
able to work out where the existing unit tests are.
One failure which I believe should be permitted under current rules is:
- i.e. trailing space
The space gets added AFTER the >
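
A possible shape for such test cases, as a sketch only: feed each candidate value through the default policy's header factory and collect the ones whose string form differs from the input. The function name roundtrip_mismatches and the sample values are hypothetical and are not the ids from this report; what the check finds will depend on the Python version in use.

```python
from email import policy


def roundtrip_mismatches(values):
    """Return (raw, parsed) pairs where the parsed header text differs."""
    mismatches = []
    for raw in values:
        # header_factory(name, value) returns a header object that is a str.
        header = policy.default.header_factory("Message-ID", raw)
        parsed = str(header)
        if parsed != raw:
            mismatches.append((raw, parsed))
    return mismatches


if __name__ == "__main__":
    # Hypothetical candidates; the second one has a trailing space.
    candidates = [
        "<12345.abc@example.com>",
        "<12345.abc@example.com> ",
    ]
    for raw, parsed in roundtrip_mismatches(candidates):
        print("changed:", repr(raw), "->", repr(parsed))
```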
Eric V. Smith added the comment:
In what way is it too strict? What "obsolete rules" are you referring to? What
are some example Message-Ids that should be considered valid but instead get
truncated? What changes are you proposing?
--
nosy: +eric.smith
New submission from bpoaugust:
The email.headerregistry class MessageIDHeader is too strict when parsing
existing Message-Ids. It can truncate Message-Ids that are valid according to
the obsolete rules.
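
One way to observe the behaviour described here, as a minimal sketch: parse a value with the default policy's header factory, then compare the resulting string with the input and list any registered defects. The helper name inspect_message_id and the sample value are hypothetical; the sample is a plain well-formed id, not one of the problem cases.

```python
from email import policy


def inspect_message_id(raw_value):
    """Parse a Message-ID value and report what the header object exposes."""
    # policy.default.header_factory is a HeaderRegistry instance; calling it
    # with (name, value) returns a header object that is also a str.
    header = policy.default.header_factory("Message-ID", raw_value)
    parsed = str(header)
    return {
        "name": header.name,
        "raw": raw_value,
        "parsed": parsed,
        "unchanged": parsed == raw_value,
        # Defects registered by the parser, reported by class name.
        "defects": [type(d).__name__ for d in header.defects],
    }


if __name__ == "__main__":
    # Hypothetical sample value, not taken from the report.
    print(inspect_message_id("<12345.abc@example.com>"))
```

If "unchanged" comes back false for an id that the obsolete syntax permits, that is the kind of truncation this issue is about.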
As the saying has it:
"Be liberal in what you accept, and conservative in what you