[issue25728] email parser ignores inner multipart boundary when outer message duplicates it

2015-11-25 Thread R. David Murray
R. David Murray added the comment: I am open to (and will review) a patch that applies simple heuristics to try to guess correctly about such messages, but only if it doesn't add too much complexity to the parser. I'm not certain I would consider it for a bug fix release, but I'll postpone

[issue25728] email parser ignores inner multipart boundary when outer message duplicates it

2015-11-24 Thread Forest
Forest added the comment:
> The library can't successfully parse such a message
It could successfully parse such a message, if it matched against inner message boundaries before outer message boundaries. (One implementation would be to keep a list of all ancestor boundaries and traverse the l
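
The idea of matching inner boundaries before outer ones can be sketched roughly as follows. This is only one reading of the suggestion, not the email package's actual code: the function name `match_boundary` and the `ancestors` list are hypothetical, the innermost-first traversal order is an assumption (the comment above is cut off), and the exact-line comparison is a simplification.

```python
def match_boundary(line, ancestors):
    """Return (index, is_closing) for the innermost ancestor boundary
    that this line delimits, or None if the line is not a delimiter.

    `ancestors` is a hypothetical list of boundary strings for the
    enclosing multipart containers, ordered outermost first.
    """
    # Ignore the line ending and any trailing whitespace after the delimiter.
    candidate = line.rstrip("\r\n").rstrip(" \t")
    # Walk from the innermost container outwards, so that a duplicated
    # boundary string is attributed to the innermost open multipart.
    for index in range(len(ancestors) - 1, -1, -1):
        delimiter = "--" + ancestors[index]
        if candidate == delimiter + "--":
            return index, True   # closing delimiter of that container
        if candidate == delimiter:
            return index, False  # delimiter that starts the next part
    return None
```

With `ancestors = ["XYZ", "XYZ"]` (an outer message duplicating the inner boundary), a `--XYZ` line is attributed to index 1, the inner container, rather than to the outer one.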

[issue25728] email parser ignores inner multipart boundary when outer message duplicates it

2015-11-24 Thread Forest
Forest added the comment: RFC 2046 says that the outer message is defective, since it uses a boundary delimiter that is quite obviously present inside one of the encapsulated parts: https://tools.ietf.org/html/rfc2046#section-5.1 "The boundary delimiter MUST NOT appear inside any of the encaps
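
A sender-side check for that rule could look something like the sketch below. The helper name `delimiter_collides` is hypothetical, and treating any line that begins with the delimiter as a collision is an assumption about how strictly to test.

```python
def delimiter_collides(part_text, boundary):
    """Return True if any line of an encapsulated part begins with the
    delimiter ("--" followed by the boundary string) of its container."""
    delimiter = "--" + boundary
    return any(line.startswith(delimiter) for line in part_text.splitlines())
```

A message generator could call this on each part before choosing a boundary, and pick a different boundary string whenever it returns True.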

[issue25728] email parser ignores inner multipart boundary when outer message duplicates it

2015-11-24 Thread R. David Murray
R. David Murray added the comment: Who is to say that the outer message is defective and not the inner one? How can a parser decide which part belongs to which message? It isn't an AI. The whole message is defective, so all bets are off :) The library can't successfully parse such a message

[issue25728] email parser ignores inner multipart boundary when outer message duplicates it

2015-11-24 Thread Forest
Forest added the comment: I thought at first that this might be deliberate behavior in order to comply with RFC 2046 section 5.1.2. https://tools.ietf.org/html/rfc2046#section-5.1.2 After carefully re-reading that section, I see that it is just making sure an outer message's boundary will stil

[issue25728] email parser ignores inner multipart boundary when outer message duplicates it

2015-11-24 Thread Forest
New submission from Forest: When a multipart message erroneously defines a boundary string that conflicts with an inner message's boundary string, the parser ignores the (correct) inner message's boundary, and treats all matching boundary lines as if they belong to the (defective) outer messag
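
A small script along these lines can be used to look at how the parser treats such a message. The message text here is a made-up example, not the one attached to the issue, and the resulting structure is deliberately not asserted because it may differ between Python versions.

```python
import email

# Hypothetical test message: the outer multipart/mixed container reuses
# the boundary string "XYZ" of the multipart message it encapsulates.
RAW = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: message/rfc822\n"
    "\n"
    "Subject: inner\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "inner plain text\n"
    "--XYZ\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>inner html</p>\n"
    "--XYZ--\n"
    "--XYZ--\n"
)

msg = email.message_from_string(RAW)

# Record the content type and declared boundary of every part the
# parser produced, starting with the top-level message itself.
structure = [(part.get_content_type(), part.get_boundary()) for part in msg.walk()]
for content_type, boundary in structure:
    print(content_type, boundary)

# Defects recorded on the top-level message, if any.
print("defects:", msg.defects)
```

Comparing the printed structure with the intended nesting (mixed, then message/rfc822, then alternative with two text parts) shows which container each `--XYZ` line was attributed to.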