[ https://issues.apache.org/jira/browse/MIME4J-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795576#action_12795576 ]
Oleg Kalnichevski commented on MIME4J-58:
-----------------------------------------

Why don't you just copy the content of the temp buffer to the internal buffer? This could potentially eliminate the need to maintain two buffers internally.

Oleg

> Lenient dealing with headless messages or malformed header/body separation
> --------------------------------------------------------------------------
>
>                 Key: MIME4J-58
>                 URL: https://issues.apache.org/jira/browse/MIME4J-58
>             Project: JAMES Mime4j
>          Issue Type: Task
>    Affects Versions: 0.3
>            Reporter: Stefano Bagnara
>             Fix For: 0.8
>
>         Attachments: headerbody-nocrlfcrlf.msg, headerbody-noheader.msg
>
>
> Define how to deal with non-canonical messages like this one:
> -----------------------
> This is a simple message not having headers.
> The whole text should be recognized as body.
> -----------------------
> or this one:
> -----------------------
> Subject: this is a subject
> This is an invalid header
> AnotherHeader: is this an header or the first part of the body?
> Body text
> -----------------------
> In the first case mime4j reports an "invalid header" error twice and a
> round-trip write results in an empty message.
> In the SMTP case this is unfortunate because messages are sometimes sent
> without headers.
> In the second case mime4j currently takes Subject and AnotherHeader as
> headers, "This is an invalid header" raises a monitor for "invalid header",
> and "Body text" is considered the body.
> A compromise we evaluated in the past between compliance, leniency and
> performance was to "alter" the requirement for CRLFCRLF between headers and
> body with a different rule: if, while parsing the headers, we find a line
> that is not a continuation of a multiline header and does not have the form
> "HeaderName: something", then we virtually add a CRLF *before* that line and
> consider that line the first line of the body.
> This allows us to buffer only a single line (as opposed to scanning the whole
> message in search of a CRLFCRLF and considering the full message a body if no
> CRLFCRLF is found) and to be very lenient with input. The "side effect"
> (maybe not bad) is that a wrong header in the middle of the headers will
> result in some headers being moved to the body.
> With this algorithm the above would be "virtually" parsed as if it were:
> -----------------------
>
> This is a simple message not having headers.
> The whole text should be recognized as body.
> -----------------------
> or this one:
> -----------------------
> Subject: this is a subject
>
> This is an invalid header
> AnotherHeader: is this an header or the first part of the body?
> Body text
> -----------------------
> If we think about strict and lenient approaches, I think that the current
> mime4j result is OK when using strict parsing, while the one I propose is a
> good lenient alternative.
> Opinions? Alternatives?

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
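
For illustration only, the lenient rule proposed in the issue could be sketched roughly as below. This is not Mime4j code: the class `LenientHeaderSplitter`, its `Result` type and the `split` method are hypothetical names, and the sketch works on a whole `String` for brevity, whereas a real parser would stream the input and buffer only the current line. The field-name check is an assumption based on the RFC 5322 definition of a field name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch of the lenient header/body split; not part of Mime4j. */
public final class LenientHeaderSplitter {

    /** Raw header fields (folded lines kept together) and the remaining body. */
    public static final class Result {
        public final List<String> headers;
        public final String body;

        Result(List<String> headers, String body) {
            this.headers = headers;
            this.body = body;
        }
    }

    // Assumed shape of a header line: printable US-ASCII name without ':',
    // optional whitespace, then a colon.
    private static final Pattern FIELD_START =
            Pattern.compile("^[\\x21-\\x39\\x3B-\\x7E]+[ \\t]*:");

    public static Result split(String raw) {
        String[] lines = raw.split("\r?\n", -1);
        List<String> headers = new ArrayList<>();
        int i = 0;
        for (; i < lines.length; i++) {
            String line = lines[i];
            if (line.isEmpty()) {
                i++; // canonical empty-line separator: the body starts after it
                break;
            }
            boolean folded = !headers.isEmpty()
                    && (line.charAt(0) == ' ' || line.charAt(0) == '\t');
            if (folded) {
                // Continuation of the previous (multiline) header field.
                int last = headers.size() - 1;
                headers.set(last, headers.get(last) + "\r\n" + line);
            } else if (FIELD_START.matcher(line).find()) {
                headers.add(line);
            } else {
                // Not a header: act as if a CRLF preceded this line,
                // so it becomes the first line of the body.
                break;
            }
        }
        String body = String.join("\r\n",
                Arrays.asList(lines).subList(i, lines.length));
        return new Result(headers, body);
    }
}
```

With the second sample message from the issue, this sketch keeps only `Subject` as a header and moves everything from "This is an invalid header" onwards, including `AnotherHeader`, into the body, which is the "side effect" described above.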