[jira] Commented: (MIME4J-58) Lenient dealing with headless messages or malformed header/body separation

Oleg Kalnichevski (JIRA) Thu, 31 Dec 2009 02:04:00 -0800

    [ https://issues.apache.org/jira/browse/MIME4J-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795576#action_12795576 ]


Oleg Kalnichevski commented on MIME4J-58:
-----------------------------------------

Why don't you just copy the content of the temp buffer to the internal buffer? 
This could potentially eliminate the need to maintain two buffers internally.

Oleg

> Lenient dealing with headless messages or malformed header/body separation
> --------------------------------------------------------------------------
>
>                 Key: MIME4J-58
>                 URL: https://issues.apache.org/jira/browse/MIME4J-58
>             Project: JAMES Mime4j
>          Issue Type: Task
>    Affects Versions: 0.3
>            Reporter: Stefano Bagnara
>             Fix For: 0.8
>
>         Attachments: headerbody-nocrlfcrlf.msg, headerbody-noheader.msg
>
>
> Define how to deal with non-canonical messages like this one:
> -----------------------
> This is a simple message not having headers.
> The whole text should be recognized as body.
> -----------------------
> or this one:
> -----------------------
> Subject: this is a subject
> This is an invalid header
> AnotherHeader: is this an header or the first part of the body?
> Body text
> -----------------------
> In the first case mime4j reports an "invalid header" error twice, and a 
> round-trip write results in an empty message.
> In the SMTP case this is unfortunate because messages are sometimes sent 
> without headers.
> In the second case mime4j currently takes Subject and AnotherHeader as 
> headers, "This is an invalid header" raises a monitor for "invalid header", 
> and "Body text" is considered the body.
> A compromise we evaluated in the past between compliance, leniency and 
> performance was to "alter" the requirement for CRLFCRLF between headers and 
> body with a different rule: if, while parsing the headers, we find a line 
> that is not a continuation (folded) line and does not match 
> "HeaderName: something", then we virtually add a CRLF *before* that line and 
> consider that line the first line of the body. 
> This allows us to buffer only a single line (as opposed to scanning the whole 
> message in search of a CRLFCRLF and treating the full message as body if no 
> CRLFCRLF is found) and to be very lenient with input. The "side effect" 
> (maybe not bad) is that a malformed header in the middle of the headers will 
> result in the headers that follow it being moved to the body.
> With this algorithm the above would be "virtually" parsed as if it were:
> -----------------------
>
> This is a simple message not having headers.
> The whole text should be recognized as body.
> -----------------------
> or this one:
> -----------------------
> Subject: this is a subject
>
> This is an invalid header
> AnotherHeader: is this an header or the first part of the body?
> Body text
> -----------------------
> If we think about strict and lenient approaches, I think that the current 
> mime4j result is fine for strict parsing, while the one I propose is a good 
> lenient alternative.
> Opinions? Alternatives?
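
For illustration, the lenient rule described in the quoted issue could be sketched in Java roughly as follows. This is only a sketch under stated assumptions, not Mime4j code: the class LenientHeaderSplit, its Result type and the helper methods are hypothetical names, and the test for what counts as a "HeaderName: something" line is a simplification. A real parser would read the stream line by line rather than split a String, but the decision only ever depends on the current line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the lenient header/body split; not part of Mime4j.
final class LenientHeaderSplit {

    /** Header lines and body lines found in a message. */
    static final class Result {
        final List<String> headers = new ArrayList<>();
        final List<String> body = new ArrayList<>();
    }

    // Simplified check (an assumption): a non-empty name made of printable
    // ASCII characters without spaces, followed by ':'.
    static boolean looksLikeHeader(String line) {
        int colon = line.indexOf(':');
        if (colon <= 0) {
            return false;
        }
        for (int i = 0; i < colon; i++) {
            char c = line.charAt(i);
            if (c <= 32 || c >= 127) {
                return false;
            }
        }
        return true;
    }

    // A folded (continuation) line starts with a space or a tab.
    static boolean isContinuation(String line) {
        return !line.isEmpty() && (line.charAt(0) == ' ' || line.charAt(0) == '\t');
    }

    static Result split(String message) {
        Result result = new Result();
        boolean inBody = false;
        for (String line : message.split("\r?\n", -1)) {
            if (inBody) {
                result.body.add(line);
            } else if (line.isEmpty()) {
                // Regular blank-line separator: the body starts on the next line.
                inBody = true;
            } else if (isContinuation(line) && !result.headers.isEmpty()) {
                // Folded line: append it to the previous header field.
                int last = result.headers.size() - 1;
                result.headers.set(last, result.headers.get(last) + line);
            } else if (looksLikeHeader(line)) {
                result.headers.add(line);
            } else {
                // Lenient rule: behave as if a CRLF preceded this line,
                // so it becomes the first line of the body.
                inBody = true;
                result.body.add(line);
            }
        }
        return result;
    }
}
```

With this sketch, the headerless example yields no headers and two body lines, while the second example yields one header (Subject) and three body lines, matching the "virtual" parse described in the issue.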

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
