[ 
https://issues.apache.org/jira/browse/MIME4J-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795602#action_12795602
 ] 

Stefano Bagnara commented on MIME4J-58:
---------------------------------------

I don't think that copying the buffer instead of switching it will have any 
impact on the API. It is mainly an implementation issue hidden in the 
linereader class. The only API change concerns the unread(ByteArrayBuffer) 
method contract: currently it reads the bytes directly from the provided 
byte array, so the caller can't reuse the same byte[] array; if we copy 
instead, the caller can reuse the buffer (currently the only client code creates 
a field buffer for each field, so this wouldn't change the client code anyway).
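
To make the difference between the two contracts concrete, here is a minimal sketch. It is not the Mime4j code: the class UnreadContractSketch and its methods are hypothetical names, and it uses a plain byte[] instead of ByteArrayBuffer so that it stays self-contained. It only shows what the caller observes when it reuses its array after pushing bytes back.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical illustration of the two unread contracts; not the Mime4j class. */
final class UnreadContractSketch {
    private byte[] pending = new byte[0];

    /** "Switch" variant: keeps a reference to the caller's array. */
    void unreadByReference(byte[] data) {
        pending = data;
    }

    /** "Copy" variant: takes a private snapshot of the caller's array. */
    void unreadByCopy(byte[] data) {
        pending = Arrays.copyOf(data, data.length);
    }

    /** Returns the pushed-back bytes as text, for demonstration only. */
    String pendingAsString() {
        return new String(pending, StandardCharsets.US_ASCII);
    }

    /** Pushes a field back, then lets the caller overwrite its own array. */
    static String demo(boolean copy) {
        byte[] field = "Subject: a".getBytes(StandardCharsets.US_ASCII);
        UnreadContractSketch reader = new UnreadContractSketch();
        if (copy) {
            reader.unreadByCopy(field);
        } else {
            reader.unreadByReference(field);
        }
        Arrays.fill(field, (byte) 'x'); // the caller reuses its buffer
        return reader.pendingAsString();
    }

    public static void main(String[] args) {
        System.out.println("copy:      " + demo(true));
        System.out.println("reference: " + demo(false));
    }
}
```

With the copy variant the pushed-back bytes survive the caller's reuse of the array; with the reference variant they are overwritten. That is the whole contract difference, at the price of one extra array copy per unread.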

IMHO, as it is "hidden" in the class and the cycleclean solution is both 
performant and memory efficient, I wouldn't spend time changing it further, but 
if you think it is worth it, just join the branch.

Please also review the other changes, as the goal of the branch is to be merged 
back to trunk as soon as the other committers have time to review it.

> Lenient dealing with headless messages or malformed header/body separation
> --------------------------------------------------------------------------
>
>                 Key: MIME4J-58
>                 URL: https://issues.apache.org/jira/browse/MIME4J-58
>             Project: JAMES Mime4j
>          Issue Type: Task
>    Affects Versions: 0.3
>            Reporter: Stefano Bagnara
>            Assignee: Stefano Bagnara
>             Fix For: 0.8
>
>         Attachments: headerbody-nocrlfcrlf.msg, headerbody-noheader.msg
>
>
> Define how to deal with non-canonical messages like this one:
> -----------------------
> This is a simple message not having headers.
> The whole text should be recognized as body.
> -----------------------
> or this one:
> -----------------------
> Subject: this is a subject
> This is an invalid header
> AnotherHeader: is this an header or the first part of the body?
> Body text
> -----------------------
> In the first case mime4j outputs an "invalid header" error twice and a 
> roundtrip write results in an empty message.
> In the SMTP case this is unfortunate because messages are sometimes 
> sent without headers.
> In the second case mime4j currently takes Subject and AnotherHeader as 
> headers, "This is an invalid header" raises a monitor event for "invalid 
> header", and "Body text" is considered the body.
> A compromise we evaluated in the past between compliance, leniency and 
> performance was to "alter" the requirement for CRLFCRLF between headers and 
> body with a different rule: if during parsing of the headers we find a line 
> (not part of a multiline header) that does not have the form "HeaderName: 
> something", then we virtually add a CRLF *before* that line and consider that 
> line the first line of the body. 
> This allows us to buffer only a single line (as opposed to parsing the whole 
> message in search of a CRLFCRLF and considering the full message a body if no 
> CRLFCRLF is found) and to be very lenient with input. The "side effect" 
> (maybe not bad) is that a wrong header in the middle of the headers will 
> result in the headers that follow it being moved to the body.
> With this algorithm the above would be "virtually" parsed as if it were:
> -----------------------
>
> This is a simple message not having headers.
> The whole text should be recognized as body.
> -----------------------
> or this one:
> -----------------------
> Subject: this is a subject
>
> This is an invalid header
> AnotherHeader: is this an header or the first part of the body?
> Body text
> -----------------------
> If we think about strict and lenient approaches, I think that the current 
> mime4j result is ok when using strict parsing, while the one I propose is a 
> good lenient alternative.
> Opinions? Alternatives?
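
The lenient rule proposed in the issue description above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the parser in the branch: LenientSplitSketch, looksLikeHeader and split are hypothetical names, the input is assumed to be already split into lines without their CRLF, and a "header" is taken to be any line whose text before the first colon is a non-empty run of printable ASCII without spaces.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the lenient header/body split; not the Mime4j parser. */
final class LenientSplitSketch {
    final List<String> headers = new ArrayList<String>();
    final List<String> body = new ArrayList<String>();

    /** True if the line has the form "HeaderName: something". */
    static boolean looksLikeHeader(String line) {
        int colon = line.indexOf(':');
        if (colon <= 0) {
            return false; // no colon, or an empty field name
        }
        for (int i = 0; i < colon; i++) {
            char c = line.charAt(i);
            if (c <= 32 || c >= 127) {
                return false; // field names are printable ASCII without spaces
            }
        }
        return true;
    }

    /** Splits the lines into headers and body, looking at one line at a time. */
    static LenientSplitSketch split(List<String> lines) {
        LenientSplitSketch out = new LenientSplitSketch();
        boolean inBody = false;
        for (String line : lines) {
            if (inBody) {
                out.body.add(line);
            } else if (line.isEmpty()) {
                inBody = true; // the regular empty-line separator
            } else if (!out.headers.isEmpty()
                    && (line.charAt(0) == ' ' || line.charAt(0) == '\t')) {
                out.headers.add(line); // folded line, kept as its own entry for simplicity
            } else if (looksLikeHeader(line)) {
                out.headers.add(line);
            } else {
                inBody = true;      // "virtual" CRLF before this line:
                out.body.add(line); // it becomes the first line of the body
            }
        }
        return out;
    }
}
```

Applied to the second example message, this keeps only "Subject: this is a subject" as a header and moves the remaining three lines, including AnotherHeader, to the body, which is the "side effect" mentioned in the description. Only the current line is ever inspected, so no more than a single line has to be buffered.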

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
