Re: [mime4j] newlines and parsing of nested (encoded) rfc822 messages

Stefano Bagnara Fri, 18 Jul 2008 10:46:32 -0700

Oleg Kalnichevski wrote:

Stefano Bagnara wrote:
Oleg Kalnichevski wrote:
On Fri, 2008-07-18 at 16:19 +0200, Stefano Bagnara wrote:
Oleg Kalnichevski wrote:
On Fri, 2008-07-18 at 14:45 +0200, Stefano Bagnara wrote:
Oleg Kalnichevski wrote:
On Fri, 2008-07-18 at 10:58 +0200, Stefano Bagnara wrote:
Oleg Kalnichevski wrote:
On Thu, 2008-07-17 at 20:21 +0200, Stefano Bagnara wrote:
Oleg Kalnichevski wrote:
Stefano Bagnara wrote:
...
As I said, the strict mode would only be useful to users of mime4j wanting to use mime4j as a validator to check RFC compliance. You know, mime4j was born for SMTP, but now you need it for HTTP and someone else may want to do a validator. So let's not keep our eyes closed once again.
OK, I fail to see any practical benefit of that aside from a nice warm
feeling about being 100% compliant, but I admit I am biased.
Anyways, let's talk code now. How about this?

(1)

interface LineDelimiterStrategy {

 // int rather than char, so that -1 can mean "no character available"
 boolean isNewLine(int ch1, int ch2) // both can be -1
     throws MimeException;

}
One can provide MimeTokenStream with an implementation of this interface
at construction time. MimeTokenStream in its turn passes a
reference to that class to all parser components that need to deal with
line delimiters.
I'm not sure I understand what the 2 params passed to isNewLine are and what code will invoke this service.
2 consecutive characters read from the data stream, or -1 if any of those characters is not available.
so "a\r\nb" would result in the calls:
isNewLine(-1,'a');
isNewLine('a','\r');
isNewLine('\r','\n');
isNewLine('\n','b');
isNewLine('b',-1);
Is this correct? What would be the result of the 5 calls above from the implementation that would be fine for HTTP?
Anything that allows:

line delimiter = (LF|CRLF)
I understood this, but I'm not following how you would do this with the interface you were proposing. Given your rule, you have true on the 3rd and the 4th call? Wouldn't this result in 2 newlines?
I do not think so, only a sequence with ch2 = '\n' would be considered a valid line delimiter. I realized, though, that the problem with this interface is that it implies a one-byte read, which I had thought we wanted to get rid of.


I understand it now, thank you!
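To make the exchange above concrete, here is a minimal sketch of the rule "only a pair with ch2 = '\n' is a line delimiter", run against the "a\r\nb" example. It is an illustration only, not mime4j code: the interface is a simplified variant of the one proposed above (int parameters, no MimeException), and the LENIENT, STRICT and scan names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class LineDelimiterSketch {

    /** Simplified, hypothetical variant of the proposed interface; -1 means "no character". */
    interface LineDelimiterStrategy {
        boolean isNewLine(int ch1, int ch2);
    }

    /** Accepts LF or CRLF: only the second character of the pair is inspected. */
    static final LineDelimiterStrategy LENIENT = (ch1, ch2) -> ch2 == '\n';

    /** Accepts CRLF only, as a validator-style strict mode might. */
    static final LineDelimiterStrategy STRICT = (ch1, ch2) -> ch1 == '\r' && ch2 == '\n';

    /** Feeds every pair of consecutive characters (padded with -1 at both ends) to the strategy. */
    static List<Boolean> scan(String data, LineDelimiterStrategy strategy) {
        List<Boolean> results = new ArrayList<>();
        int prev = -1;
        for (int i = 0; i <= data.length(); i++) {
            int cur = i < data.length() ? data.charAt(i) : -1;
            results.add(strategy.isNewLine(prev, cur));
            prev = cur;
        }
        return results;
    }

    public static void main(String[] args) {
        // Five calls for "a\r\nb"; only the ('\r','\n') pair reports a delimiter.
        System.out.println(scan("a\r\nb", LENIENT)); // [false, false, true, false, false]
        // A bare LF is not a delimiter under the strict variant.
        System.out.println(scan("a\nb", STRICT));    // [false, false, false, false]
    }
}
```

With the lenient rule the ('\n','b') pair returns false, so CRLF yields one newline, not two.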

(2) The issue of CR / LF handling in content bodies should be taken care of
when formatting output, _not_ when parsing input.

Would that work for you?
I'm not sure this is enough.
In output we format what we parsed: if we parsed the input as multiple lines then we output multiple lines, otherwise we output a single line. So it is during parsing that we have to decide whether an isolated LF is a newline delimiter or not.
But mime4j does not parse _content bodies_ as multiple lines, does it?
TextBody.getReader()
But that does not necessarily imply parsing into multiple lines, does
it? Anyways, I am perfectly fine with TextBody automatically converting
line delimiters. IMHO this is the right place to do the conversion, but
not the MimeTokenStream.
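As an illustration of converting line delimiters on the output side instead of in the parser, the sketch below rewrites every bare LF as CRLF and leaves existing CRLF pairs alone. It is a standalone example, not the TextBody implementation; the class and method names are hypothetical, and what to do with a lone CR is left open (it is passed through unchanged here).

```java
public class CrlfNormalizerSketch {

    /** Returns the text with every LF not preceded by CR replaced by CRLF. */
    static String toCrlf(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        char prev = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\n' && prev != '\r') {
                out.append("\r\n"); // bare LF: emit a full CRLF
            } else {
                out.append(c);      // everything else, including the LF of a CRLF pair
            }
            prev = c;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String normalized = toCrlf("a\nb\r\nc");
        System.out.println(normalized.equals("a\r\nb\r\nc")); // true
    }
}
```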
You are right, the Reader does not imply line parsing, but anyway somewhere we have to deal with lines. Mime4J basic classes (the whole LineReaderInputStream hierarchy) do indeed have a readLine method. This just made me realize that the internal buffer is filled with lines and that sending a very long binary makes mime4j die with OOM.
No, it would not. Binary content is not read line by line. The #readLine method is only used when parsing metadata (header fields), where we do need to put a cap on the max line length, as discussed before.

My fault: I had code casting to LineReaderInputStream and using readLine to get the content, but the method indeed returned only an InputStream to me, and there is no way to trigger the OOM without using a cast.

About the line length limit, we really need it: a random sequence of non-LF chars currently makes our code throw an OOM.
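A rough sketch of what such a cap could look like: a readLine that gives up once a line grows past a configured maximum instead of buffering until memory runs out. This is not the LineReaderInputStream code; the class, the exception name and the maxLineLen parameter are assumptions made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class BoundedLineReaderSketch {

    /** Hypothetical exception signalling that a line exceeded the configured cap. */
    static class LineTooLongException extends IOException {
        LineTooLongException(int max) {
            super("Line exceeds " + max + " bytes");
        }
    }

    /**
     * Reads bytes up to and including the next LF, or up to end of stream.
     * Returns null at end of stream; throws once more than maxLineLen bytes
     * would have to be buffered for a single line.
     */
    static byte[] readLine(InputStream in, int maxLineLen) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (line.size() >= maxLineLen) {
                throw new LineTooLongException(maxLineLen);
            }
            line.write(b);
            if (b == '\n') {
                break;
            }
        }
        return line.size() == 0 ? null : line.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        InputStream ok = new ByteArrayInputStream("ab\r\ncd".getBytes("US-ASCII"));
        System.out.println(readLine(ok, 4).length); // 4 ("ab\r\n")
        InputStream tooLong = new ByteArrayInputStream("abcdefgh".getBytes("US-ASCII"));
        try {
            readLine(tooLong, 4);
        } catch (LineTooLongException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The buffer never holds more than maxLineLen bytes, so a long run of non-LF input fails fast with an exception the caller can handle.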


Stefano
