Oleg Kalnichevski wrote:
On Fri, 2008-07-18 at 16:19 +0200, Stefano Bagnara wrote:
Oleg Kalnichevski wrote:
On Fri, 2008-07-18 at 14:45 +0200, Stefano Bagnara wrote:
Oleg Kalnichevski wrote:
On Fri, 2008-07-18 at 10:58 +0200, Stefano Bagnara wrote:
Oleg Kalnichevski wrote:
On Thu, 2008-07-17 at 20:21 +0200, Stefano Bagnara wrote:
Oleg Kalnichevski wrote:
Stefano Bagnara wrote:
...

As I said, the strict mode would only be useful to users who want to use mime4j as a validator to check RFC compliance. You know, mime4j was born for SMTP, but now you need it for HTTP, and someone else may want to build a validator. So let's not keep our eyes closed once again.

OK, I fail to see any practical benefit of that aside from a nice warm
feeling about being 100% compliant, but I admit I am biased.

Anyways, let's talk code now. How about this?

(1)

interface LineDelimiterStrategy {

    // ch1 and ch2 are declared as int rather than char so that either can be -1
    boolean isNewLine(int ch1, int ch2)
            throws MimeException;

}

One can provide MimeTokenStream with an implementation of this interface
at construction time. MimeTokenStream in its turn passes a
reference to that class to all parser components that need to deal with
line delimiters.
I'm not sure I understand what the two parameters passed to isNewLine are, or what code will invoke this service.

2 consecutive characters read from the data stream or -1 if any of those
characters is not available.
so "a\r\nb" would result in the calls:
isNewLine(-1,'a');
isNewLine('a','\r');
isNewLine('\r','\n');
isNewLine('\n','b');
isNewLine('b',-1);
Is this correct? What would an implementation suitable for HTTP return for the five calls above?


Anything that allows:

line delimiter = (LF|CRLF)

I understood this, but I'm not following how you would do this with the interface you were proposing. Given your rule, would you return true on both the 3rd and the 4th call? Wouldn't this result in two newlines?
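
To make the double-counting concern concrete, here is a minimal sketch. It is not mime4j code: the interface is a simplified copy of the proposal (int parameters, no MimeException), and the names NAIVE, LENIENT, STRICT and countDelimiters are hypothetical. It feeds consecutive character pairs to a strategy, as in the five calls above, and counts how many times the strategy reports a delimiter.

```java
class LineDelimiterSketch {

    // Simplified copy of the proposed interface; -1 means "no character available".
    interface LineDelimiterStrategy {
        boolean isNewLine(int ch1, int ch2);
    }

    // Naive reading of "LF | CRLF": fires on the CRLF pair and again when LF is the first character.
    static final LineDelimiterStrategy NAIVE =
            (ch1, ch2) -> (ch1 == '\r' && ch2 == '\n') || ch1 == '\n';

    // Lenient: a line ends wherever the second character is LF, with or without a preceding CR.
    static final LineDelimiterStrategy LENIENT = (ch1, ch2) -> ch2 == '\n';

    // Strict: only the CRLF pair ends a line.
    static final LineDelimiterStrategy STRICT = (ch1, ch2) -> ch1 == '\r' && ch2 == '\n';

    // Calls the strategy once per pair of consecutive characters, padding both ends with -1.
    static int countDelimiters(String data, LineDelimiterStrategy strategy) {
        int count = 0;
        int prev = -1;
        for (int i = 0; i <= data.length(); i++) {
            int cur = i < data.length() ? data.charAt(i) : -1;
            if (strategy.isNewLine(prev, cur)) {
                count++;
            }
            prev = cur;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countDelimiters("a\r\nb", NAIVE));   // 2: the CRLF is counted twice
        System.out.println(countDelimiters("a\r\nb", LENIENT)); // 1
        System.out.println(countDelimiters("a\nb", LENIENT));   // 1
        System.out.println(countDelimiters("a\nb", STRICT));    // 0
    }
}
```

Under these assumptions, a rule keyed on the second character accepts both LF and CRLF without reporting the same line break twice, while the naive rule returns true on both the 3rd and the 4th call.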

(2) The issue of CR / LF handling in content bodies should be taken care of
when formatting output, _not_ when parsing input.

Would that work for you?
I'm not sure this is enough.
In output we format what we parsed: if we parsed the input as multiple lines, then we output multiple lines; otherwise we output a single line. So it is during parsing that we have to decide whether an isolated LF is a line delimiter or not.
But mime4j does not parse _content bodies_ as multiple lines, does it?
TextBody.getReader()


But that does not necessarily imply parsing into multiple lines, does
it? Anyway, I am perfectly fine with TextBody automatically converting
line delimiters. IMHO this is the right place to do the conversion,
not the MimeTokenStream.
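
As an illustration of converting line delimiters at output time rather than in the tokenizer, here is a minimal sketch. The class and method names are hypothetical and this is not how TextBody is implemented; it only shows a normalization step that turns bare LF into CRLF and leaves existing CRLF pairs untouched.

```java
class LineEndingSketch {

    // Inserts a CR before every LF that is not already preceded by one.
    static String toCrlf(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        char prev = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\n' && prev != '\r') {
                out.append('\r');
            }
            out.append(c);
            prev = c;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Escape the control characters so the result is visible when printed.
        String converted = toCrlf("a\nb\r\nc");
        System.out.println(converted.replace("\r", "\\r").replace("\n", "\\n")); // a\r\nb\r\nc
    }
}
```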

You are right, the Reader does not imply line parsing, but somewhere we still have to deal with lines. The basic Mime4J classes (the whole LineReaderInputStream hierarchy) do have a readLine method. This just made me realize that the internal buffer is filled with lines, so sending a very long binary makes mime4j die with an OOM. We can fix this OOM during standard parsing by imposing a hard limit on the size (and throwing an exception when it is exceeded), but we have to handle it differently when streaming "binary" encoded parts (line reading makes no sense there).
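
The hard limit could look roughly like the sketch below. BoundedLineReader and its readLine signature are hypothetical and are not the LineReaderInputStream API; the point is only that the buffer stops growing and an exception is thrown once a line exceeds the configured maximum.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class BoundedLineReader {

    // Reads up to and including the next LF. Returns null at end of stream.
    // Throws IOException if the line would exceed maxLineLength bytes.
    static byte[] readLine(InputStream in, int maxLineLength) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (line.size() >= maxLineLength) {
                throw new IOException("Line exceeds " + maxLineLength + " bytes");
            }
            line.write(b);
            if (b == '\n') {
                break;
            }
        }
        return line.size() == 0 ? null : line.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("ab\r\ncd".getBytes("US-ASCII"));
        System.out.println(readLine(in, 10).length); // 4 ("ab\r\n")
        System.out.println(readLine(in, 10).length); // 2 ("cd", no terminator before end of stream)
        System.out.println(readLine(in, 10));        // null
    }
}
```

A "binary" part would bypass a reader like this entirely and be copied in fixed-size chunks, since no line structure can be assumed there.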

Furthermore, at the very minimum, we have a RootInputStream that only counts lines if they are CRLF-terminated. It seems weird that we count lines only if they are CRLF-terminated but also recognize them when they end in LF (this is one more issue to take into consideration, not the one we were talking about).

Stefano
