Re: [mime4j] newlines and parsing of nested (encoded) rfc822 messages

Oleg Kalnichevski Fri, 18 Jul 2008 07:39:19 -0700

On Fri, 2008-07-18 at 16:19 +0200, Stefano Bagnara wrote:
> Oleg Kalnichevski wrote:
> > On Fri, 2008-07-18 at 14:45 +0200, Stefano Bagnara wrote:
> >> Oleg Kalnichevski wrote:
> >>> On Fri, 2008-07-18 at 10:58 +0200, Stefano Bagnara wrote:
> >>>> Oleg Kalnichevski wrote:
> >>>>> On Thu, 2008-07-17 at 20:21 +0200, Stefano Bagnara wrote:
> >>>>>> Oleg Kalnichevski wrote:
> >>>>>>> Stefano Bagnara wrote:
> > 
> > ...
> > 
> >> As I said, the strict mode would only be useful to users of mime4j
> >> wanting to use mime4j as a validator to check RFC compliance. You know,
> >> mime4j was born for SMTP, but now you need it for HTTP and someone else
> >> may want to write a validator. So let's not keep our eyes closed once again.
> >>
> > 
> > OK, I fail to see any practical benefit of that aside from a nice warm
> > feeling about being 100% compliant, but I admit I am biased.
> > 
> >>> Anyways, let's talk code now. How about this?
> >>>
> >>> (1)
> >>>
> >>> interface LineDelimiterStrategy {
> >>>
> >>>  boolean isNewLine(int ch1, int ch2) // int, not char: either can be -1
> >>>   throws MimeException;
> >>>
> >>> }
> >>>
> >>> One can provide MimeTokenStream with an implementation of this interface
> >>> at construction time. MimeTokenStream in its turn passes a
> >>> reference to that class to all parser components that need to deal with
> >>> line delimiters.
> >> I'm not sure I understand what the 2 params passed to isNewLine are,
> >> or what code will invoke this service.
> >>
> > 
> > Two consecutive characters read from the data stream, with -1 in place
> > of any character that is not available.
> 
> so "a\r\nb" would result in the calls:
> isNewLine(-1,'a');
> isNewLine('a','\r');
> isNewLine('\r','\n');
> isNewLine('\n','b');
> isNewLine('b',-1);
> is this correct? What would be the result for the 5 above from the 
> implementation that will be fine in HTTP?
>


Anything that allows:

line delimiter = (LF|CRLF)
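
A minimal sketch of how that could look in code, under several assumptions
that are not settled in this thread: isNewLine is taken to return true when
ch2 completes a line delimiter, the parameters are ints so that -1 can be
passed, and MimeException is a local stand-in for the real mime4j class.
LenientLineDelimiterStrategy, StrictLineDelimiterStrategy and
LineDelimiterTrace are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for mime4j's MimeException so the sketch compiles on its own.
class MimeException extends Exception {
    MimeException(String message) {
        super(message);
    }
}

// The interface proposed above, with int parameters so -1 is representable.
interface LineDelimiterStrategy {
    boolean isNewLine(int ch1, int ch2) throws MimeException;
}

// Lenient variant (assumed semantics): LF ends a line whether or not a CR
// precedes it, so both LF and CRLF are accepted.
class LenientLineDelimiterStrategy implements LineDelimiterStrategy {
    public boolean isNewLine(int ch1, int ch2) {
        return ch2 == '\n';
    }
}

// Strict variant (assumed semantics): only CRLF is a delimiter; a bare CR
// or a bare LF is reported as a violation.
class StrictLineDelimiterStrategy implements LineDelimiterStrategy {
    public boolean isNewLine(int ch1, int ch2) throws MimeException {
        if (ch2 == '\n' && ch1 != '\r') {
            throw new MimeException("bare LF");
        }
        if (ch1 == '\r' && ch2 != '\n') {
            throw new MimeException("bare CR");
        }
        return ch1 == '\r' && ch2 == '\n';
    }
}

// Hypothetical driver: feeds every pair of consecutive characters to the
// strategy, using -1 before the first and after the last character.
class LineDelimiterTrace {
    static List<Boolean> trace(String data, LineDelimiterStrategy strategy)
            throws MimeException {
        List<Boolean> results = new ArrayList<>();
        int prev = -1;
        for (int i = 0; i < data.length(); i++) {
            int cur = data.charAt(i);
            results.add(strategy.isNewLine(prev, cur));
            prev = cur;
        }
        results.add(strategy.isNewLine(prev, -1));
        return results;
    }
}
```

With these assumptions, the five calls listed above for "a\r\nb" give
false, false, true, false, false under both variants. The two only differ
on input such as "a\nb", where the lenient variant reports a newline at
('a','\n') and the strict variant throws.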


> >>> (2) The issue of CR / LF handling in content bodies should be taken
> >>> care of when formatting output, _not_ when parsing input.
> >>>
> >>> Would that work for you?
> >> I'm not sure this is enough.
> >> In output we format what we parsed: if we parsed the input as multiple
> >> lines then we output multiple lines, otherwise we output a single line.
> >> So it is during parsing that we have to decide whether an isolated LF is
> >> a newline delimiter or not.
> > 
> > But mime4j does not parse _content bodies_ as multiple lines, does it?
> 
> TextBody.getReader()
> 

But that does not necessarily imply parsing into multiple lines, does
it? Anyway, I am perfectly fine with TextBody automatically converting
line delimiters. IMHO this is the right place to do the conversion,
not the MimeTokenStream.
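
As an illustration only, converting at the text level could be as small as
the sketch below. It assumes the conversion target is CRLF (the thread does
not say which direction the conversion should go) and that it runs on the
already decoded character stream, such as a Reader obtained from
TextBody.getReader(), so the raw byte stream is never touched.
LineDelimiterNormalizer and toCrlf are hypothetical names, not mime4j API.

```java
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper: rewrites bare LF as CRLF while copying characters
// from a Reader. Existing CRLF pairs and bare CRs are passed through as-is.
class LineDelimiterNormalizer {
    static String toCrlf(Reader in) throws IOException {
        StringBuilder out = new StringBuilder();
        int prev = -1;
        int cur;
        while ((cur = in.read()) != -1) {
            if (cur == '\n' && prev != '\r') {
                out.append('\r'); // bare LF: supply the missing CR
            }
            out.append((char) cur);
            prev = cur;
        }
        return out.toString();
    }
}
```

Because this operates on characters rather than on the raw stream, a part
with "C-T-E: binary" is unaffected unless someone explicitly asks for it as
text.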

> > At this point I think I have to give up. Whatever you end up doing
> > _please_ do not wrap the raw data stream with EOLConvertingInputStream.
> 
> Sure, I already excluded this: I now understand the "C-T-E: binary" issue.
> BTW I hope you will keep monitoring this issue so you can confirm 
> whatever solution we propose will be fine with your library?
> 

Sure.

Oleg


> Thank you,
> Stefano
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 