On Fri, 2008-07-18 at 10:58 +0200, Stefano Bagnara wrote:
> Oleg Kalnichevski wrote:
> > On Thu, 2008-07-17 at 20:21 +0200, Stefano Bagnara wrote:
> >> Oleg Kalnichevski wrote:
> >>> Stefano Bagnara wrote:
> > 

...

> E.g.: I'm slowly coming to a possible proposal about parsing.
> - strict mode: no conversion is done; a CR or LF in headers (or other
> non-7bit content) makes mime4j fail parsing.
> - permissive modes:
>    - default binary: no conversion happens; isolated CR and LF are
> accepted everywhere but not considered newlines (just like other 8bit
> bytes), and the default content-transfer-encoding is "binary" when not
> specified (7bit, 8bit and binary are read as binary).
>    - default text: we convert isolated CR and LF to CRLF almost
> everywhere except in "binary" content-transfer-encoding parts.
> I'm not proposing this yet (I'm not sure this is enough and that we
> don't need more granular tweaks), but this is something I'm evaluating
> right now... The strict mode is desirable to have, but less important
> than the permissive parsing (we want to be strict in output, not in
> input). OTOH someone may want to use mime4j to validate whether content
> is well-formed (wrt the RFC), and in this case a strict mode would be
> necessary.
> 
> Stefano
> 

Stefano,

With all due respect, I see strict handling of line delimiters as
_pointless_ orthodoxy that really does not help anyone. Would you really
ship an application to a client of yours that rejects a message as
invalid because it contains a lone LF in it? So what is the _point_ of
being strict about line delimiters?

Anyways, let's talk code now. How about this?

(1)

interface LineDelimiterStrategy {

 // int rather than char, because either argument can be -1
 boolean isNewLine(int ch1, int ch2)
        throws MimeException;

}

One can provide MimeTokenStream with an implementation of this interface
at construction time. MimeTokenStream in turn passes a reference to
that implementation to all parser components that need to deal with
line delimiters.
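
To make this concrete, here is a rough sketch of what two such
strategies might look like. Everything beyond the interface itself is an
assumption for illustration: the MimeException stand-in, the two strategy
classes, the LineCounter helper, and the reading of ch1 as the previous
character and ch2 as the current one are all hypothetical, not existing
mime4j code.

```java
// Hypothetical stand-in for mime4j's MimeException so the sketch compiles on its own.
class MimeException extends Exception {
    MimeException(String message) { super(message); }
}

// The interface from (1), with int parameters so that -1 can mean "no character".
// Assumed semantics: ch1 is the previous character, ch2 the current one.
interface LineDelimiterStrategy {
    boolean isNewLine(int ch1, int ch2) throws MimeException;
}

// Lenient (assumed policy): any LF ends a line, whether or not a CR precedes it.
// A lone CR is treated as ordinary data.
class LenientLineDelimiterStrategy implements LineDelimiterStrategy {
    public boolean isNewLine(int ch1, int ch2) {
        return ch2 == '\n';
    }
}

// Strict (assumed policy): only CRLF ends a line; a bare CR or LF is an error.
class StrictLineDelimiterStrategy implements LineDelimiterStrategy {
    public boolean isNewLine(int ch1, int ch2) throws MimeException {
        if (ch1 == '\r' && ch2 == '\n') return true;
        if (ch2 == '\n') throw new MimeException("bare LF");
        if (ch1 == '\r') throw new MimeException("bare CR");
        return false;
    }
}

// Hypothetical consumer: counts line delimiters by sliding over the input.
class LineCounter {
    static int count(String s, LineDelimiterStrategy strategy) throws MimeException {
        int lines = 0;
        int prev = -1; // no previous character at the start of input
        for (int i = 0; i <= s.length(); i++) {
            int cur = i < s.length() ? s.charAt(i) : -1; // -1 marks end of input
            if (strategy.isNewLine(prev, cur)) lines++;
            prev = cur;
        }
        return lines;
    }
}
```

With this shape, the parser components never hard-code a policy: whoever
wants validation plugs in the strict variant, and everyone else keeps the
lenient one.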

(2) The issue of CR / LF handling in content bodies should be taken care of
when formatting output, _not_ when parsing input.
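
As an illustration of what doing it on the output side could mean, here
is a minimal sketch that rewrites lone CR and lone LF to CRLF while
writing. The CrlfNormalizer class and its normalize method are
hypothetical names, not part of mime4j, and I am assuming this would
only be applied to text parts, never to parts with a "binary"
content-transfer-encoding.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical output-side helper: not an existing mime4j class.
final class CrlfNormalizer {
    // Returns a copy of the input in which every CRLF, lone CR, and lone LF
    // is written out as exactly one CRLF. Other bytes are copied unchanged.
    static byte[] normalize(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(in.length + 16);
        for (int i = 0; i < in.length; i++) {
            byte b = in[i];
            if (b == '\r') {
                out.write('\r');
                out.write('\n');
                // Skip the LF of an existing CRLF so it is not doubled.
                if (i + 1 < in.length && in[i + 1] == '\n') i++;
            } else if (b == '\n') {
                out.write('\r');
                out.write('\n');
            } else {
                out.write(b);
            }
        }
        return out.toByteArray();
    }
}
```

The parser then stays a plain, non-destructive reader, and any
strictness about delimiters lives in the one place where we control
what goes on the wire.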

Would that work for you?

Oleg

