Re: APPEND vs RFC2822 vs STD0011

2003-07-09 Thread John Alton Tamplin
Edward Reid wrote:

> What RFC3501 says is
>
>   The APPEND command appends the literal argument as a new message
>   to the end of the specified destination mailbox.  This argument
>   SHOULD be in the format of an [RFC-2822] message.
>
> Since it's the client that constructs the literal, it appears to me
> that the RFC is giving the client the choice. The language appears to
> specify only the client/server combination. Obviously it would be
> better if the language were clearer, but it's not. In other places, the
> RFC explicitly refers to the server, for example, the lines immediately
> after the above:

My interpretation is still that it is describing the protocol and that 
SHOULD in this case allows (but certainly does not require) the client 
or server to relax this restriction where they see fit.  If the server 
is allowed to not relax this restriction, then clearly the client has to 
be prepared to deal with a response from a server that doesn't.  I agree 
there should be clarification there, but as written I don't see how it 
can be interpreted as "the client MAY choose not to send an RFC2822 
message and the server MUST be prepared to accept a message that is not 
in RFC2822 format".  Surely if such a strong requirement of the server 
was intended it would have been said, rather than SHOULD which is 
intended to give the implementation room to relax requirements if necessary.

>   8-bit
>   characters are permitted in the message.  A server implementation
>   that is unable to preserve 8-bit data properly MUST be able to
>   reversibly convert 8-bit APPEND data to 7-bit using a [MIME-IMB]
>   content transfer encoding.
>
>    Note: There MAY be exceptions, e.g., draft messages, in
>    which required [RFC-2822] header lines are omitted in
>    the message literal argument to APPEND.  The full
>    implications of doing so MUST be understood and
>    carefully weighed.
>
> I had not previously noted the implications of that last paragraph. It
> implies that even the headers are not necessarily required to be
> RFC2822 compliant. But it's even less clear about whether the client or
> the server makes the choice. But again, I'm not personally concerned
> about headers.
>
> The formal syntax is no help in this case, because it includes no
> restriction beyond "literal".

The addition of this example seems to me to strengthen the "SHOULD be an 
RFC2822 message" statement, since it clearly suggests that even a 
reasonable exception (involving only missing header lines but 
maintaining the basic format) requires careful consideration of the 
implications.

> > Regarding bare newlines, RFC3501 2.2 states:
> >
> >   All interactions transmitted by client and server are in the form of
> >   lines, that is, strings that end with a CRLF.  The protocol receiver
> >   of an IMAP4rev1 client or server is either reading a line, or is
> >   reading a sequence of octets with a known count followed by a line.
>
> This clearly does not mean that a literal cannot contain CR or LF -- in
> fact, in general the APPEND literal will always contain CRLFs which do
> not end the literal. The literal is prefix-coded with the octet count,
> and so the rule about lines ending in CRLF does not apply until the
> octet count is exhausted. The part of 2.2 that applies to this
> situation is the last line, "reading a sequence of octets with a known
> count".

Yes, sorry -- from RFC2822, 2.1:

  Messages are divided into lines of characters.  A line is a series of
  characters that is delimited with the two characters carriage-return
  and line-feed; that is, the carriage return (CR) character (ASCII
  value 13) followed immediately by the line feed (LF) character (ASCII
  value 10).  (The carriage-return/line-feed pair is usually written in
  this document as "CRLF".)
I guess there is still the debate of whether a client should expect the 
server to store something that isn't an RFC2822 message, and on that I 
guess we will have to disagree until the next IMAP RFC is written and 
clarifies it.
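
The RFC2822 definition of a line quoted above can be checked mechanically. As a rough illustration, here is a minimal Python sketch that flags a CR not followed by LF, or an LF not preceded by CR. The names `has_bare_newline` and `normalize_newlines` are made up for this example; whether a client ought to rewrite a message this way before an APPEND is exactly the point in dispute, so treat the second function as one possible policy and not a recommendation.

```python
import re

# A CR not followed by LF, or an LF not preceded by CR.
BARE_CR_OR_LF = re.compile(rb"\r(?!\n)|(?<!\r)\n")


def has_bare_newline(raw_message: bytes) -> bool:
    """Return True if the message contains a bare CR or a bare LF."""
    return BARE_CR_OR_LF.search(raw_message) is not None


def normalize_newlines(raw_message: bytes) -> bytes:
    """Replace every bare CR or bare LF with a full CRLF pair.

    Hypothetical clean-up step: it changes the stored message, so it is
    not an intact copy of the original.
    """
    return BARE_CR_OR_LF.sub(b"\r\n", raw_message)
```

A client could call `has_bare_newline` before an APPEND and then decide whether to send the message as is, normalize it, or skip it.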

--
John A. Tamplin   Unix System Administrator
Emory University, School of Public Health +1 404/727-9931



Re: APPEND vs RFC2822 vs STD0011

2003-07-09 Thread Edward Reid
At 12:28 PM -0400 7/9/03, Cyrus Daboo wrote:
> On the NULL issue, IMAP does not allow bare NULLs in any data that
> either the server or client sends. If you check the formal syntax you
> will see that the 'literal' element used to send the message content
> in an APPEND explicitly excludes NULL as a valid character.

Ah-ha. Excellent. Thanks, I can definitely run with that.

> Bare CR or LF is another issue...

That's the one that's left.

BTW, in response to email off the list, I should make it clear that I'm
only concerned with the body, not headers.

At 01:35 PM -0400 7/9/03, John Alton Tamplin wrote:
> Clearly a client is required to properly handle a NO or BAD response
> to an APPEND command.

I agree. I have other issues with what Eudora considers "proper
handling". They are saying that a dialog box and ceasing operation is
proper handling, I say it's not. Them: "Where does it say that Eudora
is an autoresponder?" Me: "Well, duh, in the User Manual." I did not
mention this part because it's not an issue for this list. It's
certainly going to be part of my next response to Qualcomm, but I
wanted to gather information on the other issues as well.

> The RFC specifies the behavior of
> both the client and the server, and saying the message SHOULD be in
> RFC2822 format means that the server can choose to relax that rule if
> it has a good reason just as well as it does for the client.

What RFC3501 says is

  The APPEND command appends the literal argument as a new message
  to the end of the specified destination mailbox.  This argument
  SHOULD be in the format of an [RFC-2822] message.

Since it's the client that constructs the literal, it appears to me
that the RFC is giving the client the choice. The language appears to
specify only the client/server combination. Obviously it would be
better if the language were clearer, but it's not. In other places, the
RFC explicitly refers to the server, for example, the lines immediately
after the above:

  8-bit
  characters are permitted in the message.  A server implementation
  that is unable to preserve 8-bit data properly MUST be able to
  reversibly convert 8-bit APPEND data to 7-bit using a [MIME-IMB]
  content transfer encoding.

   Note: There MAY be exceptions, e.g., draft messages, in
   which required [RFC-2822] header lines are omitted in
   the message literal argument to APPEND.  The full
   implications of doing so MUST be understood and
   carefully weighed.

I had not previously noted the implications of that last paragraph. It
implies that even the headers are not necessarily required to be
RFC2822 compliant. But it's even less clear about whether the client or
the server makes the choice. But again, I'm not personally concerned
about headers.

The formal syntax is no help in this case, because it includes no
restriction beyond "literal".

> RFC3501 specifically requires 2822 rather than 822, and even says that
> all references to 822 should be considered as 2822.  If a mail client
> claims to conform to RFC3501, then the mail messages it sends should
> conform to RFC2822 not RFC822.  If it wants to claim conformance only
> to an older IMAP RFC, that is fine.

We are not talking about a case where RFC3501 mentions RFC822. We're
talking about a case where RFC3501 mentions RFC2822 but does not make
it a requirement, and is vague on exactly what is allowed and whether
the client or the server decides what is allowed.

> The data is not allowed to contain nulls -- from RFC3501, 4.3.1:

Thanks. The formal syntax, as Cyrus Daboo mentioned, is clearer. But in
any case, I now have more than adequate documentation about nulls to
hit the Eudora people with.

> Regarding bare newlines, RFC3501 2.2 states:
>
>   All interactions transmitted by client and server are in the form of
>   lines, that is, strings that end with a CRLF.  The protocol receiver
>   of an IMAP4rev1 client or server is either reading a line, or is
>   reading a sequence of octets with a known count followed by a line.

This clearly does not mean that a literal cannot contain CR or LF -- in
fact, in general the APPEND literal will always contain CRLFs which do
not end the literal. The literal is prefix-coded with the octet count,
and so the rule about lines ending in CRLF does not apply until the
octet count is exhausted. The part of 2.2 that applies to this
situation is the last line, "reading a sequence of octets with a known
count".
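
To make the counted-octets point concrete, here is a minimal Python sketch of a receiver that reads one CRLF-terminated line and, if the line ends in a literal marker such as `{12}`, reads exactly that many octets without interpreting them. The helper name `read_command_with_literal` is invented for the example, and the sketch ignores continuation requests, multiple literals and error handling.

```python
import io
import re

# A literal marker such as b"{12}" at the very end of a CRLF-terminated line.
LITERAL_MARKER = re.compile(rb"\{(\d+)\}\r\n$")


def read_command_with_literal(stream):
    """Read one line; if it announces a literal, read exactly that many octets.

    Returns (line, literal); literal is None when no literal is announced.
    Hypothetical helper for illustration only.
    """
    line = stream.readline()  # reads up to and including the LF
    match = LITERAL_MARKER.search(line)
    if match is None:
        return line, None
    count = int(match.group(1))
    # Counted octets: CR and LF inside the literal are plain data here.
    literal = stream.read(count)
    return line, literal


# The 12-octet literal contains a CRLF, a bare LF and a bare CR.
wire = io.BytesIO(b"A003 APPEND saved {12}\r\n" + b"a\r\nb\nc\rd\r\nxy" + b"\r\n")
line, literal = read_command_with_literal(wire)
```

In this sketch the line rule only comes back into play after the counted octets have been consumed, which is why the framing alone does not rule out bare CR or LF inside the literal.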

Mind you, I'd prefer to find solid reasons to tell Eudora to clean up
its act. I have that now with respect to nulls. Bare newlines still
look ambiguous.

Edward


Re: APPEND vs RFC2822 vs STD0011

2003-07-09 Thread John Alton Tamplin
Edward Reid wrote:

> Obviously there's a problem with the RFC in this case, in that it makes
> a recommendation to the client but no recommendation or requirement for
> the server.
>
> But the RFC clearly says that the client is allowed to store a
> non-RFC2822 message, if it has a valid reason. Nowhere do I see that it
> says the client should deal with the server refusing to handle cases
> which the RFC says the client is allowed to do.

Clearly a client is required to properly handle a NO or BAD response to 
an APPEND command.  Since (see below) a client is not permitted to send 
unencoded NUL characters, that would seem to be a violation of the IMAP 
protocol and therefore elicit either a NO (as the server can tell the 
error was in the message text) or a BAD (for a protocol error) tagged 
response.

Aside from the specific case of NUL characters, the client should be 
expected to properly handle a NO response if for whatever reason the 
server is unable to store the data.  The RFC specifies the behavior of 
both the client and the server, and saying the message SHOULD be in 
RFC2822 format means that the server can choose to relax that rule if it 
has a good reason just as well as it does for the client.  Certainly the 
server is not required to accept non-RFC2822 messages, so the client 
should be prepared to handle a refusal if it chooses to relax that 
requirement.
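
As a rough sketch of what handling a refusal could look like, the Python below wraps an APPEND so that a NO or BAD is reported to the caller instead of stopping all processing. It assumes `conn` behaves like Python's `imaplib.IMAP4`, whose `append()` returns a `(typ, data)` pair and, as far as I understand it, raises `imaplib.IMAP4.error` on a BAD response. The function name `try_append` is made up for the example.

```python
import imaplib


def try_append(conn, mailbox, raw_message):
    """Attempt an APPEND and return ("OK" | "NO" | "BAD", detail).

    conn is assumed to behave like imaplib.IMAP4. Hypothetical helper:
    the caller decides what to do next (log, skip, retry elsewhere).
    """
    try:
        typ, data = conn.append(mailbox, None, None, raw_message)
    except imaplib.IMAP4.error as exc:
        # Protocol-level rejection (BAD) or other imaplib error.
        return ("BAD", str(exc))
    if typ != "OK":
        # The server understood the command but refused to store the message.
        return ("NO", data)
    return ("OK", data)
```

A filter loop could call `try_append` for each message, record any NO or BAD result, and carry on with the next message.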

> I tried to make it clear that I do not consider storing a non-RFC822
> message to be a valid reason, in the RFC2119 sense, to violate the
> "SHOULD". IMAP is designed for storing Internet email, and that
> requires at minimum RFC822. (RFC733, the RFC822 predecessor, is far too
> old to consider here. RFC822 is over twenty years old; RFC2822 is only
> two years old. We long ago reached that point where it's reasonable to
> assume that all Internet email is RFC822-compliant, but we just are not
> at the point where it's reasonable to assume that all Internet email is
> RFC2822-compliant.)

RFC3501 specifically requires 2822 rather than 822, and even says that 
all references to 822 should be considered as 2822.  If a mail client 
claims to conform to RFC3501, then the mail messages it sends should 
conform to RFC2822 not RFC822.  If it wants to claim conformance only to 
an older IMAP RFC, that is fine.

> Using null-terminated strings with data that might contain nulls is
> problematic.

The data is not allowed to contain nulls -- from RFC3501, 4.3.1:

  Although a BINARY body encoding is defined, unencoded binary strings
  are not permitted.  A "binary string" is any string with NUL
  characters.  Implementations MUST encode binary data into a textual
  form, such as BASE64, before transmitting the data.  A string with an
  excessive amount of CTL characters MAY also be considered to be
  binary.
Note the use of MUST.  If Eudora or another mail client wants to send 
data containing NULs, it must encode it into another form before doing so.

> You are basically saying that even if (emphasize "if") the code is
> clearly wrong, that it won't be changed because it's too difficult to
> write correct code? I don't follow this argument at all. This problem
> with null-terminated strings has been widely known since long before
> cyrus.

Yes, and since the IMAP spec specifically forbids the presence of 
unencoded binary data (defined as strings containing the NUL character), 
it is perfectly reasonable to assume they don't exist.  Cyrus validates 
that the message does not violate the standard by including unencoded 
NULs, and rejects the message if it does.
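
A check of this kind can be very small. The Python below is only an illustration of the idea and is not taken from Cyrus. The function name and the reason text are invented for the example.

```python
def validate_for_append(raw_message: bytes):
    """Return (ok, reason) for a message offered in an APPEND literal.

    Only the NUL rule from RFC3501 4.3.1 is checked here; a real server
    may apply further checks. Hypothetical helper for illustration.
    """
    if b"\x00" in raw_message:
        return False, "message contains NUL characters"
    return True, ""
```

A server built along these lines would map a `False` result to a tagged NO or BAD response, as discussed above.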

Regarding bare newlines, RFC3501 2.2 states:

  All interactions transmitted by client and server are in the form of
  lines, that is, strings that end with a CRLF.  The protocol receiver
  of an IMAP4rev1 client or server is either reading a line, or is
  reading a sequence of octets with a known count followed by a line.
--

John A. Tamplin   Unix System Administrator
Emory University, School of Public Health +1 404/727-9931



Re: APPEND vs RFC2822 vs STD0011

2003-07-09 Thread Cyrus Daboo
Hi Edward,

--On Wednesday, July 9, 2003 11:58 AM -0400 Edward Reid <[EMAIL PROTECTED]> 
wrote:

|> Allowing null characters in particular is problematic for any code
|> that uses null-terminated strings for messages or parts of messages,
|> and
|
| Using null-terminated strings with data that might contain nulls is
| problematic.
On the NULL issue, IMAP does not allow bare NULLs in any data that either 
the server or client sends. If you check the formal syntax you will see 
that the 'literal' element used to send the message content in an APPEND 
explicitly excludes NULL as a valid character. So if Eudora is sending bare 
NULLs that is a protocol bug you can bounce back to them and justify having 
them fix.

NB There is an IMAP BINARY extension in the works that does allow bare 
NULLs, but only when used with the specific extension syntax.

Bare CR or LF is another issue...

--
Cyrus Daboo


Re: APPEND vs RFC2822 vs STD0011

2003-07-09 Thread Edward Reid
At 01:30 PM -0400 7/7/03, John Alton Tamplin wrote:
> That's what sieve is for -- do it in the server and you won't have to
> rely on a particular client doing it for you.

OTOH, if I do it in my client, then I don't have to rely on all the
servers I have to deal with all running sieve. They don't, and so I
can't count on it. I only mentioned the one server, but I'm not ready
to count on it being the only server I ever deal with. I have a number
of issues relevant to the way I set up my email, that are not relevant
to the question at hand.

> I disagree - it sounds like it would be defensible if Cyrus supported
> storing such messages even though it is clearly recommended against by
> the standard.  If Eudora insists on storing such messages, it should
> be prepared to deal with a server that is unwilling to do so.

Obviously there's a problem with the RFC in this case, in that it makes
a recommendation to the client but no recommendation or requirement for
the server.

But the RFC clearly says that the client is allowed to store a
non-RFC2822 message, if it has a valid reason. Nowhere do I see that it
says the client should deal with the server refusing to handle cases
which the RFC says the client is allowed to do.

More importantly, I don't see that the Eudora people are going to think
they should work around an IMAP server lacking a feature which the RFC
says the client should be able to use, especially when they are
claiming that other IMAP servers don't have this restriction. If I'm
going to go back to Eudora and say they should change their code, I
need a stronger analysis than just "I disagree" -- particularly when
the RFC clearly says the client should be able to do this.

> If you allow the IMAP server to store arbitrary data, it makes the
> other functions much more difficult

I tried to make it clear that I do not consider storing a non-RFC822
message to be a valid reason, in the RFC2119 sense, to violate the
"SHOULD". IMAP is designed for storing Internet email, and that
requires at minimum RFC822. (RFC733, the RFC822 predecessor, is far too
old to consider here. RFC822 is over twenty years old; RFC2822 is only
two years old. We long ago reached that point where it's reasonable to
assume that all Internet email is RFC822-compliant, but we just are not
at the point where it's reasonable to assume that all Internet email is
RFC2822-compliant.)

> Allowing null characters in particular is problematic for any code
> that uses null-terminated strings for messages or parts of messages, and

Using null-terminated strings with data that might contain nulls is
problematic.

> would require changing the code everywhere to use and pass the length
> of all the strings instead.

You are basically saying that even if (emphasize "if") the code is
clearly wrong, that it won't be changed because it's too difficult to
write correct code? I don't follow this argument at all. This problem
with null-terminated strings has been widely known since long before
cyrus.

> As far as STD0011 not being obsoleted, there are plenty of RFCs etc.
> that are not obsoleted by something but are still not best current
> practice.

But it's very seldom that two years is considered an adequate
transition.

> Clearly if the RFC it has been based on has been obsoleted,
> the STD should be updated as well.

But rfc-editor.org says it hasn't been.

I'd very much like to see more discussion on this, and not just a
brush-off "that's not how we do it here".

Edward


Re: APPEND vs RFC2822 vs STD0011

2003-07-07 Thread John Alton Tamplin
Edward Reid wrote:

> The mail provider (MX) for my domain, fastmail.fm, runs cyrus. I use
> Eudora (for Mac, v5.2), mostly in POP mode, but I use some IMAP
> features too. In particular, some of my filters copy incoming (POP)
> messages to an IMAP mailbox at fastmail.fm. That's where the problems
> start.
>
> Some of these incoming messages contain NULs or bare CR or LF. Yes, the
> sender is broken as far as RFC2822 is concerned, but the messages get
> through anyway. And the messages are valid RFC822/STD0011 format.
>
> When Eudora tries to copy these (APPEND them) to the IMAP mailbox,
> cyrus (at fastmail.fm) returns an error. I could live with an
> occasional copy failure, but the worst part is that when Eudora gets
> the server error, it thinks it's a terrible problem and throws up a
> dialog box and ceases all processing. Since I (like many people) depend
> on Eudora cleaning up my mailbox and doing other things with incoming
> mail automatically when I'm not at my desk, this gets to be a serious
> problem.

That's what sieve is for -- do it in the server and you won't have to 
rely on a particular client doing it for you.

> So I started reading RFC3501 to find the reason. I assumed that I'd
> find a good reason that I could quote to Eudora support, telling them
> why Eudora has to clean up the message before storing it in an IMAP
> mailbox. But I didn't find that. What I found -- under the APPEND
> command (section 6.3.11) -- is
>
> : The APPEND command appends the literal argument as a new message
> : to the end of the specified destination mailbox. This argument SHOULD
> : be in the format of an [RFC-2822] message.
>
> Note well: that's "SHOULD", not "MUST". This is important. RFC2119
> gives the meaning of SHOULD:
>
> : This word [...] mean[s] that there may exist valid reasons in
> : particular circumstances to ignore a particular item [...]
>
> So based on my reading of the RFC, it's the client's choice: it should
> normally append RFC2822 messages, but if it has a valid reason, it's
> allowed to append something that's not RFC2822. Now, IMAP mailboxes are
> intended for email -- "Internet message format" or "Internet text
> messages" in the RFC language -- and so it would be hard to make a case
> for storing anything that's not such a message. But RFC822 messages are
> still rampant on the Internet. In fact, as I understand it, although
> RFC2822 has obsoleted RFC822, STD0011 (which is identical to RFC822) is
> still a standard and has not yet been superseded.
>
> And it certainly seems to me that making a copy of an existing message
> is a "valid reason" for copying it intact, without the modifications
> needed to force it to conform to the stricter format of RFC2822. Since
> RFC3501 leaves this decision up to the client, it follows that cyrus is
> broken when it refuses the message.  If RFC3501 said "MUST", then I'd
> say it's Eudora's responsibility to fix the message before attempting
> an APPEND. But the RFC says "SHOULD".
>
> Is there any good argument for cyrus' action? If there is, I'd be happy
> to take it to Eudora and push them to fix Eudora. Eudora's not exactly
> known for its stellar IMAP support, and I'd like them to improve this.
> I've shoved the RFCs in their face plenty of times in the past. But in
> this case, my reading of the RFCs is that Eudora's APPEND action is
> defensible and cyrus' action is incorrect.

I disagree - it sounds like it would be defensible if Cyrus supported 
storing such messages even though it is clearly recommended against by 
the standard.  If Eudora insists on storing such messages, it should be 
prepared to deal with a server that is unwilling to do so.

If you allow the IMAP server to store arbitrary data, it makes the other 
functions much more difficult -- if it can't assume there is a 
message-id that is globally unique, it has to create its own unique key 
for a message.  Searching based on header fields is more problematic 
since you can't assume there is even a header or that if there is one 
that its value corresponds to anything you might expect (i.e., character 
set, line length, and other issues).
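
To illustrate the message-id point, here is a minimal Python sketch of a store that uses the Message-ID header when one is present and otherwise falls back to a key it generates itself. This is a hypothetical scheme for illustration, not a description of how Cyrus or any other server keys its messages. The name `storage_key` and the "mid:" and "sha256:" prefixes are invented.

```python
import hashlib
from email import message_from_bytes


def storage_key(raw_message: bytes) -> str:
    """Return a key for the message: its Message-ID if present, else a hash.

    Hypothetical helper. A content hash is only one possible fallback and
    gives identical keys to byte-identical messages.
    """
    msg = message_from_bytes(raw_message)
    message_id = msg.get("Message-ID")
    if message_id is not None and str(message_id).strip():
        return "mid:" + str(message_id).strip()
    # No usable header: the store has to invent its own key.
    return "sha256:" + hashlib.sha256(raw_message).hexdigest()
```

Data with no header at all still gets a key under this scheme, but header-based searches have nothing to work with, which is the second difficulty mentioned above.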

Allowing null characters in particular is problematic for any code that 
uses null-terminated strings for messages or parts of messages, and 
would require changing the code everywhere to use and pass the length of 
all the strings instead.  I don't know if there is a technical reason 
behind not supporting bare newlines or not.
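
The null-terminated string concern can be shown with a small Python sketch. It only simulates what C-style string handling would see in a buffer; `as_c_string` is a hypothetical helper and not code from any IMAP server.

```python
def as_c_string(buf: bytes) -> bytes:
    """Return the part of buf that NUL-terminated string code would see.

    Everything from the first NUL onward is silently dropped, which is
    the failure mode under discussion.
    """
    return buf.split(b"\x00", 1)[0]


body = b"first part\x00second part"
seen = as_c_string(body)  # only the octets before the NUL survive
```

Code that carries an explicit length alongside the buffer (here, simply `len(body)`) would not lose the second half, at the cost of passing that length everywhere.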

As far as STD0011 not being obsoleted, there are plenty of RFCs etc. 
that are not obsoleted by something but are still not best current 
practice.  Clearly if the RFC it has been based on has been obsoleted, 
the STD should be updated as well.

--
John A. Tamplin   Unix System Administrator
Emory University, School of Public Health +1 404/727-9931