Re: APPEND vs RFC2822 vs STD0011

2003-07-09 Thread Cyrus Daboo
Hi Edward,

--On Wednesday, July 9, 2003 11:58 AM -0400 Edward Reid [EMAIL PROTECTED] 
wrote:

| Allowing null characters in particular is problematic for any code that
| uses null-terminated strings for messages or parts of messages, and
|
| Using null-terminated strings with data that might contain nulls is
| problematic.

On the NULL issue, IMAP does not allow bare NULLs in any data that either 
the server or client sends. If you check the formal syntax you will see 
that the 'literal' element used to send the message content in an APPEND 
explicitly excludes NULL as a valid character. So if Eudora is sending bare 
NULLs that is a protocol bug you can bounce back to them and justify having 
them fix.
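
As a rough illustration of the syntax rule described above (in RFC3501 the literal body is made of CHAR8 octets, that is, any octet except NUL), a client could check a message before attempting the APPEND. This is only a sketch; the function names are hypothetical and do not come from Eudora, cyrus, or any other implementation.

```python
def find_nul_offsets(message: bytes) -> list[int]:
    """Return the offset of every NUL octet (0x00) in the message."""
    return [i for i, octet in enumerate(message) if octet == 0]


def is_valid_literal_body(message: bytes) -> bool:
    """True if the message contains only CHAR8 octets (0x01-0xff)."""
    return b"\x00" not in message


print(is_valid_literal_body(b"Subject: hi\r\n\r\nbody\r\n"))
print(find_nul_offsets(b"bad\x00body\x00"))
```

A client that finds any offsets could then decide whether to strip, encode, or refuse the message rather than sending octets the server is entitled to reject.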

NB There is an IMAP BINARY extension in the works that does allow bare 
NULLs, but only when used with the specific extension syntax.

Bare CR or LF is another issue...

--
Cyrus Daboo


Re: APPEND vs RFC2822 vs STD0011

2003-07-09 Thread Edward Reid
At 12:28 PM -0400 7/9/03, Cyrus Daboo wrote:
 On the NULL issue, IMAP does not allow bare NULLs in any data that either
 the server or client sends. If you check the formal syntax you will see
 that the 'literal' element used to send the message content in an APPEND
 explicitly excludes NULL as a valid character.

Ah-ha. Excellent. Thanks, I can definitely run with that.

 Bare CR or LF is another issue...

That's the one that's left.

BTW, in response to email off the list, I should make it clear that I'm
only concerned with the body, not headers.

At 01:35 PM -0400 7/9/03, John Alton Tamplin wrote:
 Clearly a client is required to properly handle a NO or BAD response to
 an APPEND command.

I agree. I have other issues with what Eudora considers proper
handling. They are saying that a dialog box and ceasing operation is
proper handling, I say it's not. Them: Where does it say that Eudora
is an autoresponder? Me: Well, duh, in the User Manual. I did not
mention this part because it's not an issue for this list. It's
certainly going to be part of my next response to Qualcomm, but I
wanted to gather information on the other issues as well.

 The RFC specifies the behavior of
 both the client and the server, and saying the message SHOULD be in
 RFC2822 format means that the server can choose to relax that rule if it
 has a good reason just as well as it does for the client.

What RFC3501 says is

  The APPEND command appends the literal argument as a new message
  to the end of the specified destination mailbox.  This argument
  SHOULD be in the format of an [RFC-2822] message.

Since it's the client that constructs the literal, it appears to me
that the RFC is giving the client the choice. The language appears to
specify only the client/server combination. Obviously it would be
better if the language were clearer, but it's not. In other places, the
RFC explicitly refers to the server, for example, the lines immediately
after the above:

  8-bit
  characters are permitted in the message.  A server implementation
  that is unable to preserve 8-bit data properly MUST be able to
  reversibly convert 8-bit APPEND data to 7-bit using a [MIME-IMB]
  content transfer encoding.

   Note: There MAY be exceptions, e.g., draft messages, in
   which required [RFC-2822] header lines are omitted in
   the message literal argument to APPEND.  The full
   implications of doing so MUST be understood and
   carefully weighed.
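
The "reversibly convert 8-bit APPEND data to 7-bit" requirement in the RFC text above can be illustrated with base64, one of the MIME content transfer encodings. This is a minimal sketch of the reversibility property only; a real server would apply the encoding per MIME part and record it in a Content-Transfer-Encoding header, which is not shown. The function names are hypothetical.

```python
import base64


def to_7bit(data: bytes) -> bytes:
    """Encode arbitrary octets as base64, which uses only 7-bit ASCII."""
    return base64.b64encode(data)


def from_7bit(encoded: bytes) -> bytes:
    """Recover the original octets exactly."""
    return base64.b64decode(encoded)


original = "café\r\n".encode("latin-1")   # contains the 8-bit octet 0xE9
encoded = to_7bit(original)
assert all(octet < 128 for octet in encoded)   # nothing above 7 bits
assert from_7bit(encoded) == original          # conversion is reversible
```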

I had not previously noted the implications of that last paragraph. It
implies that even the headers are not necessarily required to be
RFC2822 compliant. But it's even less clear about whether the client or
the server makes the choice. But again, I'm not personally concerned
about headers.

The formal syntax is no help in this case, because it includes no
restriction beyond literal.

 RFC3501 specifically requires 2822 rather than 822, and even says that
 all references to 822 should be considered as 2822.  If a mail client
 claims to conform to RFC3501, then the mail messages it sends should
 conform to RFC2822 not RFC822.  If it wants to claim conformance only to
 an older IMAP RFC, that is fine.

We are not talking about a case where RFC3501 mentions RFC822. We're
talking about a case where RFC3501 mentions RFC2822 but does not make
it a requirement, and is vague on exactly what is allowed and whether
the client or the server decides what is allowed.

 The data is not allowed to contain nulls -- from RFC3501, 4.3.1:

Thanks. The formal syntax, as Cyrus Daboo mentioned, is clearer. But in
any case, I now have more than adequate documentation about nulls to
hit the Eudora people with.

 Regarding bare newlines, RFC3501 2.2 states:

  All interactions transmitted by client and server are in the form of
  lines, that is, strings that end with a CRLF.  The protocol receiver
  of an IMAP4rev1 client or server is either reading a line, or is
  reading a sequence of octets with a known count followed by a line.

This clearly does not mean that a literal cannot contain CR or LF -- in
fact, in general the APPEND literal will always contain CRLFs which do
not end the literal. The literal is prefix-coded with the octet count,
and so the rule about lines ending in CRLF does not apply until the
octet count is exhausted. The part of 2.2 that applies to this
situation is the last line, reading a sequence of octets with a known
count.
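
The point about the octet count can be sketched in code. The example below frames an APPEND command with a literal and reads it back, showing that CR and LF inside the counted octets do not end anything. It is a simplified illustration: it ignores the "+" continuation request that a synchronizing literal involves, and the helper names are hypothetical.

```python
import io


def build_append(tag: str, mailbox: str, message: bytes) -> bytes:
    """Frame an APPEND command whose message is sent as a counted literal."""
    header = f"{tag} APPEND {mailbox} {{{len(message)}}}\r\n".encode("ascii")
    return header + message + b"\r\n"


def read_append(stream: io.BufferedIOBase) -> bytes:
    """Read one APPEND command and return the literal's octets."""
    line = stream.readline()                      # command line, ends with CRLF
    count = int(line[line.rindex(b"{") + 1 : line.rindex(b"}")])
    literal = stream.read(count)                  # counted octets, not lines
    stream.readline()                             # the CRLF that ends the command
    return literal


message = b"line one\r\nbare\nLF and bare\rCR\r\n"
wire = build_append("A1", "INBOX", message)
assert read_append(io.BytesIO(wire)) == message
```

The reader never inspects the literal's content, so at the framing level a bare CR or LF is transported like any other octet. Whether the server then accepts the message is a separate question.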

Mind you, I'd prefer to find solid reasons to tell Eudora to clean up
its act. I have that now with respect to nulls. Bare newlines still
look ambiguous.

Edward


Re: APPEND vs RFC2822 vs STD0011

2003-07-09 Thread John Alton Tamplin
Edward Reid wrote:

What RFC3501 says is

 The APPEND command appends the literal argument as a new message
 to the end of the specified destination mailbox.  This argument
 SHOULD be in the format of an [RFC-2822] message.

Since it's the client that constructs the literal, it appears to me
that the RFC is giving the client the choice. The language appears to
specify only the client/server combination. Obviously it would be
better if the language were clearer, but it's not. In other places, the
RFC explicitly refers to the server, for example, the lines immediately
after the above:
 

My interpretation is still that it is describing the protocol and that 
SHOULD in this case allows (but certainly does not require) the client 
or server to relax this restriction where they see fit.  If the server 
is allowed to not relax this restriction, then clearly the client has to 
be prepared to deal with a response from a server that doesn't.  I agree 
there should be clarification there, but as written I don't see how it 
can be interpreted as the client MAY choose not to send an RFC2822 
message and the server MUST be prepared to accept a message that is not 
in RFC2822 format.  Surely if such a strong requirement of the server 
was intended it would have been said, rather than SHOULD which is 
intended to give the implementation room to relax requirements if necessary.

 8-bit
 characters are permitted in the message.  A server implementation
 that is unable to preserve 8-bit data properly MUST be able to
 reversibly convert 8-bit APPEND data to 7-bit using a [MIME-IMB]
 content transfer encoding.

  Note: There MAY be exceptions, e.g., draft messages, in
  which required [RFC-2822] header lines are omitted in
  the message literal argument to APPEND.  The full
  implications of doing so MUST be understood and
  carefully weighed.

I had not previously noted the implications of that last paragraph. It
implies that even the headers are not necessarily required to be
RFC2822 compliant. But it's even less clear about whether the client or
the server makes the choice. But again, I'm not personally concerned
about headers.

The formal syntax is no help in this case, because it includes no
restriction beyond literal.

The addition of this example seems to me to strengthen the SHOULD be an
RFC2822 message statement, since it clearly suggests that even a 
reasonable exception (involving only missing header lines but 
maintaining the basic format) requires careful consideration of the 
implications.

Regarding bare newlines, RFC3501 2.2 states:

  All interactions transmitted by client and server are in the form of
  lines, that is, strings that end with a CRLF.  The protocol receiver
  of an IMAP4rev1 client or server is either reading a line, or is
  reading a sequence of octets with a known count followed by a line.
   

This clearly does not mean that a literal cannot contain CR or LF -- in
fact, in general the APPEND literal will always contain CRLFs which do
not end the literal. The literal is prefix-coded with the octet count,
and so the rule about lines ending in CRLF does not apply until the
octet count is exhausted. The part of 2.2 that applies to this
situation is the last line, reading a sequence of octets with a known
count.
 

Yes, sorry -- from RFC2822, 2.1:

  Messages are divided into lines of characters.  A line is a series of
  characters that is delimited with the two characters carriage-return
  and line-feed; that is, the carriage return (CR) character (ASCII
  value 13) followed immediately by the line feed (LF) character (ASCII
  value 10).  (The carriage-return/line-feed pair is usually written in
  this document as CRLF.)

I guess there is still the debate of whether a client should expect the
server to store something that isn't an RFC2822 message, and on that I 
guess we will have to disagree until the next IMAP RFC is written and 
clarifies it.
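
The RFC2822 line definition quoted above can be checked mechanically. The sketch below finds any CR not followed by LF and any LF not preceded by CR, and optionally rewrites them as CRLF. The names are hypothetical, and the rewrite is an assumption about what "cleaning up" a message might mean; it changes the stored octets, which is exactly the trade-off under discussion in this thread.

```python
import re

# A CR not followed by LF, or an LF not preceded by CR.
BARE_NEWLINE = re.compile(rb"\r(?!\n)|(?<!\r)\n")


def find_bare_newlines(message: bytes) -> list[int]:
    """Return the offset of every bare CR or bare LF."""
    return [m.start() for m in BARE_NEWLINE.finditer(message)]


def normalize_newlines(message: bytes) -> bytes:
    """Rewrite every bare CR or bare LF as CRLF; existing CRLFs are kept."""
    return BARE_NEWLINE.sub(b"\r\n", message)


sample = b"a\r\nb\nc\rd\r\n"
print(find_bare_newlines(sample))    # [4, 6]
print(normalize_newlines(sample))    # b'a\r\nb\r\nc\r\nd\r\n'
```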

--
John A. Tamplin   Unix System Administrator
Emory University, School of Public Health +1 404/727-9931



APPEND vs RFC2822 vs STD0011

2003-07-07 Thread Edward Reid
The mail provider (MX) for my domain, fastmail.fm, runs cyrus. I use
Eudora (for Mac, v5.2), mostly in POP mode, but I use some IMAP
features too. In particular, some of my filters copy incoming (POP)
messages to an IMAP mailbox at fastmail.fm. That's where the problems
start.

Some of these incoming messages contain NULs or bare CR or LF. Yes, the
sender is broken as far as RFC2822 is concerned, but the messages get
through anyway. And the messages are valid RFC822/STD0011 format.

When Eudora tries to copy these (APPEND them) to the IMAP mailbox,
cyrus (at fastmail.fm) returns an error. I could live with an
occasional copy failure, but the worst part is that when Eudora gets
the server error, it thinks it's a terrible problem and throws up a
dialog box and ceases all processing. Since I (like many people) depend
on Eudora cleaning up my mailbox and doing other things with incoming
mail automatically when I'm not at my desk, this gets to be a serious
problem.

So I started reading RFC3501 to find the reason. I assumed that I'd
find a good reason that I could quote to Eudora support, telling them
why Eudora has to clean up the message before storing it in an IMAP
mailbox. But I didn't find that. What I found -- under the APPEND
command (section 6.3.11) -- is

: The APPEND command appends the literal argument as a new message
: to the end of the specified destination mailbox. This argument SHOULD
: be in the format of an [RFC-2822] message.

Note well: that's SHOULD, not MUST. This is important. RFC2119
gives the meaning of SHOULD:

: This word [...] mean[s] that there may exist valid reasons in
: particular circumstances to ignore a particular item [...]

So based on my reading of the RFC, it's the client's choice: it should
normally append RFC2822 messages, but if it has a valid reason, it's
allowed to append something that's not RFC2822. Now, IMAP mailboxes are
intended for email -- Internet message format or Internet text
messages in the RFC language -- and so it would be hard to make a case
for storing anything that's not such a message. But RFC822 messages are
still rampant on the Internet. In fact, as I understand it, although
RFC2822 has obsoleted RFC822, STD0011 (which is identical to RFC822) is
still a standard and has not yet been superseded.

And it certainly seems to me that making a copy of an existing message
is a valid reason for copying it intact, without the modifications
needed to force it to conform to the stricter format of RFC2822. Since
RFC3501 leaves this decision up to the client, it follows that cyrus is
broken when it refuses the message.  If RFC3501 said MUST, then I'd
say it's Eudora's responsibility to fix the message before attempting
an APPEND. But the RFC says SHOULD.

Is there any good argument for cyrus' action? If there is, I'd be happy
to take it to Eudora and push them to fix Eudora. Eudora's not exactly
known for its stellar IMAP support, and I'd like them to improve this.
I've shoved the RFCs in their face plenty of times in the past. But in
this case, my reading of the RFCs is that Eudora's APPEND action is
defensible and cyrus' action is incorrect.

I'm very interested in hearing the cases for both sides.

Edward


Re: APPEND vs RFC2822 vs STD0011

2003-07-07 Thread John Alton Tamplin
Edward Reid wrote:

The mail provider (MX) for my domain, fastmail.fm, runs cyrus. I use
Eudora (for Mac, v5.2), mostly in POP mode, but I use some IMAP
features too. In particular, some of my filters copy incoming (POP)
messages to an IMAP mailbox at fastmail.fm. That's where the problems
start.

Some of these incoming messages contain NULs or bare CR or LF. Yes, the
sender is broken as far as RFC2822 is concerned, but the messages get
through anyway. And the messages are valid RFC822/STD0011 format.

When Eudora tries to copy these (APPEND them) to the IMAP mailbox,
cyrus (at fastmail.fm) returns an error. I could live with an
occasional copy failure, but the worst part is that when Eudora gets
the server error, it thinks it's a terrible problem and throws up a
dialog box and ceases all processing. Since I (like many people) depend
on Eudora cleaning up my mailbox and doing other things with incoming
mail automatically when I'm not at my desk, this gets to be a serious
problem.
 

That's what sieve is for -- do it in the server and you won't have to 
rely on a particular client doing it for you.

So I started reading RFC3501 to find the reason. I assumed that I'd
find a good reason that I could quote to Eudora support, telling them
why Eudora has to clean up the message before storing it in an IMAP
mailbox. But I didn't find that. What I found -- under the APPEND
command (section 6.3.11) -- is

: The APPEND command appends the literal argument as a new message
: to the end of the specified destination mailbox. This argument SHOULD
: be in the format of an [RFC-2822] message.

Note well: that's SHOULD, not MUST. This is important. RFC2119
gives the meaning of SHOULD:

: This word [...] mean[s] that there may exist valid reasons in
: particular circumstances to ignore a particular item [...]

So based on my reading of the RFC, it's the client's choice: it should
normally append RFC2822 messages, but if it has a valid reason, it's
allowed to append something that's not RFC2822. Now, IMAP mailboxes are
intended for email -- Internet message format or Internet text
messages in the RFC language -- and so it would be hard to make a case
for storing anything that's not such a message. But RFC822 messages are
still rampant on the Internet. In fact, as I understand it, although
RFC2822 has obsoleted RFC822, STD0011 (which is identical to RFC822) is
still a standard and has not yet been superseded.

And it certainly seems to me that making a copy of an existing message
is a valid reason for copying it intact, without the modifications
needed to force it to conform to the stricter format of RFC2822. Since
RFC3501 leaves this decision up to the client, it follows that cyrus is
broken when it refuses the message.  If RFC3501 said MUST, then I'd
say it's Eudora's responsibility to fix the message before attempting
an APPEND. But the RFC says SHOULD.

Is there any good argument for cyrus' action? If there is, I'd be happy
to take it to Eudora and push them to fix Eudora. Eudora's not exactly
known for its stellar IMAP support, and I'd like them to improve this.
I've shoved the RFCs in their face plenty of times in the past. But in
this case, my reading of the RFCs is that Eudora's APPEND action is
defensible and cyrus' action is incorrect.
 

I disagree - it sounds like it would be defensible if Cyrus supported 
storing such messages even though it is clearly recommended against by 
the standard.  If Eudora insists on storing such messages, it should be 
prepared to deal with a server that is unwilling to do so.
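
What "prepared to deal with" a refusal might look like on the client side is sketched below: the failure is reported and the caller carries on with the next message instead of halting. The sketch assumes a connection object that behaves like Python's imaplib.IMAP4, where append() returns a (status, data) pair for OK and NO and raises imaplib.IMAP4.error for BAD. The function name try_append is hypothetical.

```python
import imaplib


def try_append(conn, mailbox: str, message: bytes) -> bool:
    """Attempt an APPEND; report failure to the caller instead of stopping.

    `conn` is assumed to behave like imaplib.IMAP4.
    """
    try:
        status, data = conn.append(mailbox, None, None, message)
    except imaplib.IMAP4.error as exc:            # e.g. a BAD response
        print(f"APPEND to {mailbox} rejected: {exc}")
        return False
    if status != "OK":                            # e.g. a NO response
        print(f"APPEND to {mailbox} refused: {status} {data!r}")
        return False
    return True
```

A filter loop could call try_append for each message and simply leave the ones that fail in place, so one unacceptable message does not stop the rest of the processing.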

If you allow the IMAP server to store arbitrary data, it makes the other 
functions much more difficult -- if it can't assume there is a 
message-id that is globally unique, it has to create its own unique key 
for a message.  Searching based on header fields is more problematic
since you can't assume there is even a header or, if there is one, that
its value corresponds to anything you might expect (i.e., character
set, line length, and other issues).
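
To make the message-id point concrete, here is a hypothetical sketch of the fallback a store would need once it cannot count on a Message-ID header being present. It is not how cyrus identifies messages; the function name and key format are invented for illustration, and a content hash is only one possible fallback (identical copies would share a key).

```python
import hashlib
from email import message_from_bytes


def message_key(raw: bytes) -> str:
    """Use the Message-ID if there is one, else fall back to a content hash."""
    msg = message_from_bytes(raw)
    message_id = msg.get("Message-ID")
    if message_id is not None and str(message_id).strip():
        return "mid:" + str(message_id).strip()
    # No usable header: the store has to invent its own key.
    return "sha256:" + hashlib.sha256(raw).hexdigest()


print(message_key(b"Message-ID: <1@example.org>\r\n\r\nhello\r\n"))
print(message_key(b"arbitrary data with no header at all"))
```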

Allowing null characters in particular is problematic for any code that 
uses null-terminated strings for messages or parts of messages, and 
would require changing the code everywhere to use and pass the length of 
all the strings instead.  I don't know if there is a technical reason 
behind not supporting bare newlines or not.
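
The null-terminated string problem can be demonstrated from Python using ctypes, which exposes C string semantics. This is only an illustration of the general hazard, not a claim about any specific code path in cyrus; the helper name is hypothetical.

```python
import ctypes


def as_c_string(data: bytes) -> bytes:
    """Return what code using NUL-terminated strings would see of `data`."""
    return ctypes.create_string_buffer(data).value


data = b"first part\x00second part"
print(as_c_string(data))   # b'first part'
print(len(data))           # the stored message is longer than that
```

Everything after the first NUL is invisible to such code unless the length is tracked and passed alongside the buffer, which is the change described above.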

As far as STD0011 not being obsoleted, there are plenty of RFCs etc. 
that are not obsoleted by something but are still not best current 
practice.  Clearly if the RFC it has been based on has been obsoleted, 
the STD should be updated as well.

--
John A. Tamplin   Unix System Administrator
Emory University, School of Public Health +1 404/727-9931