Re: APPEND vs RFC2822 vs STD0011
Edward Reid wrote: What RFC3501 says is The APPEND command appends the literal argument as a new message to the end of the specified destination mailbox. This argument SHOULD be in the format of an [RFC-2822] message. Since it's the client that constructs the literal, it appears to me that the RFC is giving the client the choice. The language appears to specify only the client/server combination. Obviously it would be better if the language were clearer, but it's not. In other places, the RFC explicitly refers to the server, for example, the lines immediately after the above: My interpretation is still that it is describing the protocol and that SHOULD in this case allows (but certainly does not require) the client or server to relax this restriction where they see fit. If the server is allowed to not relax this restriction, then clearly the client has to be prepared to deal with a response from a server that doesn't. I agree there should be clarification there, but as written I don't see how it can be interpreted as "the client MAY choose not to send an RFC2822 message and the server MUST be prepared to accept a message that is not in RFC2822 format". Surely if such a strong requirement of the server was intended it would have been said, rather than SHOULD which is intended to give the implementation room to relax requirements if necessary. 8-bit characters are permitted in the message. A server implementation that is unable to preserve 8-bit data properly MUST be able to reversibly convert 8-bit APPEND data to 7-bit using a [MIME-IMB] content transfer encoding. Note: There MAY be exceptions, e.g., draft messages, in which required [RFC-2822] header lines are omitted in the message literal argument to APPEND. The full implications of doing so MUST be understood and carefully weighed. I had not previously noted the implications of that last paragraph. It implies that even the headers are not necessarily required to be RFC2822 compliant. 
But it's even less clear about whether the client or the server makes the choice. But again, I'm not personally concerned about headers. The formal syntax is no help in this case, because it includes no restriction beyond "literal". The addition of this example seems to me to strengthen the "SHOULD be an RFC2822 message" statement, since it clearly suggests that even a reasonable exception (involving only missing header lines but maintaining the basic format) requires careful consideration of the implications. Regarding bare newlines, RFC3501 2.2 states: All interactions transmitted by client and server are in the form of lines, that is, strings that end with a CRLF. The protocol receiver of an IMAP4rev1 client or server is either reading a line, or is reading a sequence of octets with a known count followed by a line. This clearly does not mean that a literal cannot contain CR or LF -- in fact, in general the APPEND literal will always contain CRLFs which do not end the literal. The literal is prefix-coded with the octet count, and so the rule about lines ending in CRLF does not apply until the octet count is exhausted. The part of 2.2 that applies to this situation is the last line, "reading a sequence of octets with a known count". Yes, sorry -- from RFC2822, 2.1: Messages are divided into lines of characters. A line is a series of characters that is delimited with the two characters carriage-return and line-feed; that is, the carriage return (CR) character (ASCII value 13) followed immediately by the line feed (LF) character (ASCII value 10). (The carriage-return/line-feed pair is usually written in this document as "CRLF".) I guess there is still the debate of whether a client should expect the server to store something that isn't an RFC2822 message, and on that I guess we will have to disagree until the next IMAP RFC is written and clarifies it. -- John A. Tamplin Unix System Administrator Emory University, School of Public Health +1 404/727-9931
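The RFC2822 2.1 definition quoted above (a line ends with CR immediately followed by LF) can be checked mechanically. The following is only a minimal Python sketch, assuming the message body is available as bytes; the function name `count_bare_newlines` is hypothetical and does not come from Cyrus, Eudora, or any RFC.

```python
def count_bare_newlines(data: bytes) -> dict:
    """Count CR and LF octets that are not part of a CRLF pair."""
    bare_cr = 0
    bare_lf = 0
    for i, byte in enumerate(data):
        if byte == 0x0D:
            # A CR is "bare" unless the very next octet is LF.
            if data[i + 1:i + 2] != b"\n":
                bare_cr += 1
        elif byte == 0x0A:
            # An LF is "bare" unless the previous octet is CR.
            if i == 0 or data[i - 1] != 0x0D:
                bare_lf += 1
    return {"bare_cr": bare_cr, "bare_lf": bare_lf}


if __name__ == "__main__":
    sample = b"Subject: test\r\n\r\nline one\nline two\rline three\r\n"
    print(count_bare_newlines(sample))
```

A check like this says nothing about who is obliged to run it, which is the open question in this thread; it only makes the "bare CR or LF" condition concrete.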
Re: APPEND vs RFC2822 vs STD0011
At 12:28 PM -0400 7/9/03, Cyrus Daboo wrote:
> On the NULL issue, IMAP does not allow bare NULLs in any data that either the server or client sends. If you check the formal syntax you will see that the 'literal' element used to send the message content in an APPEND explicitly excludes NULL as a valid character.

Ah-ha. Excellent. Thanks, I can definitely run with that.

> Bare CR or LF is another issue...

That's the one that's left. BTW, in response to email off the list, I should make it clear that I'm only concerned with the body, not headers.

At 01:35 PM -0400 7/9/03, John Alton Tamplin wrote:
> Clearly a client is required to properly handle a NO or BAD response to an APPEND command.

I agree. I have other issues with what Eudora considers "proper handling". They are saying that a dialog box and ceasing operation is proper handling, I say it's not. Them: "Where does it say that Eudora is an autoresponder?" Me: "Well, duh, in the User Manual." I did not mention this part because it's not an issue for this list. It's certainly going to be part of my next response to Qualcomm, but I wanted to gather information on the other issues as well.

> The RFC specifies the behavior of both the client and the server, and saying the message SHOULD be in RFC2822 format means that the server can choose to relax that rule if it has a good reason just as well as it does for the client.

What RFC3501 says is

The APPEND command appends the literal argument as a new message to the end of the specified destination mailbox. This argument SHOULD be in the format of an [RFC-2822] message.

Since it's the client that constructs the literal, it appears to me that the RFC is giving the client the choice. The language appears to specify only the client/server combination. Obviously it would be better if the language were clearer, but it's not.
In other places, the RFC explicitly refers to the server, for example, the lines immediately after the above: 8-bit characters are permitted in the message. A server implementation that is unable to preserve 8-bit data properly MUST be able to reversibly convert 8-bit APPEND data to 7-bit using a [MIME-IMB] content transfer encoding. Note: There MAY be exceptions, e.g., draft messages, in which required [RFC-2822] header lines are omitted in the message literal argument to APPEND. The full implications of doing so MUST be understood and carefully weighed. I had not previously noted the implications of that last paragraph. It implies that even the headers are not necessarily required to be RFC2822 compliant. But it's even less clear about whether the client or the server makes the choice. But again, I'm not personally concerned about headers. The formal syntax is no help in this case, because it includes no restriction beyond "literal". > RFC3501 specifically requires 2822 rather than 822, and even says that > all references to 822 should be considered as 2822. If a mail client > claims to conform to RFC3501, then the mail messages it sends should > conform to RFC2822 not RFC822. If it wants to claim conformance only >to > an older IMAP RFC, that is fine. We are not talking about a case where RFC3501 mentions RFC822. We're talking about a case where RFC3501 mentions RFC2822 but does not make it a requirement, and is vague on exactly what is allowed and whether the client or the server decides what is allowed. > The data is not allowed to contain nulls -- from RFC3501, 4.3.1: Thanks. The formal syntax, as Cyrus Daboo mentioned, is clearer. But in any case, I now have more than adequate documentation about nulls to hit the Eudora people with. > Regarding bare newlines, RFC3501 2.2 states: > >All interactions transmitted by client and server are in the form >of >lines, that is, strings that end with a CRLF. 
> The protocol receiver of an IMAP4rev1 client or server is either reading a line, or is reading a sequence of octets with a known count followed by a line.

This clearly does not mean that a literal cannot contain CR or LF -- in fact, in general the APPEND literal will always contain CRLFs which do not end the literal. The literal is prefix-coded with the octet count, and so the rule about lines ending in CRLF does not apply until the octet count is exhausted. The part of 2.2 that applies to this situation is the last line, "reading a sequence of octets with a known count".

Mind you, I'd prefer to find solid reasons to tell Eudora to clean up its act. I have that now with respect to nulls. Bare newlines still look ambiguous.

Edward
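To illustrate the point about the octet count, here is a rough Python sketch of a receiver that alternates between "reading a line" and "reading a sequence of octets with a known count". It is a simplification under stated assumptions: it reads from an in-memory stream, it ignores the server's "+" continuation request that a real IMAP exchange involves, and `read_command` is a hypothetical name, not an API of any IMAP library.

```python
import io
import re

# A line that ends with "{<count>}" CRLF announces a literal of <count> octets.
LITERAL_MARKER = re.compile(rb"\{(\d+)\}\r\n$")


def read_command(stream) -> list:
    """Return the pieces of one command: line fragments and counted literals."""
    parts = []
    while True:
        line = stream.readline()
        if not line.endswith(b"\r\n"):
            raise ValueError("line not terminated by CRLF")
        match = LITERAL_MARKER.search(line)
        if match is None:
            # Ordinary end of command: strip the terminating CRLF.
            parts.append(line[:-2])
            return parts
        parts.append(line[:match.start()])
        count = int(match.group(1))
        # Inside the literal, CR and LF are just octets; only the count matters.
        literal = stream.read(count)
        if len(literal) != count:
            raise ValueError("stream ended inside literal")
        parts.append(literal)


if __name__ == "__main__":
    message = b"Subject: hi\r\n\r\nline one\nline two\r\n"
    wire = b"A001 APPEND INBOX {%d}\r\n" % len(message) + message + b"\r\n"
    print(read_command(io.BytesIO(wire)))
```

In this sketch the bare LF inside the message does not disturb the framing at all, which is why section 2.2 by itself does not settle whether bare newlines are allowed inside the literal.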
Re: APPEND vs RFC2822 vs STD0011
Edward Reid wrote:
> Obviously there's a problem with the RFC in this case, in that it makes a recommendation to the client but no recommendation or requirement for the server. But the RFC clearly says that the client is allowed to store a non-RFC2822 message, if it has a valid reason. Nowhere do I see that it says the client should deal with the server refusing to handle cases which the RFC says the client is allowed to do.

Clearly a client is required to properly handle a NO or BAD response to an APPEND command. Since (see below) a client is not permitted to send unencoded NUL characters, that would seem to be a violation of the IMAP protocol and therefore elicit either a NO (as the server can tell the error was in the message text) or a BAD (for a protocol error) tagged response. Aside from the specific case of NUL characters, the client should be expected to properly handle a NO response if for whatever reason the server is unable to store the data. The RFC specifies the behavior of both the client and the server, and saying the message SHOULD be in RFC2822 format means that the server can choose to relax that rule if it has a good reason just as well as it does for the client. Certainly the server is not required to accept non-RFC2822 messages, so the client should be prepared to handle a refusal if it chooses to relax that requirement.

> I tried to make it clear that I do not consider storing a non-RFC822 message to be a valid reason, in the RFC2119 sense, to violate the "SHOULD". IMAP is designed for storing Internet email, and that requires at minimum RFC822. (RFC733, the RFC822 predecessor, is far too old to consider here. RFC822 is over twenty years old; RFC2822 is only two years old. We long ago reached that point where it's reasonable to assume that all Internet email is RFC822-compliant, but we just are not at the point where it's reasonable to assume that all Internet email is RFC2822-compliant.)
RFC3501 specifically requires 2822 rather than 822, and even says that all references to 822 should be considered as 2822. If a mail client claims to conform to RFC3501, then the mail messages it sends should conform to RFC2822 not RFC822. If it wants to claim conformance only to an older IMAP RFC, that is fine.

> Using null-terminated strings with data that might contain nulls is problematic.

The data is not allowed to contain nulls -- from RFC3501, 4.3.1:

Although a BINARY body encoding is defined, unencoded binary strings are not permitted. A "binary string" is any string with NUL characters. Implementations MUST encode binary data into a textual form, such as BASE64, before transmitting the data. A string with an excessive amount of CTL characters MAY also be considered to be binary.

Note the use of MUST. If Eudora or another mail client wants to send data containing NULs, it must encode it into another form before doing so.

> You are basically saying that even if (emphasize "if") the code is clearly wrong, that it won't be changed because it's too difficult to write correct code? I don't follow this argument at all. This problem with null-terminated strings has been widely known since long before cyrus.

Yes, and since the IMAP spec specifically forbids the presence of unencoded binary data (defined as strings containing the NUL character), it is perfectly reasonable to assume they don't exist. Cyrus validates that the message does not violate the standard by including unencoded NULs, and rejects the message if it does.

Regarding bare newlines, RFC3501 2.2 states:

All interactions transmitted by client and server are in the form of lines, that is, strings that end with a CRLF. The protocol receiver of an IMAP4rev1 client or server is either reading a line, or is reading a sequence of octets with a known count followed by a line.

-- John A. Tamplin Unix System Administrator Emory University, School of Public Health +1 404/727-9931
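The "encode binary data into a textual form, such as BASE64" requirement from 4.3.1 could look roughly like this on the client side. This is a hedged Python sketch using only the standard `base64` module; the helper name `to_base64_lines` and the 76-character line width are illustrative assumptions, and a real client would also have to set the matching Content-Transfer-Encoding header, which is not shown.

```python
import base64


def to_base64_lines(payload: bytes, width: int = 76) -> bytes:
    """Encode payload as base64 text, wrapped into CRLF-terminated lines."""
    encoded = base64.b64encode(payload)
    lines = [encoded[i:i + width] for i in range(0, len(encoded), width)]
    return b"\r\n".join(lines) + b"\r\n"


if __name__ == "__main__":
    body = b"binary\x00data with a bare\nnewline"
    text = to_base64_lines(body)
    print(text)
    # The encoded form is reversible and contains no NUL octets.
    print(base64.b64decode(text) == body)
```

The encoded text contains only base64 characters and CRLF, so it avoids both the NUL problem and the bare-newline question, at the cost of no longer being a byte-for-byte copy of the original message.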
Re: APPEND vs RFC2822 vs STD0011
Hi Edward,

--On Wednesday, July 9, 2003 11:58 AM -0400 Edward Reid <[EMAIL PROTECTED]> wrote:

|> Allowing null characters in particular is problematic for any code that
|> uses null-terminated strings for messages or parts of messages, and
|
| Using null-terminated strings with data that might contain nulls is
| problematic.

On the NULL issue, IMAP does not allow bare NULLs in any data that either the server or client sends. If you check the formal syntax you will see that the 'literal' element used to send the message content in an APPEND explicitly excludes NULL as a valid character. So if Eudora is sending bare NULLs that is a protocol bug you can bounce back to them and justify having them fix.

NB There is an IMAP BINARY extension in the works that does allow bare NULLs, but only when used with the specific extension syntax.

Bare CR or LF is another issue...

-- Cyrus Daboo
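For reference, the RFC3501 formal syntax defines the literal body as a run of CHAR8, and CHAR8 as any octet from 0x01 to 0xff, which is what excludes NUL. A minimal Python sketch of that single check follows; `nul_offsets` is a hypothetical helper name, and this is not how Cyrus or any other server is known to implement it.

```python
def nul_offsets(data: bytes) -> list:
    """Return the offsets of every NUL octet, i.e. every non-CHAR8 octet."""
    return [i for i, byte in enumerate(data) if byte == 0]


if __name__ == "__main__":
    message = b"Subject: test\r\n\r\nbody with a \x00 in it\r\n"
    offsets = nul_offsets(message)
    if offsets:
        print("not a valid literal, NUL at offsets", offsets)
    else:
        print("every octet is CHAR8")
```

Note that this check covers only NUL; bare CR and bare LF are both valid CHAR8 octets, so the formal syntax does not help with them.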
Re: APPEND vs RFC2822 vs STD0011
At 01:30 PM -0400 7/7/03, John Alton Tamplin wrote: > That's what sieve is for -- do it in the server and you won't have to > rely on a particular client doing it for you. OTOH, if I do it in my client, then I don't have to rely on all the servers I have to deal with all running sieve. They don't, and so I can't count on it. I only mentioned the one server, but I'm not ready to count on it being the only server I ever deal with. I have a number of issues relevant to the way I set up my email, that are not relevant to the question at hand. > I disagree - it sounds like it would be defensible if Cyrus supported > storing such messages even though it is clearly recommended against by > the standard. If Eudora insists on storing such messages, it should >be > prepared to deal with a server that is unwilling to do so. Obviously there's a problem with the RFC in this case, in that it makes a recommendation to the client but no recommendation or requirement for the server. But the RFC clearly says that the client is allowed to store a non-RFC2822 message, if it has a valid reason. Nowhere do I see that it says the client should deal with the server refusing to handle cases which the RFC says the client is allowed to do. More importantly, I don't see that the Eudora people are going to think they should work around an IMAP server lacking a feature which the RFC says the client should be able to use, especially when they are claiming that other IMAP servers don't have this restriction. If I'm going to go back to Eudora and say they should change their code, I need a stronger analysis than just "I disagree" -- particularly when the RFC clearly says the client should be able to do this. > If you allow the IMAP server to store arbitrary data, it makes the >other > functions much more difficult I tried to make it clear that I do not consider storing a non-RFC822 message to be a valid reason, in the RFC2119 sense, to violate the "SHOULD". 
IMAP is designed for storing Internet email, and that requires at minimum RFC822. (RFC733, the RFC822 predecessor, is far too old to consider here. RFC822 is over twenty years old; RFC2822 is only two years old. We long ago reached that point where it's reasonable to assume that all Internet email is RFC822-compliant, but we just are not at the point where it's reasonable to assume that all Internet email is RFC2822-compliant.)

> Allowing null characters in particular is problematic for any code that uses null-terminated strings for messages or parts of messages, and

Using null-terminated strings with data that might contain nulls is problematic.

> would require changing the code everywhere to use and pass the length of all the strings instead.

You are basically saying that even if (emphasize "if") the code is clearly wrong, that it won't be changed because it's too difficult to write correct code? I don't follow this argument at all. This problem with null-terminated strings has been widely known since long before cyrus.

> As far as STD0011 not being obsoleted, there are plenty of RFCs etc. that are not obsoleted by something but are still not best current practice.

But it's very seldom that two years is considered an adequate transition.

> Clearly if the RFC it has been based on has been obsoleted, the STD should be updated as well.

But rfc-editor.org says it hasn't been. I'd very much like to see more discussion on this, and not just a brush-off "that's not how we do it here".

Edward
Re: APPEND vs RFC2822 vs STD0011
Edward Reid wrote: The mail provider (MX) for my domain, fastmail.fm, runs cyrus. I use Eudora (for Mac, v5.2), mostly in POP mode, but I use some IMAP features too. In particular, some of my filters copy incoming (POP) messages to an IMAP mailbox at fastmail.fm. That's where the problems start. Some of these incoming messages contain NULs or bare CR or LF. Yes, the sender is broken as far as RFC2822 is concerned, but the messages get through anyway. And the messages are valid RFC822/STD0011 format. When Eudora tries to copy these (APPEND them) to the IMAP mailbox, cyrus (at fastmail.fm) returns an error. I could live with an occasional copy failure, but the worst part is that when Eudora gets the server error, it thinks it's a terrible problem and throws up a dialog box and ceases all processing. Since I (like many people) depend on Eudora cleaning up my mailbox and doing other things with incoming mail automatically when I'm not at my desk, this gets to be a serious problem. That's what sieve is for -- do it in the server and you won't have to rely on a particular client doing it for you. So I started reading RFC3501 to find the reason. I assumed that I'd find a good reason that I could quote to Eudora support, telling them why Eudora has to clean up the message before storing it in an IMAP mailbox. But I didn't find that. What I found -- under the APPEND command (section 6.3.11) -- is : The APPEND command appends the literal argument as a new message : to the end of the specified destination mailbox. This argument SHOULD : be in the format of an [RFC-2822] message. Note well: that's "SHOULD", not "MUST". This is important. RFC2119 gives the meaning of SHOULD: : This word [...] mean[s] that there may exist valid reasons in : particular circumstances to ignore a particular item [...] 
So based on my reading of the RFC, it's the client's choice: it should normally append RFC2822 messages, but if it has a valid reason, it's allowed to append something that's not RFC2822. Now, IMAP mailboxes are intended for email -- "Internet message format" or "Internet text messages" in the RFC language -- and so it would be hard to make a case for storing anything that's not such a message. But RFC822 messages are still rampant on the Internet. In fact, as I understand it, although RFC2822 has obsoleted RFC822, STD0011 (which is identical to RFC822) is still a standard and has not yet been superseded. And it certainly seems to me that making a copy of an existing message is a "valid reason" for copying it intact, without the modifications needed to force it to conform to the stricter format of RFC2822. Since RFC3501 leaves this decision up to the client, it follows that cyrus is broken when it refuses the message. If RFC3501 said "MUST", then I'd say it's Eudora's responsibility to fix the message before attempting an APPEND. But the RFC says "SHOULD". Is there any good argument for cyrus' action? If there is, I'd be happy to take it to Eudora and push them to fix Eudora. Eudora's not exactly known for its stellar IMAP support, and I'd like them to improve this. I've shoved the RFCs in their face plenty of times in the past. But in this case, my reading of the RFCs is that Eudora's APPEND action is defensible and cyrus' action is incorrect. I disagree - it sounds like it would be defensible if Cyrus supported storing such messages even though it is clearly recommended against by the standard. If Eudora insists on storing such messages, it should be prepared to deal with a server that is unwilling to do so. If you allow the IMAP server to store arbitrary data, it makes the other functions much more difficult -- if it can't assume there is a message-id that is globally unique, it has to create its own unique key for a message. 
Searching based on header fields is more problematic since you can't assume there is even a header, or that if there is one, its value corresponds to anything you might expect (i.e., character set, line length, and other issues). Allowing null characters in particular is problematic for any code that uses null-terminated strings for messages or parts of messages, and would require changing the code everywhere to use and pass the length of all the strings instead. I don't know whether there is a technical reason behind not supporting bare newlines. As far as STD0011 not being obsoleted, there are plenty of RFCs etc. that are not obsoleted by something but are still not best current practice. Clearly if the RFC it has been based on has been obsoleted, the STD should be updated as well. -- John A. Tamplin Unix System Administrator Emory University, School of Public Health +1 404/727-9931
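The null-terminated string problem mentioned above can be shown in a few lines. This Python sketch only simulates C string semantics (it does not call into C or into Cyrus code); `c_strlen` is a hypothetical stand-in for what a `strlen`-style routine would report on a buffer.

```python
def c_strlen(buffer: bytes) -> int:
    """Length as seen by code that stops at the first NUL octet."""
    end = buffer.find(b"\x00")
    return len(buffer) if end == -1 else end


if __name__ == "__main__":
    body = b"first part\x00second part"
    seen = c_strlen(body)
    print("octets stored:", len(body))
    print("octets seen by null-terminated code:", seen)
    print("silently dropped:", body[seen:])
```

Code written this way loses everything after the first NUL unless every routine is changed to carry an explicit length alongside the buffer, which is the cost being described.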