Re: [Dovecot] CATENATE/literal8 issue

2013-06-13 Thread Michael M Slusarz

Quoting Timo Sirainen t...@iki.fi:


On Wed, 2013-05-22 at 09:38 -0600, Michael M Slusarz wrote:

Quoting Michael M Slusarz slus...@curecanti.org:

 Quoting Timo Sirainen t...@iki.fi:

 Anyway .. the BINARY APPEND converts only the MIME parts that you
 send with Content-Transfer-Encoding: binary. Are you sending such
 header to Dovecot?

I can verify this isn't working as you described above:

1 APPEND INBOX CATENATE (TEXT {49+}
Content-Type: multipart/alternative; boundary=A TEXT ~{1}
1 NO [UNKNOWN-CTE] Binary input allowed only when the first part is binary.


What do you do then if server advertises CATENATE but not BINARY?


Send as a regular literal.  If there truly are nulls in the output,  
there's not much we can do so we send as-is and hope for the best.



Anyway for the other possibilities Dovecot could:

a) Put all CATENATEd messages through the istream-binary-converter, but
just not do any actual C-T-E:binary conversion until the first ~{binary}
part is found.

b) Just treat ~{n} exactly the same as {n}, unless it's the first part
of CATENATE.

Maybe this should be asked about on the IMAP mailing list .. (Didn't I
already ask something about CATENATE+BINARY combination?)


Yeah:  
http://mailman2.u.washington.edu/pipermail/imap-protocol/2012-June/001787.html  
  No responses :)


It is concerning because RFC 4466 indicates that literal8's are  
allowed for both APPEND and MULTIAPPEND, which is essentially an  
extended APPEND.  But RFC 4469 defines CATENATE TEXT as literal only:


RFC 4466:
   append-data = literal / literal8 / append-data-ext

RFC 4469:
   append-data =/ CATENATE SP ( cat-part *(SP cat-part) )
   cat-part = text-literal / url
   text-literal = TEXT SP literal

To me CATENATE =~ MULTIAPPEND - it is just another form of an extended  
APPEND.  Not sure why it shouldn't be allowed there.  But from a  
strict ABNF standpoint, you are correct that I shouldn't be sending  
literal8's.  I'll ask myself on the IMAP list why this design choice  
was made.
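
To make the grammar point concrete, here is a minimal sketch of the strict reading of the RFC 4469 ABNF quoted above. The helper names (parse_literal_marker, allowed_as_catenate_text) are hypothetical and not taken from any client or from Dovecot; the sketch only classifies the literal marker and says whether the strict grammar would accept it after TEXT.

```python
import re
from typing import NamedTuple

# Matches "{n}", "{n+}", "~{n}" and "~{n+}".
# The leading "~" marks a literal8; the trailing "+" is the
# non-synchronizing (LITERAL+) form.
_MARKER = re.compile(r"^(~?)\{(\d+)(\+?)\}$")


class LiteralMarker(NamedTuple):
    is_literal8: bool
    size: int
    non_sync: bool


def parse_literal_marker(token: str) -> LiteralMarker:
    """Parse an IMAP literal marker such as '{40+}' or '~{40}'."""
    m = _MARKER.match(token)
    if m is None:
        raise ValueError(f"not a literal marker: {token!r}")
    return LiteralMarker(m.group(1) == "~", int(m.group(2)), m.group(3) == "+")


def allowed_as_catenate_text(token: str) -> bool:
    """Strict RFC 4469 reading: text-literal = TEXT SP literal.

    Under that grammar a literal8 is not a valid TEXT part.
    """
    return not parse_literal_marker(token).is_literal8
```

With this reading, "{40+}" is accepted and "~{40}" is not, which matches the two APPEND attempts shown elsewhere in this thread.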


For the record... given the varying levels of BINARY support in  
different IMAP servers (UW IMAP is flat-out broken), I've gone ahead  
and bit the bullet and we now pre-scan outgoing append literals for  
null characters and only use literal8's when absolutely necessary.  I  
was probably being too clever for my own good in assuming that I can  
just send and assume the server will handle all issues.
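
The pre-scan described above can be sketched roughly as follows. This is an illustration only, not our actual client code: the function name literal_header and the server_has_binary flag are assumptions, and the LITERAL+ form is left out for brevity.

```python
def literal_header(data: bytes, server_has_binary: bool) -> bytes:
    """Return the IMAP literal marker line to send before `data`.

    A literal8 ("~{n}") is used only when the data really contains a NUL
    octet and the server advertises BINARY. In every other case a regular
    literal ("{n}") is sent, including the "hope for the best" case where
    NULs are present but BINARY is not available.
    """
    use_literal8 = server_has_binary and b"\x00" in data
    prefix = b"~" if use_literal8 else b""
    return prefix + b"{%d}" % len(data) + b"\r\n"
```

The cost is one linear scan of each outgoing literal, which is the overhead the original "always send literal8" shortcut was trying to avoid.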


With that being said... I was able to reliably reproduce a parsing  
issue in Dovecot 2.2.x when doing a MULTIAPPEND w/literal8's.  I need  
to track down if this is a single message causing the issue or some  
sort of cumulative bug that only appears once you've done something  
like 200-300 sequential appends.  I can verify that a switch from  
literal8 -> literal fixes the issue.  I'll try to create a  
reproducible test case.


michael



Re: [Dovecot] CATENATE/literal8 issue

2013-06-13 Thread Michael M Slusarz

Quoting Michael M Slusarz slus...@curecanti.org:

It is concerning because RFC 4466 indicates that literal8's are  
allowed for both APPEND and MULTIAPPEND, which is essentially an  
extended APPEND.  But RFC 4469 defines CATENATE TEXT as literal only:


RFC 4466:
   append-data = literal / literal8 / append-data-ext

RFC 4469:
   append-data =/ CATENATE SP ( cat-part *(SP cat-part) )
   cat-part = text-literal / url
   text-literal = TEXT SP literal

To me CATENATE =~ MULTIAPPEND - it is just another form of an  
extended APPEND.  Not sure why it shouldn't be allowed there.


Answered my own question here - sure enough, it was an oversight:

http://osdir.com/ml/ietf.imapext/2006-03/msg00030.html

michael



Re: [Dovecot] CATENATE/literal8 issue

2013-06-12 Thread Timo Sirainen
On Wed, 2013-05-22 at 09:38 -0600, Michael M Slusarz wrote:
 Quoting Michael M Slusarz slus...@curecanti.org:
 
  Quoting Timo Sirainen t...@iki.fi:
 
  Anyway .. the BINARY APPEND converts only the MIME parts that you  
  send with Content-Transfer-Encoding: binary. Are you sending such  
  header to Dovecot?
 
 I can verify this isn't working as you described above:
 
 1 APPEND INBOX CATENATE (TEXT {49+}
 Content-Type: multipart/alternative; boundary=A TEXT ~{1}
 1 NO [UNKNOWN-CTE] Binary input allowed only when the first part is binary.

What do you do then if server advertises CATENATE but not BINARY?

Anyway for the other possibilities Dovecot could:

a) Put all CATENATEd messages through the istream-binary-converter, but
just not do any actual C-T-E:binary conversion until the first ~{binary}
part is found.

b) Just treat ~{n} exactly the same as {n}, unless it's the first part
of CATENATE.

Maybe this should be asked about on the IMAP mailing list .. (Didn't I
already ask something about CATENATE+BINARY combination?)



Re: [Dovecot] CATENATE/literal8 issue

2013-05-22 Thread Michael M Slusarz

Quoting Michael M Slusarz slus...@curecanti.org:


Quoting Timo Sirainen t...@iki.fi:

Anyway .. the BINARY APPEND converts only the MIME parts that you  
send with Content-Transfer-Encoding: binary. Are you sending such  
header to Dovecot?


I can verify this isn't working as you described above:

1 APPEND INBOX CATENATE (TEXT {49+}
Content-Type: multipart/alternative; boundary=A TEXT ~{1}
1 NO [UNKNOWN-CTE] Binary input allowed only when the first part is binary.

michael



[Dovecot] CATENATE/literal8 issue

2013-05-21 Thread Michael M Slusarz

Using 2.2.2, I see this:

C: 6 APPEND INBOX (\seen) 16-May-2013 22:05:14 -0600 CATENATE (URL  
/INBOX;UIDVALIDITY=1255685337/;UID=48812/;SECTION=HEADER TEXT ~{40}

S: 6 NO [UNKNOWN-CTE] Binary input allowed only when the first part is binary.

Why is there this limitation?  It seems to me that CATENATE is  
confusing the content-type encoding of the data/part itself with the  
encoding of the IMAP literal.


A literal8 is nothing more than a series of OCTETs that *may*  
contain nulls, but not necessarily.  i.e., in the above example the 40  
octets of data are US-ASCII text, which is perfectly acceptable to  
send as a literal8.  (Client rationale: If BINARY exists on the  
server, we don't bother to scan IMAP literal's for null data -- we  
just send them as literal8's.  It's an optimization that I would hate  
to get rid of.)


michael



Re: [Dovecot] CATENATE/literal8 issue

2013-05-21 Thread Timo Sirainen
On 21.5.2013, at 9.40, Michael M Slusarz slus...@curecanti.org wrote:

 Using 2.2.2, I see this:
 
 C: 6 APPEND INBOX (\seen) 16-May-2013 22:05:14 -0600 CATENATE (URL 
 /INBOX;UIDVALIDITY=1255685337/;UID=48812/;SECTION=HEADER TEXT ~{40}
 S: 6 NO [UNKNOWN-CTE] Binary input allowed only when the first part is binary.
 
 Why is there this limitation?  It seems to me that CATENATE is confusing the 
 content-type encoding of the data/part itself with the encoding of the IMAP 
 literal.
 
 A literal8 is nothing more than a series of OCTETs that *may* contain 
 nulls, but not necessarily.  i.e., in the above example the 40 octets of data 
 are US-ASCII text, which is perfectly acceptable to send as a literal8.  
 (Client rationale: If BINARY exists on the server, we don't bother to scan 
 IMAP literal's for null data -- we just send them as literal8's.  It's an 
 optimization that I would hate to get rid of.)

Well, the problem is that if it does contain NULs, the MIME part needs to be 
converted to something that doesn't. And to do that it needs to modify the 
previous header, which with current code was already read.. So to fix that it 
would need to read the whole message into a temporary file before actually 
saving it, which makes performance worse for the normal case..

Or are you saying that the error is fine if the text contains NULs, but simply 
should be allowed as long as it doesn't?



Re: [Dovecot] CATENATE/literal8 issue

2013-05-21 Thread Michael M Slusarz

Quoting Timo Sirainen t...@iki.fi:


On 21.5.2013, at 9.40, Michael M Slusarz slus...@curecanti.org wrote:


Using 2.2.2, I see this:

C: 6 APPEND INBOX (\seen) 16-May-2013 22:05:14 -0600 CATENATE  
(URL /INBOX;UIDVALIDITY=1255685337/;UID=48812/;SECTION=HEADER  
TEXT ~{40}
S: 6 NO [UNKNOWN-CTE] Binary input allowed only when the first part  
is binary.


Why is there this limitation?  It seems to me that CATENATE is  
confusing the content-type encoding of the data/part itself with  
the encoding of the IMAP literal.


A literal8 is nothing more than a series of OCTETs that *may*  
contain nulls, but not necessarily.  i.e., in the above example the  
40 octets of data are US-ASCII text, which is perfectly acceptable  
to send as a literal8.  (Client rationale: If BINARY exists on the  
server, we don't bother to scan IMAP literal's for null data -- we  
just send them as literal8's.  It's an optimization that I would  
hate to get rid of.)


Well, the problem is that if it does contain NULs, the MIME part  
needs to be converted to something that doesn't. And to do that it  
needs to modify the previous header, which with current code was  
already read..


Is altering the header something that BINARY/CATENATE is allowed to  
do?  Especially regarding the header.  I know there is language about  
the server changing the CTE, but this is potentially troubling since  
cryptographic signatures may rely on the header text.  Changing things  
will break the message.


I can see the server altering the body text to match the header.  But  
I think the reverse is bothersome.


Or are you saying that the error is fine if the text contains NULs,  
but simply should be allowed as long as it doesn't?


This.  As mentioned before, it seems the code is simply assuming that  
the text part contains NULs without ever checking it.  My reading of  
the literal8 is that there is no requirement that NULs MUST exist in  
the string.


In our code, the append data is often from code that the IMAP library  
doesn't have access to.  So at APPEND time, it is unaware whether the  
data contains NUL or not - it just has a blob of data and a length.   
If BINARY exists, it is much easier for us to simply send as literal8  
and stream the data - no extra overhead is needed on our side.  Since  
each individual byte needs to be handled by the server as it comes in,  
it seems much more efficient to do NUL checking there.


michael



Re: [Dovecot] CATENATE/literal8 issue

2013-05-21 Thread Timo Sirainen
On 21.5.2013, at 21.24, Michael M Slusarz slus...@curecanti.org wrote:

 Or are you saying that the error is fine if the text contains NULs, but 
 simply should be allowed as long as it doesn't?
 
 This.  As mentioned before, it seems the code is simply assuming that the 
 text part contains NULs without ever checking it.  My reading of the literal8 
 is that there is no requirement that NULs MUST exist in the string.
 
 In our code, the append data is often from code that the IMAP library doesn't 
 have access to.  So at APPEND time, it is unaware whether the data contains 
 NUL or not - it just has a blob of data and a length.  If BINARY exists, it 
 is much easier for us to simply send as literal8 and stream the data - no 
 extra overhead is needed on our side.  Since each individual byte needs to be 
 handled by the server as it comes in, it seems much more efficient to do NUL 
 checking there.

It's not just about NUL. It's also about if plain LFs can be converted to CRLFs.
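
As a rough illustration of the LF point (a sketch only, not Dovecot's actual code; the function names are made up for this example), converting bare LFs looks like this:

```python
import re

# A bare LF is a "\n" that is not preceded by "\r".
_BARE_LF = re.compile(rb"(?<!\r)\n")


def has_bare_lf(data: bytes) -> bool:
    """True if the data contains an LF that is not part of a CRLF pair."""
    return _BARE_LF.search(data) is not None


def normalize_line_endings(data: bytes) -> bytes:
    """Rewrite bare LFs as CRLF, leaving existing CRLF pairs untouched.

    This is only safe for textual content. Applied to a part that is
    really binary, it would corrupt the payload.
    """
    return _BARE_LF.sub(b"\r\n", data)
```

The conversion changes the byte count of the data, so whether it may be applied depends on knowing that the part is text and not binary.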

Anyway .. the BINARY APPEND converts only the MIME parts that you send with 
Content-Transfer-Encoding: binary. Are you sending such header to Dovecot? If 
not, there's actually no difference to a regular APPEND from Dovecot's point of 
view (I think). If a non-binary MIME part contains NUL, what is Dovecot 
supposed to do? Change it to some other character? Fail the APPEND? Should 
there be a difference between how literal vs literal8 is handled in such case?



Re: [Dovecot] CATENATE/literal8 issue

2013-05-21 Thread Michael M Slusarz

Quoting Timo Sirainen t...@iki.fi:

Anyway .. the BINARY APPEND converts only the MIME parts that you  
send with Content-Transfer-Encoding: binary. Are you sending such  
header to Dovecot?


I don't think so.  I noticed the CATENATE error when I was stripping a  
simple text/html part out of a multipart/alternative message.  The  
master message header has a single MIME header:


Content-Type: multipart/alternative;  
boundary=WPFVNCCY4GPWDK6HNJXHWWE7J94BSS


For the record, here's the entire transaction, along with the fallback  
APPEND w/out using literal8 that was successful on the identical data:


C: 6 APPEND INBOX (\seen) 16-May-2013 22:05:14 -0600 CATENATE (URL  
/INBOX;UIDVALIDITY=1255685337/;UID=48812/;SECTION=HEADER TEXT ~{40}

S: 6 NO [UNKNOWN-CTE] Binary input allowed only when the first part is binary.
C: 8 APPEND INBOX (\seen) 16-May-2013 22:05:14 -0600 CATENATE (URL  
/INBOX;UIDVALIDITY=1255685337/;UID=48812/;SECTION=HEADER TEXT {40+}

C: [LITERAL DATA: 40 bytes]
C:  URL /INBOX;UIDVALIDITY=1255685337/;UID=48812/;SECTION=1.MIME URL  
/INBOX;UIDVALIDITY=1255685337/;UID=48812/;SECTION=1 TEXT {40+}

C: [LITERAL DATA: 40 bytes]
C:  TEXT {113+}
C: [LITERAL DATA: 113 bytes]
C:  TEXT {42+}
C: [LITERAL DATA: 42 bytes]
C: )
S: 8 OK [APPENDUID 1255685337 48885] Append completed.

If a non-binary MIME part contains NUL, what is Dovecot supposed to  
do? Change it to some other character? Fail the APPEND? Should there  
be a difference between how literal vs literal8 is handled in such  
case?


I would say there is no doubt: fail the APPEND.  It should be the  
client's responsibility to correctly format the data.
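
A minimal sketch of the policy I have in mind, assuming the server already knows the part's Content-Transfer-Encoding (the function name and the 7bit default for a missing header are assumptions made for this example):

```python
from typing import Optional


def should_reject_part(cte: Optional[str], body: bytes) -> bool:
    """Decide whether a MIME part should cause the APPEND to fail.

    A part that is not declared "binary" must not contain NUL octets.
    Whether the data arrived as a literal or a literal8 plays no role
    in the decision.
    """
    declared = (cte or "7bit").strip().lower()
    return declared != "binary" and b"\x00" in body
```

Under this policy a literal8 carrying plain US-ASCII text is accepted, and a part with NULs but no binary CTE is refused instead of being silently altered.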


I appreciate that Dovecot does its best to try to Do The Right Thing  
(Cyrus is much stricter about input, for example).  But at some point  
us client authors have to be at least somewhat competent, and it is  
not asking too much for us to accept that GIGO.


michael