Re: [Dovecot] Possible header parsing problem

2008-10-27 Thread Eric Stadtherr
On Tue, 28 Oct 2008 03:31:13 +0200, Timo Sirainen <[EMAIL PROTECTED]> wrote:
> On Oct 28, 2008, at 3:23 AM, Eric Stadtherr wrote:
> 
>>> Fixed: http://hg.dovecot.org/dovecot-1.1/rev/25b0cf7c62d3
>>>
>>> But I'm not sure if I should convert the following TAB to a space.
>>> UW-IMAP seems to do that, but RFC just says that the CRLF should be
>>> dropped.
>>
>> I grabbed a snapshot of the CM baseline with that fix, but that message
>> still doesn't display correctly. I ran it through the message_parser test
>> case and your fix looks like it resulted in correct header values and
>> correct body parsing, but the BODYSTRUCTURE response from the server still
>> only contains the first part (plus the boundary name).
>>
>> Any suggestions where to look? I looked through the code that handles the
>> BODYSTRUCTURE fetch command and it looked like it eventually filtered down
>> to the same parser functions used by the test case, so I'm not sure where
>> else the problem could be introduced...
> 
> Did you delete the dovecot.index.cache file? Otherwise it replies with the
> cached value.

That was it, thanks!
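
For anyone hitting the same stale BODYSTRUCTURE, here is a minimal sketch for locating the cache files. It assumes the index files live inside the Maildir folders themselves, which is what a `mail_location` of `maildir:%h/Maildir` without a separate index path would suggest. The function names are made up for illustration and are not part of Dovecot.

```python
from __future__ import annotations

from pathlib import Path


def find_index_caches(maildir_root: Path) -> list[Path]:
    """Return every dovecot.index.cache file at or below maildir_root."""
    return sorted(maildir_root.rglob("dovecot.index.cache"))


def remove_index_caches(maildir_root: Path, dry_run: bool = True) -> list[Path]:
    """List the cache files and, unless dry_run is set, delete them.

    Only dovecot.index.cache is touched; other index files are left alone.
    """
    found = find_index_caches(maildir_root)
    if not dry_run:
        for path in found:
            path.unlink()
    return found
```

Calling `remove_index_caches(Path.home() / "Maildir")` only lists the files; pass `dry_run=False` to delete them. The cache should then be rebuilt the next time the messages are fetched.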


-- 
Eric Stadtherr
[EMAIL PROTECTED]


Re: [Dovecot] Possible header parsing problem

2008-10-27 Thread Timo Sirainen

On Oct 28, 2008, at 3:23 AM, Eric Stadtherr wrote:


>> Fixed: http://hg.dovecot.org/dovecot-1.1/rev/25b0cf7c62d3
>>
>> But I'm not sure if I should convert the following TAB to a space.
>> UW-IMAP seems to do that, but RFC just says that the CRLF should be
>> dropped.
>
> I grabbed a snapshot of the CM baseline with that fix, but that message
> still doesn't display correctly. I ran it through the message_parser test
> case and your fix looks like it resulted in correct header values and
> correct body parsing, but the BODYSTRUCTURE response from the server still
> only contains the first part (plus the boundary name).
>
> Any suggestions where to look? I looked through the code that handles the
> BODYSTRUCTURE fetch command and it looked like it eventually filtered down
> to the same parser functions used by the test case, so I'm not sure where
> else the problem could be introduced...

Did you delete the dovecot.index.cache file? Otherwise it replies with the
cached value.






Re: [Dovecot] Possible header parsing problem

2008-10-27 Thread Eric Stadtherr
On Thu, 23 Oct 2008 19:06:19 +0300, Timo Sirainen <[EMAIL PROTECTED]> wrote:
> On Wed, 2008-10-22 at 20:59 -0600, Eric Stadtherr wrote:
>> Content-Type: multipart/alternative; boundary="=_alternative
>>   006F3A73872574E8_="
> 
> Is there one space, two spaces or a TAB at the beginning of the second
> line?
> 
>> I did a little bit of tracing through the parsing code 
>> (message-header-parser.c:message_parse_header_next()) and it appeared 
>> that the boundary in the Content-Type header was not parsed correctly, 
>> evidently because the header line was folded in the middle of the 
>> boundary string. RFC 822 appears to allow folding in a quoted string 
>> like this (§3.3 "quoted-string"), so I'm curious whether the parsing is
>> working correctly.
> 
> Fixed: http://hg.dovecot.org/dovecot-1.1/rev/25b0cf7c62d3
> 
> But I'm not sure if I should convert the following TAB to a space.
> UW-IMAP seems to do that, but RFC just says that the CRLF should be
> dropped.

I grabbed a snapshot of the CM baseline with that fix, but that message
still doesn't display correctly. I ran it through the message_parser test
case and your fix looks like it resulted in correct header values and
correct body parsing, but the BODYSTRUCTURE response from the server still
only contains the first part (plus the boundary name).

Any suggestions where to look? I looked through the code that handles the
BODYSTRUCTURE fetch command and it looked like it eventually filtered down
to the same parser functions used by the test case, so I'm not sure where
else the problem could be introduced...



-- 
Eric Stadtherr
[EMAIL PROTECTED]


Re: [Dovecot] Possible header parsing problem

2008-10-24 Thread Timo Sirainen
On Fri, 2008-10-24 at 14:37 +0200, Jakob Hirsch wrote:
> Timo Sirainen wrote:
> 
> >> lead to strange behaviour. So I'd vote for replacing the folding tab
> >> with a space.
> > Actually Dovecot already replaces all tabs with spaces when sending
> > ENVELOPE, BODY and BODYSTRUCTURE replies. The only issue here is about
> > the internal parsing where I think it's better to be strict.
> 
> Oh, ok, then I got that wrong.
> 
> I only wonder why I still see tabs in the Subject field in TB's message 
> list (and in other lines in message source). Using v1.2.alpha3.

Because TB most likely doesn't use ENVELOPE but parses the headers
itself.





Re: [Dovecot] Possible header parsing problem

2008-10-24 Thread Jakob Hirsch

Timo Sirainen wrote:

>> lead to strange behaviour. So I'd vote for replacing the folding tab
>> with a space.
> Actually Dovecot already replaces all tabs with spaces when sending
> ENVELOPE, BODY and BODYSTRUCTURE replies. The only issue here is about
> the internal parsing where I think it's better to be strict.


Oh, ok, then I got that wrong.

I only wonder why I still see tabs in the Subject field in TB's message 
list (and in other lines in message source). Using v1.2.alpha3.




Re: [Dovecot] Possible header parsing problem

2008-10-24 Thread Timo Sirainen

On Oct 24, 2008, at 12:35 PM, Jakob Hirsch wrote:


> Timo Sirainen wrote:
>
>> But I'm not sure if I should convert the following TAB to a space.
>> UW-IMAP seems to do that, but RFC just says that the CRLF should be
>> dropped.
>
> As pointed out in
> https://bugzilla.mozilla.org/show_bug.cgi?id=240924#c7, this could lead
> to strange behaviour. So I'd vote for replacing the folding tab with a space.

Actually Dovecot already replaces all tabs with spaces when sending
ENVELOPE, BODY and BODYSTRUCTURE replies. The only issue here is about
the internal parsing where I think it's better to be strict.
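
The split described here, strict unfolding internally and tab replacement only when formatting replies, can be sketched roughly as follows. This is an illustration under assumptions, not Dovecot's code: both function names and the sample Subject value are hypothetical.

```python
import re


def unfold_strict(raw: str) -> str:
    """Internal parsing: remove only the CRLF of a fold; a TAB stays a TAB."""
    return re.sub(r"\r\n(?=[ \t])", "", raw)


def for_imap_reply(value: str) -> str:
    """Reply formatting (ENVELOPE and similar): replace every TAB with a space."""
    return value.replace("\t", " ")


# Hypothetical Subject value folded with CRLF + TAB.
raw_subject = "Weekly status\r\n\treport"

parsed = unfold_strict(raw_subject)
print(repr(parsed))                  # 'Weekly status\treport'
print(repr(for_imap_reply(parsed)))  # 'Weekly status report'
```

A client that fetches the raw headers and parses them itself sees the first form, which would explain tabs still showing up in a message list.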






Re: [Dovecot] Possible header parsing problem

2008-10-24 Thread Jakob Hirsch

Timo Sirainen wrote:


> But I'm not sure if I should convert the following TAB to a space.
> UW-IMAP seems to do that, but RFC just says that the CRLF should be
> dropped.

As pointed out in
https://bugzilla.mozilla.org/show_bug.cgi?id=240924#c7, this could lead
to strange behaviour. So I'd vote for replacing the folding tab with a space.




Re: [Dovecot] Possible header parsing problem

2008-10-23 Thread Eric Stadtherr
On Thu, 23 Oct 2008 19:06:19 +0300, Timo Sirainen <[EMAIL PROTECTED]> wrote:
> On Wed, 2008-10-22 at 20:59 -0600, Eric Stadtherr wrote:
>> Content-Type: multipart/alternative; boundary="=_alternative
>>   006F3A73872574E8_="
> 
> Is there one space, two spaces or a TAB at the beginning of the second
> line?
> 

There is one space at the beginning of the continuation line. The parsed
full_value basically looks like:
[multipart/alternative; boundary="=_alternative\n 006F3A73872574E8_="]

>> I did a little bit of tracing through the parsing code 
>> (message-header-parser.c:message_parse_header_next()) and it appeared 
>> that the boundary in the Content-Type header was not parsed correctly, 
>> evidently because the header line was folded in the middle of the 
>> boundary string. RFC 822 appears to allow folding in a quoted string 
>> like this (§3.3 "quoted-string"), so I'm curious whether the parsing is
>> working correctly.
> 
> Fixed: http://hg.dovecot.org/dovecot-1.1/rev/25b0cf7c62d3
> 
> But I'm not sure if I should convert the following TAB to a space.
> UW-IMAP seems to do that, but RFC just says that the CRLF should be
> dropped.

I always prefer strict adherence to the RFC, which says:

    The process of moving from this folded multiple-line
    representation of a header field to its single line
    representation is called "unfolding". Unfolding is
    accomplished by regarding CRLF immediately followed by a
    LWSP-char as equivalent to the LWSP-char.

So, what you did looks good!
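
As a concrete illustration of that unfolding rule, here is a minimal sketch. It is not the Dovecot implementation; the `unfold` helper is a made-up name, and it also accepts a bare LF because the full_value shown earlier contains "\n" rather than CRLF.

```python
import re

# A line break (CRLF, or a bare LF) immediately followed by a space or TAB
# marks a fold. The lookahead keeps the LWSP-char in the result.
_FOLD = re.compile(r"\r?\n(?=[ \t])")


def unfold(value: str) -> str:
    """Drop each line break that precedes LWSP, keeping the LWSP-char itself."""
    return _FOLD.sub("", value)


folded = 'multipart/alternative; boundary="=_alternative\n 006F3A73872574E8_="'
print(unfold(folded))
# multipart/alternative; boundary="=_alternative 006F3A73872574E8_="
```

The unfolded boundary keeps the single space, which is what the delimiter lines in the message body contain.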


-- 
Eric Stadtherr
[EMAIL PROTECTED]


Re: [Dovecot] Possible header parsing problem

2008-10-23 Thread Timo Sirainen
On Wed, 2008-10-22 at 20:59 -0600, Eric Stadtherr wrote:
> Content-Type: multipart/alternative; boundary="=_alternative
>   006F3A73872574E8_="

Is there one space, two spaces or a TAB at the beginning of the second
line?

> I did a little bit of tracing through the parsing code 
> (message-header-parser.c:message_parse_header_next()) and it appeared 
> that the boundary in the Content-Type header was not parsed correctly, 
> evidently because the header line was folded in the middle of the 
> boundary string. RFC 822 appears to allow folding in a quoted string 
> like this (§3.3 "quoted-string"), so I'm curious whether the parsing is 
> working correctly.

Fixed: http://hg.dovecot.org/dovecot-1.1/rev/25b0cf7c62d3

But I'm not sure if I should convert the following TAB to a space.
UW-IMAP seems to do that, but RFC just says that the CRLF should be
dropped.




[Dovecot] Possible header parsing problem

2008-10-22 Thread Eric Stadtherr

Hi,

I ran into a problem wherein my mail client (RoundCube) would not 
display a message from a Dovecot IMAP server (claiming that the message 
had no content). The raw source of the message looked fine, but the body 
structure returned by Dovecot only had the first text/plain part and not 
the alternative text/html part. The message looks like:


   ... headers removed ...
   X-Mailer: Lotus Notes Release 6.5.1 January 21, 2004
   Message-ID: <...>
   From: [EMAIL PROTECTED]
   Date: Mon, 20 Oct 2008 14:15:55 -0600
   Content-Type: multipart/alternative; boundary="=_alternative
 006F3A73872574E8_="

   This is a multipart message in MIME format.
   --=_alternative 006F3A73872574E8_=
   Content-Transfer-Encoding: 7bit
   Content-Type: text/plain;
 charset=us-ascii

   blah blah blah

   --=_alternative 006F3A73872574E8_=
   Content-Transfer-Encoding: 7bit
   Content-Type: text/html;
 charset=us-ascii


   blah blah blah in HTML

   --=_alternative 006F3A73872574E8_=--


I did a little bit of tracing through the parsing code 
(message-header-parser.c:message_parse_header_next()) and it appeared 
that the boundary in the Content-Type header was not parsed correctly, 
evidently because the header line was folded in the middle of the 
boundary string. RFC 822 appears to allow folding in a quoted string 
like this (§3.3 "quoted-string"), so I'm curious whether the parsing is 
working correctly.
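
To show why a mis-parsed boundary matters, here is a small sketch that counts the delimiter lines in the body above. It does not reproduce Dovecot's actual behaviour: `count_parts` is a hypothetical helper, and the truncated boundary value is only an assumption about what a parser might keep if it stopped at the fold.

```python
def count_parts(body: str, boundary: str) -> int:
    """Count '--boundary' delimiter lines (the closing '--boundary--' is excluded)."""
    delimiter = "--" + boundary
    return sum(1 for line in body.splitlines() if line.rstrip() == delimiter)


BODY = "\n".join([
    "This is a multipart message in MIME format.",
    "--=_alternative 006F3A73872574E8_=",
    "Content-Type: text/plain; charset=us-ascii",
    "",
    "blah blah blah",
    "",
    "--=_alternative 006F3A73872574E8_=",
    "Content-Type: text/html; charset=us-ascii",
    "",
    "blah blah blah in HTML",
    "",
    "--=_alternative 006F3A73872574E8_=--",
])

full = "=_alternative 006F3A73872574E8_="  # boundary after correct unfolding
truncated = "=_alternative"                # assumed result of stopping at the fold

print(count_parts(BODY, full))       # 2
print(count_parts(BODY, truncated))  # 0
```

With the full boundary both parts are found; with the truncated value no delimiter line matches, so the multipart structure cannot be recovered from the body.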


Thanks for your help!

Here is my Dovecot information:
version: 1.1.4
"dovecot -n" output:
# 1.1.4: /usr/local/etc/dovecot.conf
Warning: fd limit 256 is lower than what Dovecot can use under full load 
(more than 384). Either grow the limit or change 
login_max_processes_count and max_mail_processes settings

base_dir: /var/dovecot/
info_log_path: /var/log/dovecot.log
listen: *, [::]
ssl_cert_file: /System/Library/OpenSSL/certs/imapd.pem
ssl_key_file: /System/Library/OpenSSL/certs/privkey.out
login_dir: /var/dovecot/login
login_executable: /usr/local/libexec/dovecot/imap-login
max_mail_processes: 256
mail_location: maildir:%h/Maildir
namespace:
  type: private
  separator: /
  inbox: yes
  list: yes
  subscriptions: yes
namespace:
  type: shared
  separator: /
  prefix: Shared/
  location: maildir:/Users/Shared/Maildir
  list: yes
  subscriptions: yes
auth default:
  passdb:
    driver: pam
    args: imap
  userdb:
    driver: passwd



--
*Eric Stadtherr*
[EMAIL PROTECTED]