Re: [Dovecot] Embedded From_ lines breaking Content-Length (and Dovecot)

2008-05-04 Thread Timo Sirainen
On Sun, 2008-05-04 at 21:56 +0300, Timo Sirainen wrote:
> On Sun, 2008-05-04 at 16:59 +0300, Timo Sirainen wrote:
> > On Tue, 2008-01-29 at 04:09 +0100, Lennart Lövstrand wrote:
> > > I feel like I'm going totally crazy.  Is it just me, or have embedded  
> > > From_ lines really been breaking mbox messages since (at least)  
> > > dovecot 1.0?
> > 
> > Finally fixed in v1.1:
> > http://hg.dovecot.org/dovecot-1.1/rev/7871b6219480
> 
> And another fix to make that change not break:
> http://hg.dovecot.org/dovecot-1.1/rev/80d827b411c8

And one more: http://hg.dovecot.org/dovecot-1.1/rev/e935b36b8b65

Now it appears to be working in my stress tests.





Re: [Dovecot] Embedded From_ lines breaking Content-Length (and Dovecot)

2008-05-04 Thread Timo Sirainen
On Sun, 2008-05-04 at 16:59 +0300, Timo Sirainen wrote:
> On Tue, 2008-01-29 at 04:09 +0100, Lennart Lövstrand wrote:
> > I feel like I'm going totally crazy.  Is it just me, or have embedded  
> > From_ lines really been breaking mbox messages since (at least)  
> > dovecot 1.0?
> 
> Finally fixed in v1.1:
> http://hg.dovecot.org/dovecot-1.1/rev/7871b6219480

And another fix to make that change not break:
http://hg.dovecot.org/dovecot-1.1/rev/80d827b411c8





Re: [Dovecot] Embedded From_ lines breaking Content-Length (and Dovecot)

2008-05-04 Thread Timo Sirainen
On Tue, 2008-01-29 at 04:09 +0100, Lennart Lövstrand wrote:
> I feel like I'm going totally crazy.  Is it just me, or have embedded  
> From_ lines really been breaking mbox messages since (at least)  
> dovecot 1.0?

Finally fixed in v1.1:
http://hg.dovecot.org/dovecot-1.1/rev/7871b6219480





Re: [Dovecot] Embedded From_ lines breaking Content-Length (and Dovecot)

2008-01-29 Thread Lennart Lövstrand

On Jan 29, 2008, at 07:50, Dean Brooks wrote:
> On Tue, Jan 29, 2008 at 04:09:26AM +0100, Lennart Lövstrand wrote:
> > I feel like I'm going totally crazy.  Is it just me, or have embedded
> > From_ lines really been breaking mbox messages since (at least)
> > dovecot 1.0?
> >
> > It's trivial to reproduce too -- just mail yourself a message with a
> > valid From_ line in it (assuming that your delivery system isn't doing
> > From-escaping), or put it in a draft plain text message and save
> > it.  Then go and look in your Drafts folder...
>
> The "mbox" format, by definition, uses From_ lines as *the* separator.
> If it uses anything else, it's not conventional mbox format.  There
> are variants of mbox, sometimes described as mboxcl2 that use
> Content-Length: as the defining separator, but that is *not*
> conventional mbox format.


Dovecot's support for the mbox format is described in
http://wiki.dovecot.org/MailboxFormat/mbox.  It is capable of parsing
From_ lines as message delimiters, but prefers the Content-Length header
and will add one if not present (or rewrite an old one if found
incorrect).  I guess you might want to call that mboxcl2 with fallback
to traditional mbox to be more specific.
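
To make "mboxcl2 with fallback to traditional mbox" concrete, here is a minimal Python sketch of a splitter that reads the headers first, trusts Content-Length only when the byte after the declared body is end-of-file or another From_ line, and otherwise falls back to scanning for From_ lines. This is only an illustration of the behaviour described above, not Dovecot's code; `split_mbox`, `SAMPLE` and the sender address are hypothetical names made up for the example.

```python
import re

# Matches a Content-Length header line inside a header block.
CONTENT_LENGTH = re.compile(rb"^Content-Length:[ \t]*(\d+)[ \t]*$", re.M | re.I)


def split_mbox(data: bytes) -> list[bytes]:
    """Split mbox bytes into messages, preferring a plausible Content-Length."""
    messages = []
    pos = 0
    while pos < len(data):
        if not data.startswith(b"From ", pos):
            raise ValueError(f"no From_ line at offset {pos}")
        header_end = data.find(b"\n\n", pos)
        if header_end == -1:  # headers only, no body
            messages.append(data[pos:])
            break
        body_start = header_end + 2
        end = None
        match = CONTENT_LENGTH.search(data[pos:body_start])
        if match:
            candidate = body_start + int(match.group(1))
            if candidate <= len(data):
                rest = data[candidate:].lstrip(b"\n")
                # Trust the header only if the body really ends here.
                if not rest or rest.startswith(b"From "):
                    end = candidate
        if end is None:
            # Fallback: traditional mbox, cut at the next From_ line.
            nxt = data.find(b"\nFrom ", body_start - 1)
            end = len(data) if nxt == -1 else nxt + 1
        messages.append(data[pos:end])
        pos = end
        while data[pos:pos + 1] == b"\n":  # skip blank separator lines
            pos += 1
    return messages


# The message from this thread, with a placeholder envelope sender.
SAMPLE = (
    b"From sender@example.invalid  Tue Jan 29 14:18:01 2008\n"
    b"From: Me\nTo: You\nSubject; Foo\n"
    b"Date: Mon, 28 Jan 2008 23:50:00 +0100\n"
    b"Content-Length: 122\n"
    b"\n"
    b"Before\n\n"
    b"From someone  Mon Jan 28 23:51:00 2008\n"
    b"From: You\nTo: Me\nSubject: Bar\n"
    b"Date: Mon, 28 Jan 2008 23:52:00 +0100\n"
    b"\n"
    b"After\n"
)

print(len(split_mbox(SAMPLE)))
print(len(split_mbox(SAMPLE.replace(b"Content-Length: 122\n", b""))))
```

With the header intact the sample stays one message; with the header removed the embedded From_ line splits it in two, which is the fallback behaviour one would expect only in that second case.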


You're correct that solely relying on the presence of (correct) CL  
headers can be risky in an environment where old mbox tools may  
rewrite the mailboxes, but let us disregard that for a moment.  It is  
not a concern of mine as I only run one type of IMAP server at a  
time and I tend to let it manage all my mailboxes exclusively.  (And  
yes, I have good control over the incoming mail delivery method as  
well.)


I didn't mean to start a discussion on mailbox formats.  The point of  
my message was that Dovecot effectively seems to be doing the fallback  
backwards, i.e. is causing From_ lines to break up messages with  
explicit and correct Content-Length headers.  I first thought there  
might be something wrong with my mbox files or my delivery agent, but  
this happens even for new messages that are created directly from  
within IMAP.


This might be a better way to illustrate it:


$ telnet localhost imap
* OK [CAPABILITY IMAP4rev1 SASL-IR SORT THREAD=REFERENCES MULTIAPPEND UNSELECT LITERAL+ IDLE CHILDREN NAMESPACE LOGIN-REFERRALS STARTTLS AUTH=PLAIN] Dovecot ready.

a login lennart xxx
a OK Logged in.
b create foo
b OK Create completed.
c append foo {205}
+ OK
From: Me
To: You
Subject; Foo
Date: Mon, 28 Jan 2008 23:50:00 +0100

Before

From someone  Mon Jan 28 23:51:00 2008
From: You
To: Me
Subject: Bar
Date: Mon, 28 Jan 2008 23:52:00 +0100

After

c OK Append completed.
d select foo
* FLAGS (\Answered \Flagged \Deleted \Seen \Draft)
* OK [PERMANENTFLAGS (\Answered \Flagged \Deleted \Seen \Draft \*)] Flags permitted.

* 2 EXISTS
* 0 RECENT
* OK [UNSEEN 1] First unseen.
* OK [UIDVALIDITY 1201610707] UIDs valid
* OK [UIDNEXT 3] Predicted next UID
d OK [READ-WRITE] Select completed.
e fetch 1 body[text]
* 1 FETCH (FLAGS (\Seen) BODY[TEXT] {8}
Before
)
e OK Fetch completed.
f fetch 2 body[text]
* 2 FETCH (FLAGS (\Seen) BODY[TEXT] {7}
After
)
f OK Fetch completed.


Note: Two messages were created in the foo mailbox instead of just one.
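
The sizes in the transcript can be checked with a few lines of Python. This is only a sanity check, assuming every line of the appended message (including the last) was sent with a CRLF terminator, as it is over telnet; the variable names are made up for the example.

```python
# The 14 lines typed after "c append foo {205}", exactly as in the transcript.
lines = [
    "From: Me",
    "To: You",
    "Subject; Foo",
    "Date: Mon, 28 Jan 2008 23:50:00 +0100",
    "",
    "Before",
    "",
    "From someone  Mon Jan 28 23:51:00 2008",
    "From: You",
    "To: Me",
    "Subject: Bar",
    "Date: Mon, 28 Jan 2008 23:52:00 +0100",
    "",
    "After",
]

literal = "".join(line + "\r\n" for line in lines).encode("ascii")
append_size = len(literal)

# BODY[TEXT] is everything after the first blank line of the message.
single_body = literal.split(b"\r\n\r\n", 1)[1]

print(append_size)          # 205, the APPEND literal size
print(len(single_body))     # 131, what one intact message would return
print(len(b"Before\r\n"))   # 8, what FETCH 1 actually returned
print(len(b"After\r\n"))    # 7, what FETCH 2 actually returned
```

So a correct server would have answered a single FETCH with a 131-byte BODY[TEXT] rather than two fragments of 8 and 7 bytes.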

After the APPEND command was executed, the foo mailbox has the  
following expected contents:



From [EMAIL PROTECTED]  Tue Jan 29 14:18:01 2008
From: Me
To: You
Subject; Foo
Date: Mon, 28 Jan 2008 23:50:00 +0100
X-IMAPbase: 1201612681 01
X-UID: 1
Status:
X-Keywords:
Content-Length: 122

Before

From someone  Mon Jan 28 23:51:00 2008
From: You
To: Me
Subject: Bar
Date: Mon, 28 Jan 2008 23:52:00 +0100

After


But after the SELECT command was issued, it is clear that Dovecot now  
thinks there are two messages given the two X-UID & Status headers --  
despite there only being a single Content-Length header (now incorrect  
because of the added headers in the embedded message):



From [EMAIL PROTECTED]  Tue Jan 29 14:18:01 2008
From: Me
To: You
Subject; Foo
Date: Mon, 28 Jan 2008 23:50:00 +0100
X-IMAPbase: 1201612681 02
X-UID: 1
Status: O
X-Keywords:
Content-Length: 122

Before

From someone  Mon Jan 28 23:51:00 2008
From: You
To: Me
Subject: Bar
Date: Mon, 28 Jan 2008 23:52:00 +0100
X-UID: 2
Status: O

After
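
The Content-Length values can be verified the same way. The sketch below assumes the mbox file stores LF line endings and that Content-Length counts the body bytes after the blank line that ends the headers, up to and including the newline after "After"; `body_length` is a hypothetical helper.

```python
def body_length(lines: list[str]) -> int:
    """Number of bytes in a body stored with LF line endings."""
    return sum(len(line) + 1 for line in lines)


# Body of the outer message as written by APPEND.
body = [
    "Before",
    "",
    "From someone  Mon Jan 28 23:51:00 2008",
    "From: You",
    "To: Me",
    "Subject: Bar",
    "Date: Mon, 28 Jan 2008 23:52:00 +0100",
    "",
    "After",
]

# Same body after SELECT, with the two headers added to the embedded message.
rewritten = body[:7] + ["X-UID: 2", "Status: O"] + body[7:]

print(body_length(body))       # 122, matches the Content-Length header
print(body_length(rewritten))  # 141, so the header is now 19 bytes short
```

In other words the header was correct when the message was appended, and only became wrong once the embedded From_ line was treated as a message boundary.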


This is reproducible under every version from Dovecot 1.0 to  
1.1.beta14 that I've tried, but with the added caveat that 1.1.beta14  
sometimes seems to get confused when it finds a truncated message and  
may abort mid stream.


Cheers,
--Lennart



Re: [Dovecot] Embedded From_ lines breaking Content-Length (and Dovecot)

2008-01-28 Thread Dean Brooks
On Tue, Jan 29, 2008 at 04:09:26AM +0100, Lennart Lövstrand wrote:

> I feel like I'm going totally crazy.  Is it just me, or have embedded  
> From_ lines really been breaking mbox messages since (at least)  
> dovecot 1.0?
> 
> It's trivial to reproduce too -- just mail yourself a message with a  
> valid From_ line in it (assuming that your delivery system isn't doing  
> From-escaping), or put it in a draft plain text message and save  
> it.  Then go and look in your Drafts folder...

The "mbox" format, by definition, uses From_ lines as *the* separator.
If it uses anything else, it's not conventional mbox format.  There
are variants of mbox, sometimes described as mboxcl2 that use
Content-Length: as the defining separator, but that is *not*
conventional mbox format.

I'm sure others will clarify Dovecot's stance on this, but relying on
Content-Length: headers as the sole source of determining message
separation is VERY risky business unless you make absolutely sure that
the values given are 100% correct and that all software touching the
mailbox is also in agreement (i.e. POP daemons, UNIX readers, procmail,
other IMAP daemons, etc.)

Because Dovecot cannot control existing values of Content-Length
headers, that seems to be an extremely risky proposition.

--
Dean Brooks
[EMAIL PROTECTED]


[Dovecot] Embedded From_ lines breaking Content-Length (and Dovecot)

2008-01-28 Thread Lennart Lövstrand
I feel like I'm going totally crazy.  Is it just me, or have embedded  
From_ lines really been breaking mbox messages since (at least)  
dovecot 1.0?  I found a whole lot of broken messages in an old mailbox  
of mine and when I looked closer at it, it seemed like Dovecot was  
ignoring the Content-Length header and truncating the messages at the  
first embedded From_ line instead.  Figuring that this probably just  
was a fluke / old bug, I erased the index and upgraded Dovecot, first  
to 1.0.10, then to 1.1.beta14, but I'm still seeing the bug -- and  
it's even gotten worse!  Earlier versions would just quietly split the  
message at the From_ line, but 1.1.beta14 is more diligent and will  
complain with:


dovecot[44795]: IMAP(lennart): FETCH for mailbox Drafts UID 196 got  
too little data: 590 vs 719


before dropping the connection and exiting... which causes Thunderbird  
to reopen the connection and reissue the same request... which will  
cause Dovecot to complain and exit again... etc ad nauseam...


It's trivial to reproduce too -- just mail yourself a message with a  
valid From_ line in it (assuming that your delivery system isn't doing  
From-escaping), or put it in a draft plain text message and save  
it.  Then go and look in your Drafts folder...
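
For anyone unfamiliar with From-escaping: a delivery agent that does it prefixes body lines that look like From_ lines with ">" so they can no longer be mistaken for a separator. A minimal sketch of the reversible ("mboxrd"-style) variant follows; the function names are hypothetical, and real delivery agents differ in which variant they apply.

```python
import re


def escape_from_lines(body: str) -> str:
    """Prefix '>' to every body line matching ^>*From<space>."""
    return re.sub(r"^(>*From )", r">\1", body, flags=re.M)


def unescape_from_lines(body: str) -> str:
    """Undo escape_from_lines by stripping one leading '>'."""
    return re.sub(r"^>(>*From )", r"\1", body, flags=re.M)


body = "Before\n\nFrom someone  Mon Jan 28 23:51:00 2008\nFrom: You\n\nAfter\n"
escaped = escape_from_lines(body)
print(escaped)
```

Only the "From someone ..." line gains a ">"; "From: You" is left alone because it is a header-style line, not a From_ line. A mailbox written this way would not trigger the problem, which is why the reproduction needs a delivery path (or an IMAP APPEND) that does no escaping.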


(Like this one: "From someone Tue Jan 29 03:15:52 2008")

Just be sure that you can stop your spinning mail client and/or server  
if you're using 1.1.beta14...


---

OK, I am definitely not very familiar with the Dovecot source, but I  
spent some time trying to track this down today and it looks to me  
like i_stream_raw_mbox_read is being asked to find the end of the  
message (at the From_ line) before mbox_sync_parse_next_mail has had a  
chance to parse the Content-Length header and find out how big it is.   
Maybe someone who knows the code better can take a look at it and tell  
me if I'm barking up the totally wrong tree...


Thanks,
--Lennart