Re: *IMPORTANT* - bugfix sync_append_commit index breakage

2007-09-02 Thread Bron Gondwana
On Sun, Sep 02, 2007 at 01:33:11PM +1000, Rob Mueller wrote:

 I believe so.  If we also used it for appends, we wouldn't run into the
 problem that Fastmail is seeing.  But, it may be too much of a
 performance hit.

 I don't think that any of the code does the .NEW for appends because that
 makes append O(N) where N is the mailbox size, and that would suck for
 huge mailboxes.

 Definitely. Rewriting the cyrus.index file for every message delivered to a 
 mailbox would be horrible, especially when some users have 100,000+ 
 messages in a mailbox.

 Speaking of which, is there any reason why cyrus.expunge isn't sorted
 by UID?  We have to rewrite the entire cyrus.index each time we do an
 expunge anyway.

 I'm guessing it's because messages can be deleted + expunged in any order, 
 so it's easy just to append the record for any message being expunged to 
 the end of the cyrus.expunge file. If you had to keep it sorted, that would 
 mean every time you expunged a message, you'd have to reread, resort, and 
 rewrite the cyrus.expunge file. Generally the cyrus.expunge will be a lot 
 smaller than the cyrus.index, but still annoying...

Well, you are rereading (not sorting) the cyrus.index file every time
you do an expunge anyway.

A faster alternative would be to combine the two files again, have two
different counts (exists, n_records) - where exists may be smaller.
Also, you'd need to set aside a server flag for really, really
expunged in each index record.

There's the case of non-UID numbered fetch not working so well any more,
but honestly, who sane uses that in this concurrent world?

It's a way invasive patch though.

Bron.


Re: *IMPORTANT* - bugfix sync_append_commit index breakage

2007-09-02 Thread Bron Gondwana
On Sat, Sep 01, 2007 at 11:48:51PM -0400, Ken Murchison wrote:
 Bron Gondwana wrote:
 Sorted by UUID would help enormously for things like not blatting the
 \Seen state for expunged messages until they're expired, because you
 could then just do a double-iterator over the two index format files.

 Seen state gets goofed up on unexpunge because we don't keep the original 
 UID.  I wanted to, but re-injecting a message in the middle of a mailbox 
 with a UID  UIDNEXT was either a violation of RFC3501 and/or seriously 
 screwed with clients.

Yeah, true.  Actually, that's not the issue though.  The issue is that
seen state is stored as a range list, and the range list is rebuild by
walking the cyrus.index file.  If there's no record there, then it gets
subsumed into the outer range, so seen state is actually lost before we
try to unexpunge.

An alternative (probably even dodgier for clients) is to just bump
UIDVALIDITY.  Tends to make you unpopular with people who have 2Gb in
their mailbox though.

Bron.


Re: *IMPORTANT* - bugfix sync_append_commit index breakage

2007-09-01 Thread Bron Gondwana

On Sat, 01 Sep 2007 13:08:00 -0400, Ken Murchison [EMAIL PROTECTED] said:
 David Carter wrote:
  On Sat, 1 Sep 2007, Ken Murchison wrote:
  
  I applied this patch to CVS, but I'm wondering why the original code 
  writes directly to the existing cyrus.index instead of a 
  cyrus.index.NEW, as is done in other places.  I don't know if this was 
  intentional by the previous authors, or an oversight.  I'm going to 
  investigate.
  
  Doesn't cyrus.index.NEW only kick in on expunge?
 
 I believe so.  If we also used it for appends, we wouldn't run into the 
 problem that Fastmail is seeing.  But, it may be too much of a 
 performance hit.

Speaking of which, is there any reason why cyrus.expunge isn't sorted
by UID?  We have to rewrite the entire cyrus.index each time we do an
expunge anyway.

Sorted by UUID would help enormously for things like not blatting the
\Seen state for expunged messages until they're expired, because you
could then just do a double-iterator over the two index format files.

I don't think that any of the code does the .NEW for appends because that
makes append O(N) where N is the mailbox size, and that would suck for
huge mailboxes.

Bron.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]



Re: *IMPORTANT* - bugfix sync_append_commit index breakage

2007-09-01 Thread Rob Mueller



I believe so.  If we also used it for appends, we wouldn't run into the
problem that Fastmail is seeing.  But, it may be too much of a
performance hit.



I don't think that any of the code does the .NEW for appends because that
makes append O(N) where N is the mailbox size, and that would suck for
huge mailboxes.


Definitely. Rewriting the cyrus.index file for every message delivered to a 
mailbox would be horrible, especially when some users have 100,000+ messages 
in a mailbox.



Speaking of which, is there any reason why cyrus.expunge isn't sorted
by UID?  We have to rewrite the entire cyrus.index each time we do an
expunge anyway.


I'm guessing it's because messages can be deleted + expunged in any order, 
so it's easy just to append the record for any message being expunged to the 
end of the cyrus.expunge file. If you had to keep it sorted, that would mean 
every time you expunged a message, you'd have to reread, resort, and rewrite 
the cyrus.expunge file. Generally the cyrus.expunge will be a lot smaller 
than the cyrus.index, but still annoying...


Rob



Re: *IMPORTANT* - bugfix sync_append_commit index breakage

2007-08-31 Thread Ken Murchison
I think your patch makes sense, but I'm not sure when The cyrus index 
file format has this clever ignore junk at the end until the exists 
count changes trick means.  I know we leave junk in the cache file as a 
result of delayed expunge which gets cleaned up later, but I'm pretty 
sure the the index file is always tightly packed.  The is extra space in 
the index header, but there shouldn't be any between the index records.



Bron Gondwana wrote:

We've had issues with broken indexes on the old master after a failover,
and puzzled for ages over the cause.  This is the reason behind the
SIGQUIT patch which would mitigate this by reducing the unfinished
business on a standard shutdown.

But still, the symptoms were weird.

I'm pretty sure I've figured out the cause, and the attached patch
(compile tested only) fixes it...  unfortunately it's a race condition
so it's hard to test.

Ken - it would be great if you can check this today and push it to CVS
if you agree with me.  It's a nasty one - especially since the symptoms
are offset index records right until something that's not a sync_server
append comes along and writes to the correct place in the index file
again.

On a more structural note - it would be great to re-factor the lower
level APIs (mailbox.c, mboxlist.c, etc) to provide everything that
sync_server needs manipulate their data structures and remove all the
low level fiddling copy-and-pasted stuff from the sync layer.  This
comes back a bit to David's comments today about the difficulty of
having patches flying around - I suspect he copied and pasted so he
wasn't throwing tentacles deep through the main codebase - but the
end result is more brittle software and poor abstraction.  We need
to avoid doing that if there's a bunch of us all hacking in our own
little patch space!

Bron.




Bugfix sync index append

After a failover, we sometimes get index files
with duplicate records in them and other nastiness.

The cyrus index file format has this clever ignore
junk at the end until the exists count changes trick.

sync_commit.c contains a copy and pasted index append 
logic, but it seeks to the end of the file rather than

start_offset + exists * record_size.  This is bogus and
means that at some random future stage a real record will
be overwritten and two of the old bogus record will still
exist.  Crap.

This patch fixes that, obviously.
Index: cyrus-imapd-2.3.9/imap/sync_commit.c
===
--- cyrus-imapd-2.3.9.orig/imap/sync_commit.c   2007-08-31 10:13:28.0 
-0400
+++ cyrus-imapd-2.3.9/imap/sync_commit.c2007-08-31 10:26:03.0 
-0400
@@ -481,6 +481,7 @@
 unsigned long newflagged;
 char  target[MAX_MAILBOX_PATH];
 int   n, r = 0;
+long   last_offset;
 struct txn *tid = NULL;
 modseq_t highestmodseq = 0;
 
@@ -577,9 +578,11 @@

 if (retry_writev(mailbox-cache_fd, cache_iovec, upload_list-count)  0)
 goto fail;
 
-lseek(mailbox-index_fd, mailbox-index_size, SEEK_SET);
+
+last_offset = mailbox-start_offset + mailbox-exists * mailbox-record_size;

+lseek(mailbox-index_fd, last_offset, SEEK_SET);
 if (retry_write(mailbox-index_fd, index_chunk,
-upload_list-count * INDEX_RECORD_SIZE)  0)
+upload_list-count * mailbox-record_size)  0)
 goto fail;
 
 



--
Kenneth Murchison
Systems Programmer
Project Cyrus Developer/Maintainer
Carnegie Mellon University


Re: *IMPORTANT* - bugfix sync_append_commit index breakage

2007-08-31 Thread David Carter

On Fri, 31 Aug 2007, Ken Murchison wrote:

I think your patch makes sense, but I'm not sure when The cyrus index 
file format has this clever ignore junk at the end until the exists 
count changes trick means.  I know we leave junk in the cache file as a 
result of delayed expunge which gets cleaned up later, but I'm pretty 
sure the the index file is always tightly packed.  The is extra space in 
the index header, but there shouldn't be any between the index records.


I think that the danger is that if sync_server gets shut down uncleanly 
(which I know was happening to Fastmail a lot at one point) then then you 
can end up with a bogus entry at the end of a cyrus.index file which is 
not overwritten by the next sync_append_commit() on that mailbox.


The race condition is that the exists count in the header can only be 
updated after the index record has been written.


An explicit seek using mailbox-exists is definitely more robust, although 
it probably doesn't help if power fails halfway through the fsync() on the 
cyrus.index file after both updates have been made (data=journal maybe?)


--
David Carter Email: [EMAIL PROTECTED]
University Computing Service,Phone: (01223) 334502
New Museums Site, Pembroke Street,   Fax:   (01223) 334679
Cambridge UK. CB2 3QH.


Re: *IMPORTANT* - bugfix sync_append_commit index breakage

2007-08-31 Thread David Carter

On Sat, 1 Sep 2007, Bron Gondwana wrote:


On a more structural note - it would be great to re-factor the lower
level APIs (mailbox.c, mboxlist.c, etc) to provide everything that
sync_server needs manipulate their data structures and remove all the
low level fiddling copy-and-pasted stuff from the sync layer.  This
comes back a bit to David's comments today about the difficulty of
having patches flying around - I suspect he copied and pasted so he
wasn't throwing tentacles deep through the main codebase - but the
end result is more brittle software and poor abstraction.


Pretty much. I wasn't expecting the replication code to ever get merged 
when I first wrote it. All of the really awful stuff should be in 
sync_commit.c. Those are the routines which create messages with 
predetermined UIDs and mailboxes with predetermined UniqueIDs.


--
David Carter Email: [EMAIL PROTECTED]
University Computing Service,Phone: (01223) 334502
New Museums Site, Pembroke Street,   Fax:   (01223) 334679
Cambridge UK. CB2 3QH.