Re: [Dovecot] dbox redesign
On Thu, 12 Feb 2009, Allen Belletti wrote:

> I would add that having fewer, larger files should make backups much more feasible. There's a certain amount of overhead for each file

That's true for full backups. I don't defend Maildir, especially because it changes the filename, which makes it a new file for any backup software (which is usually not Maildir-aware, in my case).

> operation (especially for us GFS people!) and reducing the number of files will reduce that overhead.

Also, it makes partial recoveries problematic.

> Right now our backups (done via rsync) take a pretty scary amount of

Yep.

Bye,

-- 
Steffen Kaiser
Re: [Dovecot] dbox redesign
On Thu, 2009-02-12 at 11:29 +0100, Mikkel wrote:
> Hi Timo
>
> I have a few comments. Please just disregard them if I have misunderstood your design.
>
> Regarding your storage plan
> I find it very important that users can be stored in different locations because:

This you misunderstood. The mails of a single user are stored in one dbox directory, not the mails of all users.

> Regarding 7.
> I am very much for all the self-healing you describe.
> There is nothing worse than huge complex systems that fail just because of some minor error that could easily be fixed without manual intervention.
> But also I'm a little worried in this regard.
>
> Maildir is so robust that nothing can really go wrong.

Yes. If you don't care that much about performance, Maildir is going to be more reliable, especially when recovering from filesystem corruption.

> It should be very resilient to temporarily losing access to all files in this operation (could happen very often on NFS mounts).

I/O errors and such are treated differently from corrupted/missing files. So as long as reading gives an error, it doesn't try to repair anything.

> Also I imagine the self-healing going into loops if it doesn't understand what's going on.
> If the data changes due to manual intervention or only part of the file system can be accessed you could imagine the self-healing process trying again and again to fix something that isn't its job to fix.
> In that case it would be better if it just skipped the apparent failures.

I'm not really sure what you're thinking about here. Assuming there aren't bugs in the fixup code, it should be able to fix things. If someone manually goes and breaks things again, then sure, it fixes them again later, but there's really no automatic looping.

Also, Dovecot already does index file fixing if it notices corruption, so this won't be all that much different.

> If there is serious data corruption and you have only one file then all operations are paused while the self-healing is trying to figure out what went wrong

There will be multiple files even per user, but yes, if corruption is noticed then the user is blocked until the corruption is fixed.

> (and what happens if different servers decide to do self-healing on this one file at the same time?).

The same as if two processes in one server decide to self-heal: locking prevents it from happening.
Re: [Dovecot] dbox redesign
I would add that having fewer, larger files should make backups much more feasible. There's a certain amount of overhead for each file operation (especially for us GFS people!) and reducing the number of files will reduce that overhead. Right now our backups (done via rsync) take a pretty scary amount of time, and that will only get worse as the size of the mailstore (currently 200G) grows.

Personally I'm pretty excited about dbox.

Allen

Timo Sirainen wrote:
> On Wed, 2009-02-11 at 14:32 -0800, Seth Mattinen wrote:
> > Timo Sirainen wrote:
> > > This is about how to implement multiple msgs/file dbox format. The current v1.1's one msg/file design would stay pretty much the same and it would be compatible with this new design.
> >
> > Out of curiosity, what's the advantage to going to multiple messages per file? Wouldn't this have the same problems as mbox?
>
> Multiple per file, not everything in one file. As long as the file size is set "right", it's probably faster than one per file. We'll see :)

-- 
Allen Belletti
al...@isye.gatech.edu 404-894-6221 Phone
Industrial and Systems Engineering 404-385-2988 Fax
Georgia Institute of Technology
Re: [Dovecot] dbox redesign
Hi Timo

I have a few comments. Please just disregard them if I have misunderstood your design.

Regarding your storage plan

I find it very important that users can be stored in different locations because:

1. Discount users could be placed on cheap storage while others are offered premium service on expensive hardware
2. It's easy to scale if you just add another LUN from your SAN or mount from NAS
3. In order to avoid huge directories you can put users into subdirs, with each subdir containing only, say, 1000 users

All this is very easy to achieve in 1.1 because you can return individual storage dirs for indexes and data from the user db. I'm not sure from reading your post whether this will still be possible, but I believe it's a very important thing.

Regarding 7.

I am very much for all the self-healing you describe. There is nothing worse than huge complex systems that fail just because of some minor error that could easily be fixed without manual intervention.

But also I'm a little worried in this regard.

Maildir is so robust that nothing can really go wrong. But here you have index files and data files located in different places. Imagine the index file being on one NFS mount whilst the data resides on another. Or the administrator purposely loading a different index file or data file from a backup.

Worst case scenario is that the self-healing mistakes a manual operation for a failure and breaks something. It should be very resilient to temporarily losing access to all files in this operation (could happen very often on NFS mounts).

Also I imagine the self-healing going into loops if it doesn't understand what's going on. If the data changes due to manual intervention or only part of the file system can be accessed you could imagine the self-healing process trying again and again to fix something that isn't its job to fix. In that case it would be better if it just skipped the apparent failures.

Timo wrote:
> I'm also wondering if it's better for each mailbox to have its separate dovecot.index.cache file or if there should be one cache file for the map index.

I think you should consider more files as the general choice (not only regarding cache files).

Imagine many dovecot servers accessing the same storage simultaneously. I figure it would be a lot easier if they weren't all trying to read/update one essential file at the same time (with only one file, load can't be spread across multiple mounts and everything goes down if the mount with the essential file is inaccessible).

If there is serious data corruption and you have only one file then all operations are paused while the self-healing is trying to figure out what went wrong (and what happens if different servers decide to do self-healing on this one file at the same time?).

With one file per maildir only a small portion of the users are affected, the load is spread, and really bad file corruption doesn't break everything for thousands of users.

Other than that I'm just really glad that dbox is progressing. I consider it the feature. Dbox is the email administrator's wet dream.

I'm already dreaming of completely avoiding the scalability issues of large Maildirs (which is the biggest challenge today in my opinion) and reducing the IO. Buying more IO is an order of magnitude more expensive than getting more RAM or CPU power (and dovecot barely needs any RAM and CPU anyway).

Best wishes,
Mikkel
Re: [Dovecot] dbox redesign
On Wed, 2009-02-11 at 17:35 -0500, Timo Sirainen wrote:
> On Wed, 2009-02-11 at 14:32 -0800, Seth Mattinen wrote:
> > Timo Sirainen wrote:
> > > This is about how to implement multiple msgs/file dbox format. The current v1.1's one msg/file design would stay pretty much the same and it would be compatible with this new design.
> >
> > Out of curiosity, what's the advantage to going to multiple messages per file? Wouldn't this have the same problems as mbox?
>
> Multiple per file, not everything in one file. As long as the file size is set "right", it's probably faster than one per file. We'll see :)

Also there are no locking issues, since reading doesn't require locking and write locks are very short lived. Corruption isn't possible because data is never copied within a file. A crash can happen at any point and Dovecot will be able to recover from it 100%. The worst that can happen is that some extra garbage is left lying around for some time, wasting disk space.
Re: [Dovecot] dbox redesign
On Wed, 2009-02-11 at 14:32 -0800, Seth Mattinen wrote:
> Timo Sirainen wrote:
> > This is about how to implement multiple msgs/file dbox format. The current v1.1's one msg/file design would stay pretty much the same and it would be compatible with this new design.
>
> Out of curiosity, what's the advantage to going to multiple messages per file? Wouldn't this have the same problems as mbox?

Multiple per file, not everything in one file. As long as the file size is set "right", it's probably faster than one per file. We'll see :)
Re: [Dovecot] dbox redesign
Timo Sirainen wrote:
> This is about how to implement multiple msgs/file dbox format. The current v1.1's one msg/file design would stay pretty much the same and it would be compatible with this new design.

Out of curiosity, what's the advantage to going to multiple messages per file? Wouldn't this have the same problems as mbox?

~Seth
[Dovecot] dbox redesign
This is about how to implement multiple msgs/file dbox format. The current v1.1's one msg/file design would stay pretty much the same and it would be compatible with this new design.

dbox directories with multiple msgs/file would be like:

~/dbox/storage/ has the actual mail data for all mailboxes
~/dbox/mailboxes/ has subdirectories containing mailboxes and their indexes

Also, since dbox already supports single msg per file, those files would be stored in the mailboxes/ directory. So the idea would be that either you use multiple msgs per file using a global storage, or you use single msg per file without a global storage (or it's also possible to be in a mixed setup with some mails in storage/ and some in mailboxes/, mainly to allow migration between those configurations).

The storage/ directory would have a new "map index", which is a regular Dovecot index (dovecot.index and dovecot.index.log). So the mailbox index would point to mails using an intermediary "map UID". This way, if mails are moved to another file, only the map index needs to be updated.

GUID would be a globally unique 128 bit ID for messages. So if map indexes get corrupted for any reason, it's possible to rebuild them by finding the mails using GUIDs.

v1.1 dbox has this "dbox.index" file which I was originally planning on using with multiple msgs/file. It had complex file range locking stuff. Now I'm thinking that it's pretty much useless. The only reason for its existence with the new design is for listing metadata for files converted from Maildir.

Map index record would contain:

 - 32 bit map UID
 - 8 bit flags (MAIL_DELETED flag = message marked as expunged)
 - 8 bit unused wasted space
 - 16 bit refcount
 - 32 bit file sequence
 - 32 bit file offset
 --> total 128 bits/msg

Mailbox index:

 - IMAP UID, flags, keywords, etc.
 - 32 bit map UID
 - 128 bit GUID

dbox file metadata:

 - 128 bit GUID
 - size, vsize, received time, saved time, etc.
 - initial mailbox name (if all indexes get trashed, we can still figure out at least one mailbox where to put the mail. copies would get lost though.)
 (- no map UID, no imap UID)

How to save a message with multiple msgs/file:

1. Find dbox file where to append to:
   1.1. Look up the last message from map index
   1.2. Is the file "too old"? (or doesn't exist at all)
        - Yes -> Create new dbox file
   1.3. Is the file "too large"?
        - Yes -> Look at the previous file (one sequence less) and goto 1.2.
   1.4. Try to lock the file.
        - Fail -> Look at prev file and goto 1.2.

Now we have a locked/new dbox file where we can write to. Because step 1.4. only tries to lock the file, there's no waiting on locks. This also means that if e.g. two processes are writing new messages rapidly they may be appending actively to two different files. I don't think that's a problem, better than waiting for locks.

2a) We're using an existing file and we need to find the append offset.

Since we found the file by finding the last msg in the file, we also know the last message's offset. I wasn't really planning on saving the message sizes in the index file, so to get the append offset I guess it needs to do an extra read on the last msg's header to find the size and skip over it. Hmm. Or would it be less disk I/O to store the size in the index so it could be found directly? I'm not really sure..

In any case, after we find the append offset, check to see if it's at EOF. If not, that means that either another process just saved a new message there or a process crashed previously and left garbage lying around. Refresh the map index to see if this file+offset exists in it. If not, truncate the file and just continue writing there. If it exists, figure out the new append offset and see again if the file limit would be reached. If the file would become too large, unlock the file and goto step 1.

2b) We're writing to a new file. No need to worry about anything in 2a)

3. Write the message and its metadata to dbox file (including generated 128 bit GUID).

4. Assign map UIDs for the written mails and write APPEND records to map index's transaction log. The record would contain the map UID, file seq, offset, refcount=1. The transaction is saved with a "weak" flag (wonder if there's a better name for this) and its offset is remembered.
   - If we're creating a new dbox file, it's assigned the file seq and rename()d to the final file name while the map index is locked.

5. Write APPEND record to mailbox index's transaction log with IMAP UID, map UID and GUID (and flags, keywords, etc).

6. Write "commit offset=x" record where x is the offset remembered in step 4. This marks step 4's weak transaction as being fully finished.

7. dbox file is unlocked (if we weren't creating a new file).

When reading the index and we see a weak transaction without a commit record, call a resolve() function in dbox code. It finds the dbox file in the weak transaction and tries to lock it. If it can't lock it, it (probably) means that there's still a process
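
To make the record layout and the file-selection steps (1.1 to 1.4) in this message concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not Dovecot's implementation: the little-endian byte order, the `MAIL_DELETED` bit value, the function names, and the `files` dictionary standing in for on-disk state are all hypothetical.

```python
import struct

# Field order follows the record described in the message:
# map UID (32), flags (8), unused (8), refcount (16),
# file sequence (32), file offset (32) = 128 bits.
# Little-endian order and the flag value are assumptions of this sketch.
MAP_RECORD = struct.Struct("<IBBHII")
MAIL_DELETED = 0x01  # hypothetical bit value


def pack_map_record(map_uid, flags, refcount, file_seq, file_offset):
    """Pack one map index record into 16 bytes; the unused byte is zero."""
    return MAP_RECORD.pack(map_uid, flags, 0, refcount, file_seq, file_offset)


def unpack_map_record(data):
    """Unpack a 16-byte record back into named fields."""
    map_uid, flags, _unused, refcount, file_seq, file_offset = MAP_RECORD.unpack(data)
    return {
        "map_uid": map_uid,
        "flags": flags,
        "refcount": refcount,
        "file_seq": file_seq,
        "file_offset": file_offset,
    }


def pick_append_file(last_seq, files, try_lock, max_size, max_age, now):
    """Steps 1.1-1.4: walk backwards from the file holding the last message.

    `files` maps file sequence -> {"created": ..., "size": ...} and
    `try_lock` is a non-blocking lock attempt; both are stand-ins.
    Returns a locked file sequence, or None if a new file should be created.
    """
    seq = last_seq
    while True:
        info = files.get(seq)
        # 1.2: file missing or "too old" -> create a new dbox file
        if info is None or now - info["created"] > max_age:
            return None
        # 1.3: "too large" -> look at the previous file
        if info["size"] >= max_size:
            seq -= 1
            continue
        # 1.4: only *try* to lock, never wait; on failure look at the previous file
        if try_lock(seq):
            return seq
        seq -= 1
```

Because the lock attempt never blocks, two writers that collide simply end up appending to different files, which matches the trade-off described in the message.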
Re: [Dovecot] dbox redesign
On Saturday, 19 May 2007, Timo Sirainen wrote:
> 1) Have another human readable mailbox ID <-> name mapping file which is used if the binary index is corrupted. If mailboxes are created/deleted/renamed often, this would just slow things down. Might be a good idea optionally though.

The mailbox structure is usually not changed that often. Maybe just provide a way to dump/export the current mapping to a specially formatted text file and a way to manually load/import a provided dump file. This way, administrators can configure daily cron jobs to dump the current mailbox state, and if a mapping really gets lost, a "pretty good" mapping could be reconstructed without any runtime penalty.

> 2) If the ID <-> name mapping is lost, the mailboxes could be created using those IDs as their names.

Yes, for example, with the option to overwrite this synthesized mapping with the latest dump.

Greetings,

Gunter

-- 
PGP encryption for emails welcome :-) PGP: 0x1128F25F
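
The dump/import idea in this message could look roughly like the following Python sketch. The text format (one `ID<TAB>name` pair per line) and the function names are hypothetical choices made for illustration; the thread does not define a format, and the sketch assumes mailbox names contain no line breaks.

```python
def dump_mapping(mapping):
    """Serialize a {mailbox_id: name} dict as one "ID<TAB>name" line per mailbox."""
    lines = [f"{mailbox_id}\t{name}" for mailbox_id, name in sorted(mapping.items())]
    return "\n".join(lines) + "\n"


def load_mapping(text):
    """Parse a dump produced by dump_mapping() back into a dict."""
    mapping = {}
    for line in text.split("\n"):
        if not line.strip():
            continue  # skip blank lines, including the trailing one
        mailbox_id, name = line.split("\t", 1)
        mapping[int(mailbox_id)] = name
    return mapping
```

A cron job would write the output of `dump_mapping()` to a file once a day; after a lost mapping, `load_mapping()` on the latest dump gives the "pretty good" reconstruction described above, at no cost during normal operation.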
Re: [Dovecot] dbox redesign
On Wed, 2007-05-16 at 20:27 +0200, Gunter Ohrner wrote:
> On Wednesday, 16 May 2007, Timo Sirainen wrote:
> > > Yes, I think treating mailboxes similarly to keywords is ideal. There
> > Except if you want to handle some mailboxes in a special way it's easier if they're separated on disk. Such as renaming or deleting mailboxes is a lot easier.They're based on filtering rules. I don't
> think they support "copying"
> messages. So the virtual folders are easily rebuilt by just re-applying the filters into all the messages.
>
> Not necessarily, if you add one level of indirection, simply numbering the mailboxes by index numbers internally and providing a number/name mapping somewhere. This way, a mailbox can be renamed easily simply by updating the map, and might be deleted by removing the map entry. Stale index numbers may be left in the messages and might be cleaned up the next time a message's folder list is updated or messages are expunged.

Right. This would also make it use less space inside the dbox files.

There already exists a mailbox list index in v1.1 which contains mailbox ID <-> name mappings. But I'm still a bit concerned about its stability. There are two things that could be done:

1) Have another human readable mailbox ID <-> name mapping file which is used if the binary index is corrupted. If mailboxes are created/deleted/renamed often, this would just slow things down. Might be a good idea optionally though.

2) If the ID <-> name mapping is lost, the mailboxes could be created using those IDs as their names. That would be a lot better than just having all the mails merged into a single mailbox.

As additional help, there could be a couple of built-in mailbox IDs for INBOX, Trash and Drafts. Perhaps that could be admin-configurable, but then again adding new IDs could make it conflict with existing ones. Perhaps just a single 1=INBOX would be enough..

The mailbox IDs could have a validity number as well, similar to UIDVALIDITY for message UIDs. That would make sure that it's safe to use the validity+ID combination to uniquely and permanently identify a mailbox, even if the mailbox list mapping was completely rebuilt (in that case it would get a new validity).
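
One way to read the validity+ID idea together with option 2 is sketched below in Python. This is an interpretation for illustration only: `MailboxKey`, `BUILTIN_IDS`, `resolve_mailbox_name()` and the choice to fall back to the bare ID as the name are all hypothetical, and the thread does not say how a stale validity should be handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MailboxKey:
    """A mailbox reference as stored with a message: validity + ID."""
    validity: int
    mailbox_id: int


# The message suggests a single built-in ID might be enough.
BUILTIN_IDS = {1: "INBOX"}


def resolve_mailbox_name(key, mapping, current_validity):
    """Return a mailbox name for `key`.

    `mapping` is the current {mailbox_id: name} list. It is trusted only
    when the key's validity matches the mapping's validity; after a rebuild
    (new validity) an old ID may no longer mean the same mailbox.
    """
    if key.validity == current_validity and key.mailbox_id in mapping:
        return mapping[key.mailbox_id]
    if key.mailbox_id in BUILTIN_IDS:
        return BUILTIN_IDS[key.mailbox_id]
    # Option 2 from the message: use the ID itself as the mailbox name,
    # rather than merging everything into a single mailbox.
    return str(key.mailbox_id)
```

For example, a message tagged with validity 2 and ID 7 would resolve to the mailbox named "7" once the mapping has been rebuilt with validity 3, instead of being filed under whatever mailbox now happens to use ID 7.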
Re: [Dovecot] dbox redesign
On Wednesday, 16 May 2007, Gunter Ohrner wrote:
> > mailboxes is a lot easier.They're based on filtering rules. I don't
> think they support "copying"
> messages. So the virtual folders are easily rebuilt by just re-applying the filters into all the messages.

Whoops, this junk should not have been in the message... Looks as if I accidentally middle-clicked somehow... :-/

Greetings,

Gunter

-- 
PGP encryption for emails welcome :-) PGP: 0x1128F25F
Re: [Dovecot] dbox redesign
On Wednesday, 16 May 2007, Timo Sirainen wrote:
> > Yes, I think treating mailboxes similarly to keywords is ideal. There
> Except if you want to handle some mailboxes in a special way it's easier if they're separated on disk. Such as renaming or deleting mailboxes is a lot easier.They're based on filtering rules. I don't
think they support "copying"
messages. So the virtual folders are easily rebuilt by just re-applying the filters into all the messages.

Not necessarily, if you add one level of indirection, simply numbering the mailboxes by index numbers internally and providing a number/name mapping somewhere. This way, a mailbox can be renamed easily simply by updating the map, and might be deleted by removing the map entry. Stale index numbers may be left in the messages and might be cleaned up the next time a message's folder list is updated or messages are expunged.

Greetings,

Gunter

-- 
PEOPLE'S WHOLE LIVES *DO* PASS IN FRONT OF THEIR EYES BEFORE THEY DIE. THE PROCESS IS CALLED 'LIVING'. -- (Terry Pratchett, The Last Continent)
PGP-encrypted mails preferred!
Re: [Dovecot] dbox redesign
On Wed, 2007-05-16 at 07:47 -0400, Charles Marcus wrote:
> >> Although one possibility would be to treat mailboxes a bit similarly to keywords. So that when a message is copied to another mailbox, the message in dbox file is updated to contain information that it exists in such and such mailboxes. Hmm. Perhaps that would be good enough, yes.
>
> > Yes, I think treating mailboxes similarly to keywords is ideal. There really is no reason to physically separate mailboxes on disk. All that is needed is this logical separation if it can be done in a reliable way.
> >
> > Or maybe track this in mailbox-specific index files, and also have a corresponding text file that stores a list of messages that are contained in that mailbox... similar to maildir's dovecot-uidlist file. Then if you lose the index you can rebuild the index from the text file.
>
> This sounds suspiciously like 'virtual folders', that are supported by both Evolution and Thunderbird... how do they do it?

They're based on filtering rules. I don't think they support "copying" messages. So the virtual folders are easily rebuilt by just re-applying the filters to all the messages.
Re: [Dovecot] dbox redesign
>> Would be nice if copying a message from one mailbox to another wouldn't require actually reading+writing the whole message contents. But I can't really figure out how to implement this without requiring that there is only a single dbox storage which contains the mails for all the mailboxes, and the mailboxes themselves are just Dovecot's index files containing pointers to the dbox storage.
>>
>> The problem with having everything in one storage is that if the index files are broken, the messages can't be placed into correct mailboxes anymore.
>>
>> Although one possibility would be to treat mailboxes a bit similarly to keywords. So that when a message is copied to another mailbox, the message in dbox file is updated to contain information that it exists in such and such mailboxes. Hmm. Perhaps that would be good enough, yes.

> Yes, I think treating mailboxes similarly to keywords is ideal. There really is no reason to physically separate mailboxes on disk. All that is needed is this logical separation if it can be done in a reliable way.
>
> Or maybe track this in mailbox-specific index files, and also have a corresponding text file that stores a list of messages that are contained in that mailbox... similar to maildir's dovecot-uidlist file. Then if you lose the index you can rebuild the index from the text file.

This sounds suspiciously like 'virtual folders', that are supported by both Evolution and Thunderbird... how do they do it?

-- 
Best regards,

Charles
Re: [Dovecot] dbox redesign
On Wed, 2007-05-16 at 06:40 -0400, Bill Boebel wrote:
> > Although one possibility would be to treat mailboxes a bit similarly to keywords. So that when a message is copied to another mailbox, the message in dbox file is updated to contain information that it exists in such and such mailboxes. Hmm. Perhaps that would be good enough, yes.
>
> Yes, I think treating mailboxes similarly to keywords is ideal. There really is no reason to physically separate mailboxes on disk. All that is needed is this logical separation if it can be done in a reliable way.

Except if you want to handle some mailboxes in a special way it's easier if they're separated on disk. For example, renaming or deleting mailboxes is a lot easier.

> Or maybe track this in mailbox-specific index files, and also have a corresponding text file that stores a list of messages that are contained in that mailbox... similar to maildir's dovecot-uidlist file. Then if you lose the index you can rebuild the index from the text file.

Except that such a mailbox-messagelist file could also be counted as an "index file", and losing it again loses the messages :) That's why I thought saving the mailbox name in the message file's headers would be better. If you then lose the mailbox name, you most likely have lost the message itself as well. Also it makes it easier to restore individual messages from backups.
Re: [Dovecot] dbox redesign
On Sat, May 12, 2007 9:10 am, Timo Sirainen <[EMAIL PROTECTED]> said:

> Fast copying
>
> Would be nice if copying a message from one mailbox to another wouldn't require actually reading+writing the whole message contents. But I can't really figure out how to implement this without requiring that there is only a single dbox storage which contains the mails for all the mailboxes, and the mailboxes themselves are just Dovecot's index files containing pointers to the dbox storage.
>
> The problem with having everything in one storage is that if the index files are broken, the messages can't be placed into correct mailboxes anymore.
>
> Although one possibility would be to treat mailboxes a bit similarly to keywords. So that when a message is copied to another mailbox, the message in dbox file is updated to contain information that it exists in such and such mailboxes. Hmm. Perhaps that would be good enough, yes.

Yes, I think treating mailboxes similarly to keywords is ideal. There really is no reason to physically separate mailboxes on disk. All that is needed is this logical separation if it can be done in a reliable way.

Or maybe track this in mailbox-specific index files, and also have a corresponding text file that stores a list of messages that are contained in that mailbox... similar to maildir's dovecot-uidlist file. Then if you lose the index you can rebuild the index from the text file.

Bill
[Dovecot] dbox redesign
I don't think anyone uses dbox currently, so the whole format could still be redesigned. So I was thinking about doing two major changes:

1. Rely on index files a lot more. The flags are already stored in index files, so there's no need to waste I/O updating them to dbox files all the time. They could still be updated (if indexes get deleted, the flags aren't all gone), but less often.

2. Require fcntl() locking. Currently dbox uses dotlocks, which is slow.

Cydir could be a good alternative also once index file code is made a bit more robust. Perhaps I could implement single instance attachments for cydir too..

Locking
=======

The current dbox "index" file would be gone. It's pretty useless. Replace it with a whole new index file. Or perhaps it should be called "locks" file or something.

The locks file would contain records: . So something like: 1 4645a60f 0 2 N 3 D

That would mean that the first file is locked by some process, either for appending or expunging. The 2nd and 3rd files aren't locked. New messages can't be appended to 2nd file anymore. 3rd file is already deleted and this record needs to be removed when rewriting the file.

Locking a dbox file for either appends or expunges is done like:

1. See if timestamp is zero
   - If not, see if it's older than .. let's say a day or so ..
     a) Yes: Continue to 2.
     b) No: Assume the file is locked
2. Do fcntl() byte range lock over the record line in the locks file.
   - If it failed, the record is locked
3. Write the timestamp.
4. Compare stat() and fstat() inodes to see if the file was rebuilt
   - If yes, reopen the file and goto 1
5. File is now locked. Do the append/expunge.
6. Write timestamp to zero.
7. Unlock the byte range.

If a file is locked, append will try another file and expunge will mark the message as expunged instead of actually expunging it yet.

Note that the locks file is read without locking. This is safe because data is never moved within the file, and it doesn't matter if the timestamp isn't always read correctly. The timestamp check is only an optimization. Actually I'm not sure if it would be better not to have the timestamp at all.

Deleted records will stay in the file until the file is rebuilt. If a deleted record is noticed in the file, the process tries to lock the whole locks file. If it succeeds, it proceeds with writing the non-deleted records to a temporary file and rename()ing it over the locks file.

Appending
=========

1. Find the first file in the locks file that has appendable=0
   - If no such file was found, go to "create a new file" logic as described below
2. Lock the file record
3. Verify from the file's headers that this file can actually be appended to
   - If messages have been expunged from a dbox file, it can't be safely appended to anymore.
   - Other reasons include e.g. configurable max. file size and daily rotations
4. Write the mail
5. Lock locks file's header
6. Get the UID from "next uid" field and update it
7. Unlock the header
8. Unlock the file record
9. Update index file

Create a new file logic:

1. Create a temporary file
2. Write the messages there
3. Lock locks file's header (including fstat() / stat() rebuild check)
4. See what the latest file ID is in the file
5. rename() temp file to msg.
6. Lock locks file for the range of the to-be-written record below
7. Write the new record to locks file
8. Go to step 6 in the original append logic

Syncing / expunging
===================

If locks file's header's "next uid" doesn't match the one currently in index file, the appending crashed between steps 6 and 9. Find the new message(s) and append them to index file.

If the locks file is completely gone, rebuild it by going through all the msg.* files in the directory.

If "expunge counter" (see below) doesn't match in locks file's header vs. index file header, go through all the msg.* files to see if a message exists in multiple files. If found, remove the duplicates.

Typically neither of the above happens, so the only thing to do here is to write changes from index file to dbox files. This may mean flag changes once in a while, but most importantly expunges will always be synced.

Initially figure out what files require expunging. Try to create a single lock range that includes all of them. If there are non-zero lock timestamps in that range, create multiple ranges.

If a file couldn't be locked, the expunge is done by updating expunge-flag in the file. This can be done without locking (see below for flag updates).

If a file was successfully locked, the expunging is done by:

1. Update "expunge offset" in file header to the offset of the first expunged message. If expunge offset is non-zero, the file is treated as non-appendable. Also when rebuilding and finding the same message from multiple files, this field is used to figure out which file should be truncated.

2. Copy the rest of the non-expunged messages to a new temporary file and add the file to locks file using
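
The record-locking procedure from the "Locking" section of this message (timestamp check, fcntl() byte range lock, stat()/fstat() inode comparison) can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the proposed implementation: the record layout used here (8 hex digits of timestamp plus a newline, 9 bytes per record) is purely hypothetical, as are the function names and the exact one-day staleness threshold.

```python
import fcntl
import os
import time

RECORD_SIZE = 9              # hypothetical layout: 8 hex digits + "\n"
STALE_AFTER = 24 * 60 * 60   # "let's say a day or so"


def try_lock_record(fd, index, now=None):
    """Steps 1-3: return True if record `index` is now locked by us."""
    now = int(time.time()) if now is None else now
    offset = index * RECORD_SIZE
    # Step 1: the timestamp is read without locking; it is only an optimization.
    stamp = int(os.pread(fd, 8, offset).decode("ascii"), 16)
    if stamp != 0 and now - stamp < STALE_AFTER:
        return False  # recent timestamp: assume the file is locked
    # Step 2: non-blocking byte range lock over this record only.
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, RECORD_SIZE, offset, os.SEEK_SET)
    except OSError:
        return False  # another process holds the lock
    # Step 3: write the timestamp.
    os.pwrite(fd, b"%08x" % now, offset)
    return True


def unlock_record(fd, index):
    """Steps 6-7: zero the timestamp, then release the byte range."""
    offset = index * RECORD_SIZE
    os.pwrite(fd, b"00000000", offset)
    fcntl.lockf(fd, fcntl.LOCK_UN, RECORD_SIZE, offset, os.SEEK_SET)


def file_was_rebuilt(fd, path):
    """Step 4: has `path` been replaced since `fd` was opened?"""
    try:
        on_disk = os.stat(path)
    except FileNotFoundError:
        return True
    opened = os.fstat(fd)
    return (on_disk.st_dev, on_disk.st_ino) != (opened.st_dev, opened.st_ino)
```

A caller would run `try_lock_record()`, then `file_was_rebuilt()`; if the locks file was rebuilt it would release the lock, reopen the file and start again from step 1, otherwise it would do the append or expunge and finish with `unlock_record()`. Note that fcntl() locks are per process, so two locks taken by the same process never conflict with each other.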