Hi,

let me preface the following by saying that I'm an idiot, and I don't blame 
cyrus-imapd in the slightest. However, maybe there is potential for improved 
error handling here, so I'm going to report what happened last night.

We are (still) running cyrus-imapd 2.4.20. I have written a script that moves 
users from one partition to another. It's been working fine for a long time. 
The script lists the users on the source partition, checks if they are active, 
and if they aren't, it executes "rename user/USER user/USER TARGETPARTITION".
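
For illustration, the selection logic is roughly as follows. This is a simplified Python sketch, not the actual script: the function name, the is_active callback and the partition name "imap4" are hypothetical, and the sketch only builds the command strings instead of sending them to the server.

```python
def build_rename_commands(users, is_active, target_partition):
    """Return one 'rename' command per inactive user.

    users            -- iterable of user names found on the source partition
    is_active        -- callable returning True for users that must stay put
                        (hypothetical; how activity is determined is site-specific)
    target_partition -- name of the partition to move mailboxes to
    """
    commands = []
    for user in users:
        if is_active(user):
            continue  # active users are left on the source partition
        mailbox = "user/" + user
        # Same mailbox name twice plus a partition: the mailbox keeps its
        # name and only changes partition.
        commands.append("rename %s %s %s" % (mailbox, mailbox, target_partition))
    return commands


if __name__ == "__main__":
    # "imap4" is a made-up partition name used only for this example.
    for line in build_rename_commands(["aaa", "bbb"], lambda u: u == "aaa", "imap4"):
        print(line)
```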

Yesterday, I was confused and accidentally managed to run two instances of the 
script simultaneously overnight. This caused many error messages, and also the 
loss of three mailboxes. In most cases no harm appears to have been done. There 
are some "Mailbox already exists" messages from cases where the first instance
of the script had already begun to move a mailbox, but as far as I can tell that 
didn't do any damage. In the case of the three lost mailboxes, there are the 
following messages in the logfile:

Feb  6 02:03:06 xxx.rrz.uni-koeln.de imapv6[95539]: IOERROR: opening 
/var/spool/imap3/K/user/aaa/SOffice/Writer/172.: No such file or directory
Feb  6 02:03:06 xxx.rrz.uni-koeln.de imapv6[95539]: IOERROR: opening 
/var/spool/imap3/K/user/aaa/SOffice/Writer/172.: No such file or directory
Feb  6 02:36:39 xxx.rrz.uni-koeln.de imapv6[122884]: IOERROR: opening 
/var/spool/imap3/P/user/bbb/sent-mail/1.: No such file or directory
Feb  6 02:36:39 xxx.rrz.uni-koeln.de imapv6[122884]: IOERROR: opening 
/var/spool/imap3/P/user/bbb/sent-mail/1.: No such file or directory
Feb  6 04:49:39 xxx.rrz.uni-koeln.de imapv6[77903]: IOERROR: opening 
/var/spool/imap3/S/user/ccc/Templates/1.: No such file or directory
Feb  6 04:49:39 xxx.rrz.uni-koeln.de imapv6[77903]: IOERROR: opening 
/var/spool/imap3/S/user/ccc/Templates/1.: No such file or directory

/var/spool/imap3 is the source partition. My best guess is that for those three 
mailboxes the two concurrent renames ran at exactly the same time and hit a 
race condition in the mailbox locking(?). Anyway, these three mailboxes were 
just gone this morning. They were neither on the old partition nor on the new 
one, but interestingly mailboxes.db lists them on the new partition. I have 
recreated the mailboxes on disk, so all is fine now.

Again, this is clearly my fault, but perhaps better error handling could avoid 
such scenarios anyway? Perhaps it's already fixed in newer releases? If so 
that's one more incentive to finally upgrade :-)

I have now added a locking mechanism to my script so that it shouldn't be 
possible to run two instances anymore.
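
A minimal sketch of one way to do such locking, assuming Python on a Unix host; the real script may do it differently, and the lock file path and function name here are hypothetical. It takes a non-blocking exclusive flock() on a lock file, so a second instance exits immediately instead of running alongside the first.

```python
import fcntl
import os
import sys

# Hypothetical location; any path writable by the script's user would do.
LOCK_PATH = "/tmp/move-users.lock"


def acquire_single_instance_lock(path):
    """Return a file descriptor holding an exclusive lock on path,
    or None if another holder already has the lock."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # LOCK_NB makes flock() fail at once instead of waiting for the holder.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


if __name__ == "__main__":
    lock_fd = acquire_single_instance_lock(LOCK_PATH)
    if lock_fd is None:
        sys.exit("another instance is already running, exiting")
    # ... move the mailboxes here; the kernel releases the lock when the
    # process exits, even if it crashes, so no stale lock is left behind.
```

The advantage over a plain "does the file exist" check is that the lock cannot go stale: it disappears together with the process that holds it.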

Cheers
Sebastian
-- 
    .:.Sebastian Hagedorn - Weyertal 121 (Gebäude 133), Zimmer 2.02.:.
                 .:.Regionales Rechenzentrum (RRZK).:.
   .:.Universität zu Köln / Cologne University - ✆ +49-221-470-89578.:.
