Hi everybody.

I've been hit by a bug in vdelivermail (in maildirquota.c, to be precise), and I want to share the fix I've found.

I've recently upgraded to vpopmail 5.4.33, and since then vdelivermail occasionally (once or twice a day) starts looping and eating all the CPU. An strace on the offending instance shows:

    read(5, 0xf5e4317, 2963227893) = -1 EINVAL (Invalid argument)
    read(5, 0xf5e4316, 2963227894) = -1 EINVAL (Invalid argument)
    read(5, 0xf5e4315, 2963227895) = -1 EINVAL (Invalid argument)
    read(5, 0xf5e4314, 2963227896) = -1 EINVAL (Invalid argument)
    read(5, 0xf5e4313, 2963227897) = -1 EINVAL (Invalid argument)
    read(5, 0xf5e4312, 2963227898) = -1 EINVAL (Invalid argument)
    read(5, 0xf5e4311, 2963227899) = -1 EINVAL (Invalid argument)
    read(5, 0xf5e4310, 2963227900) = -1 EINVAL (Invalid argument)
    read(5, 0xf5e430f, 2963227901) = -1 EINVAL (Invalid argument)
    read(5, 0xf5e430e, 2963227902) = -1 EINVAL (Invalid argument)
    read(5, 0xf5e430d, 2963227903) = -1 EINVAL (Invalid argument)
    read(5, 0xf5e430c, 2963227904) = -1 EINVAL (Invalid argument)
    ...
    ...

File descriptor 5 of that process points at the maildirsize file of the mailbox, and it is marked as "deleted" in /proc/<procid>/fd.

We store the mail on a NetApp NFS share, without locking for performance reasons, and I think I've found the problem: the file is deleted by someone else while vdelivermail is reading it. In maildirquota.c, function maildirsize_read, there is a while loop that reads:

    while (l)
    {
        n=read(f, p, l);
        if (n < 0)
        {

But n is declared as an unsigned 64-bit integer (storage_t), so even when read() returns a negative value (an error) the "if" is never triggered.
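
The effect can be reproduced in isolation. Below is a minimal sketch, assuming storage_t is an unsigned 64-bit type (the typedef is a stand-in, not copied from the vpopmail headers); the function names are hypothetical and exist only for illustration. It also shows why the length in the strace output grows by one on every call: subtracting the wrapped-around value (all bits set) from an unsigned length is the same as adding 1. The buffer address moving down by one per call is consistent with the matching "p += n".

```c
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: stand-in for vpopmail's storage_t (unsigned, 64 bit). */
typedef uint64_t storage_t;

/* Hypothetical helper: the check as it is written in the loop above.
 * read() returns -1 on error, but stored in an unsigned variable it
 * becomes a huge positive number, so "n < 0" can never be true. */
int unsigned_check_detects_error(int fd, char *buf, unsigned len)
{
    storage_t n = read(fd, buf, len);
    return n < 0;
}

/* Hypothetical helper: the same check with a signed result type. */
int signed_check_detects_error(int fd, char *buf, unsigned len)
{
    ssize_t nr = read(fd, buf, len);
    return nr < 0;
}

/* Hypothetical helper: what "l -= n" does to the remaining length
 * after a failed read when n is unsigned. */
unsigned length_after_failed_read(unsigned l)
{
    storage_t n = (storage_t)(ssize_t)-1; /* failed read(), as stored in n */
    l -= n;                               /* wraps around: equals l + 1 */
    return l;
}
```

Calling either check with an invalid descriptor such as -1 makes read() fail immediately, which is enough to compare the two. A compiler run with extra warnings enabled may flag the "n < 0" comparison as always false, which is exactly the bug.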

So I've made this patch:

--- maildirquota.c.orig 2014-01-31 12:21:22.000000000 +0100
+++ maildirquota.c 2014-01-31 12:08:47.000000000 +0100
@@ -337,7 +337,6 @@
 int f;
 char *p;
 unsigned l;
-storage_t n;
 int first;
 int ret = 0;
 
@@ -360,15 +359,16 @@
 
 	while (l)
 	{
-		n=read(f, p, l);
-		if (n < 0)
+		ssize_t nr;
+		nr=read(f, p, l);
+		if (nr < 0)
 		{
 			close(f);
 			return (-1);
 		}
-		if (n == 0) break;
-		p += n;
-		l -= n;
+		if (nr == 0) break;
+		p += nr;
+		l -= nr;
 	}
 	if (l == 0 || ret) /* maildir too big */
 	{

which fixes the problem. Any chance of incorporating the fix in the next version?

Thanks
-- 
Simone Lazzaris
QCom S.p.A.