Hello, I run Cyrus IMAP 3.0.x with some private changes.
Sometimes, when I stop the master process, it uses 100% of one CPU core for 5 minutes. After the fifth minute, systemd enforces kill -9. When I attach to the master process, I see that some janitor is doing some work, but I have not checked the details. Has anybody experienced this?

I have very few users, but one of the users (me) uses many clients simultaneously: say, two IMAP clients making 4-6 connections in parallel, and three CalDAV clients making an estimated 3-6 connections in parallel. The httpd process is behind a proxy, and most of the time the proxy server manages to serialize the requests, so in practice a single httpd process handles them. At least, under normal circumstances I do not see a second running httpd process. Under normal circumstances I also see a single lmtpd process and many imapd processes.

On some days I observe that the IMAP client cannot obtain the list of new messages; it just times out. This could be because of deadlocks, but it could also be because the I/O is extremely slow on that particular day, in which case the problem is not within Cyrus. Sometimes I observe afterwards that the INBOX index is being rebuilt. Sometimes, after the INBOX index is rebuilt, things start working. So on such days I suspect that there is some deadlock. For instance, if there are two or more long-running lmtpd processes, I suspect a deadlock.

What approach can I use to find where the deadlock is, and how can I get rid of it? I can attach to a process with strace, get the current backtrace and variable values with gdb, and see (e.g. with lsof) which files are open in which mode. But I do not know what to look for. Or rather, when I try investigating, I almost always see a process rebuilding my INBOX index, and after a long wait the index is eventually rebuilt. How can I find out why the index was broken?

Greetings
Дилян