On 11/04/2011 08:43 PM, Timo Sirainen wrote:
> On Sat, 2011-10-22 at 21:21 +0200, Gordon Grubert wrote:
>> Hello,
>>
>> our dovecot server crashes totally without any really useful
>> log messages. The error log can be found in the attachment.
>> The only way to get dovecot running again is a complete
>> system restart.
> 
> How often does it break? If really a "complete system restart" is needed
> to fix it, it doesn't sound like a Dovecot problem. Check if it's enough
> to stop dovecot and then make sure there aren't any dovecot processes
> lying around afterwards.
So far, the problem has occurred three times, most recently a few
days ago. The last "crash" happened at night, so we took the
opportunity to debug the system in detail.

You could be right that it's not a dovecot problem. Besides dovecot,
we found other processes that were hanging and could not be killed
with "kill -9". We also found something all of these processes had in
common: they hung while trying to access the mailbox volume. We
therefore repaired the filesystem and are now watching the system ...
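
For anyone who wants to check for the same symptom: processes that
ignore "kill -9" are typically in uninterruptible sleep (state "D"),
waiting on I/O. Below is a minimal sketch, assuming a Linux system with
/proc mounted, that lists such processes. The function names
parse_stat and find_uninterruptible are made up for this illustration;
they are not part of dovecot or of the tools we actually used.

```python
import os


def parse_stat(text):
    """Return (pid, command, state) from the contents of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces
    # or parentheses, so split on the first "(" and the last ")".
    open_idx = text.index("(")
    close_idx = text.rindex(")")
    pid = int(text[:open_idx].strip())
    comm = text[open_idx + 1:close_idx]
    # The state letter is the first field after the command name.
    state = text[close_idx + 1:].split()[0]
    return pid, comm, state


def find_uninterruptible(proc_root="/proc"):
    """List (pid, command) for every process currently in state D."""
    hung = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            # The process exited in the meantime or the file is unreadable.
            continue
        if state == "D":
            hung.append((pid, comm))
    return sorted(hung)


if __name__ == "__main__":
    for pid, comm in find_uninterruptible():
        print(pid, comm)
```

A process can be in state D briefly during normal I/O, so only entries
that stay in the list across several runs point to a stuck volume.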

>> Oct 11 09:55:23 mailserver2 dovecot: master: Error: service(imap):
>> Initial status notification not received in 30 seconds, killing the
>> process
>> Oct 11 09:56:23 mailserver2 dovecot: imap-login: Error: master(imap):
>> Auth request timed out (received 0/12 bytes)
> 
> Kind of looks like auth process is hanging. You could see if stracing it
> shows anything useful. Also are any errors logged about LDAP? Is LDAP
> running on the same server?
Dovecot authenticates against postfix and postfix has an LDAP
connection. The LDAP server is running on an external cluster. No
errors are reported there.

We hope that the filesystem error was the cause of the problem and
that repairing it has fixed it.

Best regards,
Gordon

