Sebastian Hagedorn wrote:

Hi,

we are running the following setup under Red Hat Linux Advanced Server 3:

name       : Cyrus IMAPD
version    : v2.2.12-Invoca-RPM-2.2.12-1.ZAIK 2005/02/14 16:43:51
vendor     : Project Cyrus
support-url: http://asg.web.cmu.edu/cyrus
os         : Linux
os-version : 2.4.21-27.0.2.ELsmp
environment: Built w/Cyrus SASL 2.1.20
            Running w/Cyrus SASL 2.1.20
            Built w/Sleepycat Software: Berkeley DB 4.1.25: (August 21, 2003)
            Running w/Sleepycat Software: Berkeley DB 4.1.25: (August 21, 2003)
            Built w/OpenSSL 0.9.7a Feb 19 2003
            Running w/OpenSSL 0.9.7a Feb 19 2003
            CMU Sieve 2.2
            TCP Wrappers
            mmap = shared
            lock = fcntl
            nonblock = fcntl
            auth = unix
            idle = idled

In earlier versions of Cyrus we experienced problems where processes got stuck and caused subsequent connections to mailboxes to fail due to lock contention. Some work was done to solve this, but I wonder whether the fix is only cosmetic. It seems to me that processes still get stuck; they just no longer keep new connections from working.

I noticed that our server has an ever-increasing number of processes. I'm attaching a screenshot of the relevant Ganglia graph for the last month. Many imapd and pop3d processes have been running for a long time, i.e. since the middle of May:

[EMAIL PROTECTED] root]# ps -aef|grep pop3
cyrus     1588 22788  0 May13 ?        00:00:03 pop3d -s
cyrus     2810 22788  0 May13 ?        00:00:01 pop3d -s
cyrus    32464 22788  0 May13 ?        00:00:02 pop3d -s
cyrus     7941 22788  0 May13 ?        00:00:00 pop3d -s
cyrus     5331 22788  0 May14 ?        00:00:02 pop3d -s
cyrus     4319 22788  0 May14 ?        00:00:02 pop3d -s
cyrus     9054 22788  0 May14 ?        00:00:00 pop3d -s
cyrus    25309 22788  0 May14 ?        00:00:00 pop3d -s
cyrus     8176 22788  0 May14 ?        00:00:02 pop3d -s
cyrus    21482 22788  0 May14 ?        00:00:00 pop3d
...
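
A listing like the one above can be filtered mechanically. The following is a minimal sketch, not part of Cyrus: it assumes the `ps -ef` column layout shown in the post and assumes that ps prints a clock time (HH:MM) in the STIME column for recently started processes and a date such as "May13" for older ones. The function name long_running and the last two sample lines are hypothetical, added only for illustration.

```python
import re

# Matches the `ps -ef` layout shown above: UID PID PPID C STIME TTY TIME CMD.
PS_LINE = re.compile(
    r"^(?P<uid>\S+)\s+(?P<pid>\d+)\s+(?P<ppid>\d+)\s+\d+\s+"
    r"(?P<stime>\S+)\s+\S+\s+(?P<time>\S+)\s+(?P<cmd>.+)$"
)


def long_running(ps_output, command_prefix):
    """Return (pid, stime, cmd) for matching processes whose STIME is a date."""
    found = []
    for line in ps_output.splitlines():
        m = PS_LINE.match(line.strip())
        if m is None:
            continue  # header line, "..." or anything else that is not a process row
        if not m.group("cmd").startswith(command_prefix):
            continue
        # Assumption: a ":" in STIME means a clock time, i.e. a recent process.
        if ":" in m.group("stime"):
            continue
        found.append((int(m.group("pid")), m.group("stime"), m.group("cmd")))
    return found


# The first two rows are taken from the post; the last two are hypothetical.
SAMPLE = """\
cyrus     1588 22788  0 May13 ?        00:00:03 pop3d -s
cyrus    21482 22788  0 May14 ?        00:00:00 pop3d
cyrus    30001 22788  0 10:42 ?        00:00:00 pop3d -s
root     30050 29000  0 10:43 pts/0    00:00:00 grep pop3
"""

if __name__ == "__main__":
    for pid, stime, cmd in long_running(SAMPLE, "pop3d"):
        print(pid, stime, cmd)
```

Matching on the start of the command rather than grepping the whole line also keeps the grep process itself out of the result.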

All of them seem to be stuck somewhere in SSL, but ultimately in __read_nocancel (). I'll give two examples.

PID 1588:
(gdb) where
#0  0x006d1f0e in __read_nocancel () from /lib/tls/libc.so.6
#1  0x00c16427 in BIO_new_socket () from /lib/libcrypto.so.4
#2  0x00c143e2 in BIO_read () from /lib/libcrypto.so.4
#3  0x007b4c30 in ssl3_alert_code () from /lib/libssl.so.4
#4  0x007b4dcc in ssl3_alert_code () from /lib/libssl.so.4
#5  0x007b60cf in ssl3_read_bytes () from /lib/libssl.so.4
#6  0x007b6ffc in ssl3_get_message () from /lib/libssl.so.4
#7  0x007accab in ssl3_accept () from /lib/libssl.so.4
#8  0x007ac944 in ssl3_accept () from /lib/libssl.so.4
#9  0x007bbcaa in SSL_accept () from /lib/libssl.so.4
#10 0x007b780d in ssl23_get_client_hello () from /lib/libssl.so.4
#11 0x007b7712 in ssl23_accept () from /lib/libssl.so.4
#12 0x007bbcaa in SSL_accept () from /lib/libssl.so.4
#13 0x08051bc3 in shut_down ()
#14 0x0804dda3 in shut_down ()
#15 0x0804ce9d in ?? ()
#16 0x00000001 in ?? ()
#17 0x098eab90 in ?? ()
#18 0x00000000 in ?? ()
(gdb)


21482:
(gdb) where
#0  0x006f4f0e in __read_nocancel () from /lib/tls/libc.so.6
#1  0x00355427 in BIO_new_socket () from /lib/libcrypto.so.4
#2  0x003533e2 in BIO_read () from /lib/libcrypto.so.4
#3  0x0047ae23 in ssl23_read_bytes () from /lib/libssl.so.4
#4  0x00479c61 in ssl23_get_client_hello () from /lib/libssl.so.4
#5  0x00479712 in ssl23_accept () from /lib/libssl.so.4
#6  0x0047dcaa in SSL_accept () from /lib/libssl.so.4
#7  0x08051bc3 in shut_down ()
#8  0x0804dda3 in shut_down ()
#9  0x0804dba8 in shut_down ()
#10 0x0804cde9 in ?? ()
#11 0x095f74d0 in ?? ()
#12 0x0807e79c in config_need_data ()
#13 0x095a5978 in ?? ()
#14 0x0807fff6 in config_need_data ()
#15 0x0807e778 in config_need_data ()
#16 0x08101c40 in ?? ()
#17 0x00000000 in ?? ()
(gdb)
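
Both traces end in a blocking read reached from SSL_accept, which is what one would expect if the peer connected and then never sent the data the handshake is waiting for. The sketch below is not Cyrus or OpenSSL code; it only illustrates, with plain sockets in Python, how a read on a silent peer blocks unless a deadline is set. The helper names read_with_deadline and demo are hypothetical, and the traces alone do not establish whether pop3d sets any such timeout before the handshake.

```python
import socket


def read_with_deadline(sock, seconds, nbytes=1):
    """Read from sock, giving up after `seconds` instead of blocking forever."""
    sock.settimeout(seconds)
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        # Nothing arrived in time; without the timeout this recv() would
        # block until the peer sends data or closes the connection.
        return None


def demo():
    # socketpair() returns two connected sockets. `client` stands in for a
    # peer that connects and then stays silent.
    server, client = socket.socketpair()
    try:
        silent = read_with_deadline(server, 0.2)
        client.sendall(b"\x16")  # placeholder byte, just so there is data
        talkative = read_with_deadline(server, 0.2)
        return silent, talkative
    finally:
        server.close()
        client.close()


if __name__ == "__main__":
    print(demo())
```

The first read times out and returns None; the second returns the byte once the peer has actually sent something. A server that applies no deadline at this point keeps one process per silent client for as long as the TCP connection stays open.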

Fortunately these stuck processes don't hold any locks anymore! I understand that I can probably just kill them, but I wonder what the underlying cause of this problem is. Is it likely something in Cyrus or something in the libraries?

Is this only a problem with pop3d or with imapd as well? I can't reproduce your problem here. Is there some kind of proxy or webmail process which might be unfriendly?


--
Kenneth Murchison     Oceana Matrix Ltd.
Software Engineer     21 Princeton Place
716-662-8973 x26      Orchard Park, NY 14127
--PGP Public Key--    http://www.oceana.com/~ken/ksm.pgp
---
Cyrus Home Page: http://asg.web.cmu.edu/cyrus
Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
