Bug#680363: cyrus-imapd-2.2: imapd gets mad falling into endless loop

2012-07-11 Thread Henrique de Moraes Holschuh
On Tue, 10 Jul 2012, Kiss Gabor (Bitman) wrote:
> #0  0xb7754424 in __kernel_vsyscall ()
> #1  0xb720f96b in poll () from /lib/i686/cmov/libc.so.6
> #2  0xb4d43444 in ldap_int_select (ld=0x91a8638, timeout=0x0) at os-ip.c:1098
> #3  0xb4d2cdbb in wait4msg (ld=0x91a8638, msgid=4, all=0, timeout=0x0, 
> result=0x91afb64) at result.c:335
> #4  ldap_result (ld=0x91a8638, msgid=4, all=0, timeout=0x0, result=0x91afb64) 
> at result.c:120
> #5  0xb4b7091e in ?? () from /lib/libnss_ldap.so.2
> #6  0xb4b70c38 in ?? () from /lib/libnss_ldap.so.2
> #7  0xb4b73625 in _nss_ldap_endgrent () from /lib/libnss_ldap.so.2
> [...]
> Actually I bet function wait4msg() in libraries/libldap/result.c
> in OpenLDAP calls ldap_int_select() in endless loop with
> zero timeout.

The strace output looked like there were pending events on fd 16 for the
select() call.  In that case, it will return immediately...

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh



-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org



Bug#680363: cyrus-imapd-2.2: imapd gets mad falling into endless loop

2012-07-11 Thread Kiss Gabor (Bitman)
> > Actually I bet function wait4msg() in libraries/libldap/result.c
> > in OpenLDAP calls ldap_int_select() in endless loop with
> > zero timeout.
> 
> The strace output looked like there were pending events on fd 16 for the
> select() call.  In that case, it will return immediately...

Henrique,

Actually I cannot debug it more because I had to replace libnss-ldap
with libnss-ldapd. I'm really sorry. Thanks for your efforts.

Gabor
-- 
A mug of beer, please. Shaken, not stirred.



-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org



Bug#680363: cyrus-imapd-2.2: imapd gets mad falling into endless loop

2012-07-05 Thread Gabor Kiss
Package: cyrus-imapd-2.2
Version: 2.2.13-19+squeeze3
Severity: normal

I'm migrating some 40 IMAP users to a new server.
I found that "imapd -s" processes are proliferating and produce high load.
Strace shows that they call poll() continously:

12:40:01.934613 poll([{fd=16, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 
1 ([{fd=16, revents=POLLIN|POLLERR|POLLHUP}])
12:40:01.934915 poll([{fd=16, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 
1 ([{fd=16, revents=POLLIN|POLLERR|POLLHUP}])
12:40:01.935024 poll([{fd=16, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 
1 ([{fd=16, revents=POLLIN|POLLERR|POLLHUP}])
12:40:01.935127 poll([{fd=16, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 
1 ([{fd=16, revents=POLLIN|POLLERR|POLLHUP}])
12:40:01.935231 poll([{fd=16, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 
1 ([{fd=16, revents=POLLIN|POLLERR|POLLHUP}])
12:40:01.935330 poll([{fd=16, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 
1 ([{fd=16, revents=POLLIN|POLLERR|POLLHUP}])
12:40:01.935430 poll([{fd=16, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 
1 ([{fd=16, revents=POLLIN|POLLERR|POLLHUP}])
12:40:01.935530 poll([{fd=16, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 
1 ([{fd=16, revents=POLLIN|POLLERR|POLLHUP}])
12:40:01.935630 poll([{fd=16, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 
1 ([{fd=16, revents=POLLIN|POLLERR|POLLHUP}])


I could catch the moment when an imapd gets mad:

14862 write(16, "\27\3\1\0\2738\276\303\7d\216\274\r\274\303]Z\253\203\217\214\2
51\200\206\240\375\23\371\2109\\\367"..., 192) = 192
14862 time(NULL)= 1341478919
14862 poll([{fd=16, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 1 ([{fd=16
, revents=POLLIN}])
14862 read(16, "\25\3\1\0\26", 5)   = 5
14862 read(16, "<\t\357Rp\321\324\331;\206\255v\217\360\20\355\305\271\273U\260\
302", 22) = 22
14862 gettimeofday({1341478919, 642090}, NULL) = 0
14862 getrusage(RUSAGE_SELF, {ru_utime={0, 548034}, ru_stime={0, 124007}, ...}) 
= 0
14862 time(NULL)= 1341478919
14862 times({tms_utime=54, tms_stime=12, tms_cutime=0, tms_cstime=0}) = 
1862420508
14862 gettimeofday({1341478919, 642386}, NULL) = 0
14862 getrusage(RUSAGE_SELF, {ru_utime={0, 548034}, ru_stime={0, 124007}, ...}) 
= 0
14862 time(NULL)= 1341478919
14862 times({tms_utime=54, tms_stime=12, tms_cutime=0, tms_cstime=0}) = 
1862420508
14862 time(NULL)= 1341478919
14862 send(5, "<83>Jul  5 11:01:59 cyrus/imaps2"..., 104, MSG_NOSIGNAL) = 104
14862 rt_sigaction(SIGPIPE, {SIG_IGN, [], SA_RESTART}, NULL, 8) = 0
14862 close(15) = 0
14862 munmap(0xb5025000, 4096)  = 0
14862 rt_sigaction(SIGPIPE, {SIG_IGN, [], 0}, {SIG_IGN, [], SA_RESTART}, 8) = 0
14862 poll([{fd=16, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 1 
([{fd=16, revents=POLLIN|POLLERR|POLLHUP}])
14862 poll([{fd=16, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 1 
([{fd=16, revents=POLLIN|POLLERR|POLLHUP}])
14862 poll([{fd=16, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 1 
([{fd=16, revents=POLLIN|POLLERR|POLLHUP}])
14862 poll([{fd=16, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 1 
([{fd=16, revents=POLLIN|POLLERR|POLLHUP}])
14862 poll([{fd=16, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 1 
([{fd=16, revents=POLLIN|POLLERR|POLLHUP}])

Its log entries before I killed them manually:

Jul  5 11:01:59 strudel cyrus/imaps2[14862]: accepted connection
Jul  5 11:01:59 strudel cyrus/imaps2[14862]: mydelete: starting txn 2147486956
Jul  5 11:01:59 strudel cyrus/imaps2[14862]: mydelete: committing txn 2147486956
Jul  5 11:01:59 strudel cyrus/imaps2[14862]: mystore: starting txn 2147486957
Jul  5 11:01:59 strudel cyrus/imaps2[14862]: mystore: committing txn 2147486957
Jul  5 11:01:59 strudel cyrus/imaps2[14862]: starttls: TLSv1 with cipher AES256-
SHA (256/256 bits new) no authentication

ltrace -S shows a bit more:

21814 getgrent( 
21814 <... getgrent resumed> )   = 0xb7262b64
21814 getgrent( 
21814 <... getgrent resumed> )   = 0xb7262b64
21814 getgrent( 
21814 <... getgrent resumed> )   = 0xb7262b64
21814 getgrent( 
21814 <... getgrent resumed> )   = 0xb7262b64
21814 getgrent( 
21814 <... getgrent resumed> )   = 0xb7262b64
21814 getgrent( 
21814 SYS_read(15, "", 1024) = 0
21814 SYS_rt_sigaction(13, 0xbf8a7180, 0xbf8a70f4, 8, 0xb7260ff4 
21814 <... SYS_rt_sigaction resumed> )   = 0
21814 SYS_rt_sigaction(13, 0xbf8a7220, 0, 8, 0xb7260ff4 
21814 <... SYS_rt_sigaction resumed> )   = 0
21814 SYS_rt_sigaction(13, 0xbf8a7150, 0xbf8a70c4, 8, 0xb7260ff4) = 0
21814 SYS_stat64(0xb4e364f6, 0xbf8a58ac, 0xb7260ff4, 3, 0xbf8a5938 
21814 <... SYS_stat64 resumed> ) = 0
21814 SYS_geteuid32(0xb4e371f4, 0x4fed83fc, 0x4fb505ee, 0xb4e38614, 0xb4e2d9f0 

21814 <... SYS_geteuid32 resumed> )  = 114
21814 SYS_socketcall(6, 0xbf

Bug#680363: cyrus-imapd-2.2: imapd gets mad falling into endless loop

2012-07-05 Thread Henrique de Moraes Holschuh
Please confirm that this is not caused by the leap-second issues, i.e.
you've seen it on a freshely rebooted server, or prior to the 29th of
june.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh



-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org



Bug#680363: cyrus-imapd-2.2: imapd gets mad falling into endless loop

2012-07-05 Thread Kiss Gabor (Bitman)
> Please confirm that this is not caused by the leap-second issues, i.e.
> you've seen it on a freshely rebooted server, or prior to the 29th of
> june.

I can't.
Just a few hours ago I heard about first time about this kernel bug.
Like enough my problem is caused by it.
Thanks for your help.

Gabor
-- 
No smoke, no drugs, no vindoze.



-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org



Bug#680363: cyrus-imapd-2.2: imapd gets mad falling into endless loop

2012-07-06 Thread Kiss Gabor (Bitman)
> Please confirm that this is not caused by the leap-second issues, i.e.
> you've seen it on a freshely rebooted server, or prior to the 29th of
> june.

Oooops!
I rebooted this morning, but wild imapd processes are raging again. :-(

Gabor



-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org



Bug#680363: cyrus-imapd-2.2: imapd gets mad falling into endless loop

2012-07-10 Thread Kiss Gabor (Bitman)
I could create a core dump. Stack trace:

(gdb) bt
#0  0xb76f5424 in __kernel_vsyscall ()
#1  0xb71b096b in poll () from /lib/i686/cmov/libc.so.6
#2  0xb4ce in ldap_int_select () from /usr/lib/libldap_r-2.4.so.2
#3  0xb4ccddbb in ldap_result () from /usr/lib/libldap_r-2.4.so.2
#4  0xb4b1191e in ?? () from /lib/libnss_ldap.so.2
#5  0xb4b11c38 in ?? () from /lib/libnss_ldap.so.2
#6  0xb4b14625 in _nss_ldap_endgrent () from /lib/libnss_ldap.so.2
#7  0xb71cf9ed in ?? () from /lib/i686/cmov/libc.so.6
#8  0xb7186d4d in endgrent () from /lib/i686/cmov/libc.so.6
#9  0x0809e55c in mynewstate (identifier=) at 
auth_unix.c:261
#10 0x0809e104 in auth_newstate (identifier=0x8696111 "foobar") at auth.c:113
#11 0x080870a6 in mysasl_proxy_policy (conn=0x8695400, context=0x812e6a8, 
requested_user=0x8695d10 "foobar", rlen=6, auth_identity=0x8696111 "foobar", 
alen=6, def_realm=0x0, urlen=0, propctx=0x88762b8) at global.c:546
#12 0x0805c375 in imapd_proxy_policy (conn=0x8695400, context=0x812e6a8, 
requested_user=0x8695d10 "foobar", rlen=6, auth_identity=0x8696111 "foobar", 
alen=6, def_realm=0x0, urlen=0, propctx=0x88762b8) at imapd.c:371
#13 0xb76e0728 in ?? () from /usr/lib/libsasl2.so.2
#14 0xb76e132d in sasl_checkpass () from /usr/lib/libsasl2.so.2
#15 0x0805bb3d in cmd_login (tag=0x0, user=0x86c4478 "foobar") at imapd.c:1785
#16 0x0806071d in cmdloop () at imapd.c:1183
#17 0x08062da8 in service_main (argc=2, argv=0x8690008, envp=0xbfd243a0) at 
imapd.c:691
#18 0x0804dc61 in main (argc=2, argv=0xbfd24394, envp=0xbfd243a0) at 
service.c:533
(gdb) in imapd.c

The bug may be in libldap-2.4-2 or libnss-ldap packages too...
Could you give me some hints? What should I check, recompile, sniff,
debug, peek, poke ... etc?
Thanks.

Gabor



-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org



Bug#680363: cyrus-imapd-2.2: imapd gets mad falling into endless loop

2012-07-10 Thread Kiss Gabor (Bitman)
> > #0  0xb76f5424 in __kernel_vsyscall ()
> > #1  0xb71b096b in poll () from /lib/i686/cmov/libc.so.6
> > #2  0xb4ce in ldap_int_select () from /usr/lib/libldap_r-2.4.so.2
> > #3  0xb4ccddbb in ldap_result () from /usr/lib/libldap_r-2.4.so.2
> > #4  0xb4b1191e in ?? () from /lib/libnss_ldap.so.2
> > #5  0xb4b11c38 in ?? () from /lib/libnss_ldap.so.2
> > #6  0xb4b14625 in _nss_ldap_endgrent () from /lib/libnss_ldap.so.2

Improved trace:

#0  0xb7754424 in __kernel_vsyscall ()
#1  0xb720f96b in poll () from /lib/i686/cmov/libc.so.6
#2  0xb4d43444 in ldap_int_select (ld=0x91a8638, timeout=0x0) at os-ip.c:1098
#3  0xb4d2cdbb in wait4msg (ld=0x91a8638, msgid=4, all=0, timeout=0x0, 
result=0x91afb64) at result.c:335
#4  ldap_result (ld=0x91a8638, msgid=4, all=0, timeout=0x0, result=0x91afb64) 
at result.c:120
#5  0xb4b7091e in ?? () from /lib/libnss_ldap.so.2
#6  0xb4b70c38 in ?? () from /lib/libnss_ldap.so.2
#7  0xb4b73625 in _nss_ldap_endgrent () from /lib/libnss_ldap.so.2
[...]

> 
> Yikes. It is caused by libnss-ldap.
> 
> > The bug may be in libldap-2.4-2 or libnss-ldap packages too...
> > Could you give me some hints? What should I check, recompile, sniff,
> > debug, peek, poke ... etc?
> 
> Usually the only correct way to deal with ldap-nss in the long term is to
> get rid of it :-(

Uhm... :-/

Actually I bet function wait4msg() in libraries/libldap/result.c
in OpenLDAP calls ldap_int_select() in endless loop with
zero timeout.

I tried to enable some debugging as I found on page
http://pic.dhe.ibm.com/infocenter/tivihelp/v2r1/topic/com.ibm.IBMDS.doc/pdguide32.htm
I added environment settings to /etc/init.d/cyrus-2.2:

export LDAP_DEBUG=LDAP_DEBUG_TRACE
export LDAP_DEBUG_FILE=/var/log/ldap.log
if start-stop-daemon ${START} >/dev/null 2>&1 ; then
echo "$NAME."
else

I created a cyrus writable log file called /var/log/ldap.log
then I restarted Cyrus. However the file is remaining empty. :-(

Any idea will be appreciated.

Gabor
-- 
E-mail = m-mail * c-mail ^ 2



-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org



Bug#680363: cyrus-imapd-2.2: imapd gets mad falling into endless loop

2012-07-10 Thread Henrique de Moraes Holschuh
On Tue, 10 Jul 2012, Kiss Gabor (Bitman) wrote:
> I could create a core dump. Stack trace:
> 
> (gdb) bt
> #0  0xb76f5424 in __kernel_vsyscall ()
> #1  0xb71b096b in poll () from /lib/i686/cmov/libc.so.6
> #2  0xb4ce in ldap_int_select () from /usr/lib/libldap_r-2.4.so.2
> #3  0xb4ccddbb in ldap_result () from /usr/lib/libldap_r-2.4.so.2
> #4  0xb4b1191e in ?? () from /lib/libnss_ldap.so.2
> #5  0xb4b11c38 in ?? () from /lib/libnss_ldap.so.2
> #6  0xb4b14625 in _nss_ldap_endgrent () from /lib/libnss_ldap.so.2

Yikes. It is caused by libnss-ldap.

> The bug may be in libldap-2.4-2 or libnss-ldap packages too...
> Could you give me some hints? What should I check, recompile, sniff,
> debug, peek, poke ... etc?

Usually the only correct way to deal with ldap-nss in the long term is to
get rid of it :-(

I am not sure how to help you debug that mess.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh



-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org