Re: Cyrus Deadblocking
Adam Tauno Williams wrote:

> Why? If so it makes more sense to convert your databases to skiplist and see if that helps than to flop library versions.

The problem looks to be localized. After switching deliver.db to skiplist format it seems to run more stably (not sure yet, I have to wait some time). I will write more about what I found later, once I am 100% sure the problem is identified. One thing I noticed over all these days: if I completely wipe deliver.db, it takes longer for the cyrus processes to hang again than if I only restart cyrus.

> Madly flipping versions seems a poor diagnostic method (if it even qualifies as a diagnostic method).

In a way, you are right. But as an example, cyrus-sasl crashes if it is compiled against 4.3.x versions of Berkeley DB, and works great with 4.5, 4.6 and 4.7. So sometimes trying a different version gives some result too.

> The best approach is to switch to a distribution

1) Not acceptable.
2) You don't really believe what you wrote here yourself, do you?

--
Teresa

Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
Re: Cyrus Deadblocking
On Tue, 23 Dec 2008 11:30:31 +0100, Teresa teresa...@myeburg.net wrote:

Teresa wrote:

> reconstruct -r -f user and now run cyrus without squatter. And it seems to work. I have no idea whether it was squatter or a few broken folders. Running stable for about 6 hours now.

Ok, latest state. After 13 hours of running happily it hung again. I downgraded the kernel to 2.6.26.8 and it didn't change anything. The behavior is the same as with 2.6.27.10. So I think it is because of Berkeley DB and glibc-2.9.

It still hangs randomly. One of the cyrus processes (ipurge, smmapd, imapd or pop3d) just hangs. Sometimes it takes a few minutes to happen, sometimes a few hours, or it can even run for a whole week. What I did now is update to db-4.7.25. Maybe it works more stably with glibc-2.9, I don't know.

--
Teresa
Re: Cyrus Deadblocking
Teresa wrote:

> Hi all, since yesterday I have had strange behavior on my production mail server, and I have not been able to find the reason for 2 days already.

Ok, new report. I did:

reconstruct -r -f user

and now run cyrus without squatter. And it seems to work. I have no idea whether it was squatter or a few broken folders. Running stable for about 6 hours now.

--
Teresa
Re: Cyrus Deadblocking
Adam Tauno Williams wrote:

>> since yesterday I have had strange behavior on my production mail server, and I have not been able to find the reason for 2 days already.
>
> Does dmesg show anything odd?

Ok guys, it happened again just now. Exactly the same behavior as described before, but after a few days of running successfully. I restarted cyrus and sendmail and attached strace to the lmtpd this time. After some time (squatter was still working) it goes up to 100% CPU and doesn't answer. It hangs.

> If you attach to a hung process with strace -p {pid} what does it look like?

No idea what it says, but here is my strace output from the lmtpd that hangs at the end. Then I sent a kill command, which you can see on the last line:

http://kvitka.net/log2.strace.txt

I updated to the latest kernel, 2.6.27.10.

cyrus  1509  0.0  0.0  36508   1940 ?  Ss  22:51  0:00 /usr/lib/cyrus/master
cyrus  1557  0.0  0.0  70600    652 ?  S   22:51  0:00 idled
cyrus  1578 23.1 10.0 284276 208016 ?  R   22:51  3:57 squatter -r user
cyrus  1583  0.0  0.1  98224   4012 ?  S   22:51  0:00 imapd -s
cyrus  1584  0.0  0.1  98224   3996 ?  S   22:51  0:00 imapd -s
cyrus  1585  0.0  0.1  98224   3912 ?  S   22:51  0:00 imapd -s
cyrus  1586  0.0  0.1  98224   3932 ?  S   22:51  0:00 imapd -s
cyrus  1587  0.0  0.1  98224   4012 ?  S   22:51  0:00 imapd -s
cyrus  1737  0.0  0.0  74956   2048 ?  S   22:51  0:00 smmapd
cyrus  2061  0.3  0.1  98012   3888 ?  S   23:07  0:00 pop3d -s

Now I get a lot of these processes, and everything seems to work again. Let's see what happens if I start sendmail...

Any idea what's wrong?

--
Teresa
Re: Cyrus Deadblocking
Teresa wrote:

> Let's see what happens if I start sendmail...

It still hangs after running for some time... Now it sometimes even damages the DB. I delete everything except mailbox.db and it starts again... but not for long...

--
Teresa
Re: Cyrus Deadblocking
Adam Tauno Williams wrote:

> Does dmesg show anything odd?

Another thing I sometimes get when attaching strace to a hanging cyrus process is a lot of:

select(0, NULL, NULL, NULL, {0, 25000}) = 0 (Timeout)

a few per second, and it never ends...

--
Teresa
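A select() call with no file descriptors and only a timeout is effectively a short sleep (here 25 ms), so an endless run of them suggests the process is polling for something that never becomes available. To get a rough feel for how much of a saved strace log is spent in that pattern, a minimal Python sketch could look like the following. The function name and the sample lines are made up for illustration, and the regular expression assumes the older strace output format quoted above; newer strace versions print the timeout differently.

```python
import re

# Matches the sleep-like pattern quoted above: select() with no descriptors,
# only a timeout, returning 0 (Timeout). Assumes the old "{sec, usec}" format.
SLEEP_RE = re.compile(
    r"select\(0, NULL, NULL, NULL, \{(\d+), (\d+)\}\)\s*=\s*0 \(Timeout\)"
)


def summarize_sleep_polls(lines):
    """Return (count, total_microseconds) of select()-as-sleep calls."""
    count = 0
    total_usec = 0
    for line in lines:
        match = SLEEP_RE.search(line)
        if match:
            count += 1
            total_usec += int(match.group(1)) * 1_000_000 + int(match.group(2))
    return count, total_usec


# Hypothetical sample; in practice, pass the lines of a saved strace log.
sample = [
    "select(0, NULL, NULL, NULL, {0, 25000}) = 0 (Timeout)",
    "select(0, NULL, NULL, NULL, {0, 25000}) = 0 (Timeout)",
    "close(7) = 0",
]
print(summarize_sleep_polls(sample))
```

If nearly every line of the log matches, the process is most likely waiting on a lock or resource in user space rather than blocked inside a system call.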
Re: Cyrus Deadblocking
On Mon, 15 Dec 2008 10:42:08 -0200, Henrique de Moraes Holschuh h...@debian.org wrote:

On Mon, 15 Dec 2008, Teresa wrote:

> Which kernel? If it is Linux 2.6.27.8 or 2.6.27.9, try downgrading...

Thanks for the response. I use the 2.6.27.7 vanilla kernel (not gentoo-sources). I hadn't rebooted for nearly a month. Yesterday I also rebooted, as a last hope that it would fix something (I know that doesn't work when something is broken, and it didn't, it never does :) ).

Actually the system is running stable again now. I didn't change anything, didn't compile anything or reboot. I just restarted cyrus and sendmail a few times, and after one of these restarts it ran stable. I have no idea why. Nothing is different. There are no system log messages about a broken DB or anything else.

One strange thing I saw in these 2 days was:

lmtpd[3467] general protection ip:7f2e45ffdb2e sp:7fff4ee81968 error:0 in libdb-4.6.so[7f2e45f2d000+136000]

in dmesg. I think this comes from the new glibc. But it hasn't broken functionality so far. It has been working stably for 4 hours already, and the system load has gone down to 0.0 again. No deadlocking...

I saw there are new ebuilds for Berkeley DB 4.7.25 in portage. Has anybody used this version already? Maybe compiling cyrus against this lib will perform better? Or does this look more like a kernel problem? How do I check that? In htop the process that gets 100% CPU load isn't in D state, so it's not a real deadlock; I suppose it just goes into a loop somewhere under some condition that doesn't always happen.

Yesterday I also tried to downgrade cyrus-imapd to 2.3.12_p2, but got the same behavior, so I updated back to 2.3.13 again.

I will report if I find something more.

--
Teresa
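The distinction drawn above between a process in D state (blocked inside the kernel) and one spinning at 100% CPU (normally shown as R) can also be checked without htop, since Linux exposes the state letter in /proc/<pid>/stat. The following is a small Python sketch; the helper names and the sample line are hypothetical and only meant as illustration.

```python
def parse_proc_state(stat_line):
    """Extract the one-letter process state from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces or
    # ')', so cut at the LAST ')' instead of splitting on whitespace.
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]


def read_state(pid):
    """Read the current state letter of a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as stat_file:
        return parse_proc_state(stat_file.read())


# Hypothetical, shortened sample line for a busy lmtpd process.
sample = "3467 (lmtpd) R 1509 1509 1509 0 -1"
print(parse_proc_state(sample))
```

A result of R while the process makes no progress points to a busy loop in user space, D to an uninterruptible wait in the kernel (typically I/O), and S to an ordinary sleep.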
Re: Cyrus Deadblocking
On Mon, 15 Dec 2008 08:40:31 -0500, Adam Tauno Williams awill...@whitemice.org wrote:

>> since yesterday I have had strange behavior on my production mail server, and I have not been able to find the reason for 2 days already.
>
> Does dmesg show anything odd?

Not really. It's quiet; only these strange messages have also appeared in these 2 days, and they always look like this:

lmtpd[3467] general protection ip:7f2e45ffdb2e sp:7fff4ee81968 error:0 in libdb-4.6.so[7f2e45f2d000+136000]

>> I didn't change anything lately, but yesterday my cyrus started to raise the CPU

But it has been working for 4 hours already here, even though this message appears once in my dmesg now.

> If you attach to a hung process with strace -p {pid} what does it look like?

It is running now, and as it is a production server, I will leave it running as long as it keeps going by itself :) But next time, and I am fairly sure it will come again, I will do that strace. I have been running this mail server since 2003. Cyrus had some nasty problems with Berkeley DB a few times in the past (2.2.x versions). But for the last 2 years I never really had a problem with it.

> Did you restart the services after the update?
>
>> I am on a gentoo box.

Gentoo is ok; I am in trouble myself because I run the unstable ~x86_64 keyword. I know that, so I have to manage my problems myself. Gentoo has nothing to do with that. But you're right, something in the system is not right at the moment.

>> If cyrus goes into the blocking state,
>
> Sounds to be like Cyrus is not the only thing getting hung, which indicates the problem probably lies elsewhere.

Actually only cyrus processes are in trouble. ipurge does its job, for example, but never gets back to the prompt.

I have now updated my kernel to 2.6.27.9. It is currently running 2.6.27.7. If it crashes again, I will reboot into the new kernel and see if something changes.

--
Teresa
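The general protection line carries enough information to tell where inside libdb the fault happened: the bracketed part is the mapping's base address and size, and the instruction pointer minus the base gives the offset into the mapped library. A short Python sketch of that arithmetic, using the exact line quoted above (the function name and regular expression are illustrative assumptions, not part of any tool):

```python
import re

# Pattern for the dmesg line format quoted above:
#   ... ip:<hex> sp:<hex> error:<n> in <library>[<base hex>+<size hex>]
GP_RE = re.compile(
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:\d+ "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (library, offset, inside_mapping), or None if the line does not match."""
    match = GP_RE.search(line)
    if match is None:
        return None
    ip = int(match.group("ip"), 16)
    base = int(match.group("base"), 16)
    size = int(match.group("size"), 16)
    offset = ip - base
    return match.group("lib"), offset, 0 <= offset < size


line = ("lmtpd[3467] general protection ip:7f2e45ffdb2e sp:7fff4ee81968 "
        "error:0 in libdb-4.6.so[7f2e45f2d000+136000]")
lib, offset, inside = fault_offset(line)
print(lib, hex(offset), inside)  # libdb-4.6.so 0xd0b2e True
```

If the same offset shows up in every such message, the crash is always in the same piece of Berkeley DB code. With debug symbols installed, a tool such as addr2line may be able to map that offset to a function name, assuming the mapping starts at the beginning of the library file.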
Cyrus Deadblocking
Hi all,

since yesterday I have had strange behavior on my production mail server, and I have not been able to find the reason for 2 days already.

I didn't change anything lately, but yesterday my cyrus started to raise the CPU load up to 100%, and after some time it stops responding. Mostly it is an lmtpd process, but it also happens to pop3d, or to the imapd process itself. What helps is a restart. There is nothing in the log that would show the problem. All sendmail processes are blocked as well, since they use smmapd for local delivery.

About 2 weeks ago I updated glibc to version 2.9. But it worked fine for these two weeks.

I am on a gentoo box.

[ebuild R ] sys-libs/db-4.6.21_p3-r1 USE="-bootstrap -doc -java -nocxx -tcl -test" 0 kB
[ebuild R ] sys-libs/glibc-2.9_p20081201 USE="gd (multilib) nls -debug -glibc-compat20 -glibc-omitfp (-hardened) -profile (-selinux) -vanilla" 0 kB
[ebuild R ] net-mail/cyrus-imapd-2.3.13 USE="idled pam ssl tcpd (-drac) -kerberos -kolab -nntp -replication -snmp" 0 kB

I use squatter, sieve, imap and pop3. ipurge is started from cron from time to time.

If cyrus goes into the blocking state and I start ipurge manually, I get the messages about how many messages will be deleted, how many were scanned, etc., but the process itself never gets back to the prompt.

I understand that this description doesn't provide much useful information that would help identify the problem. If I could identify it, I would probably have fixed it already. But this is my last hope; maybe someone can point out what is wrong?

--
Teresa