Re: Cyrus Deadblocking

2008-12-27 Thread ::.. Teresa_II ..::
On Fri, 2008-12-26 at 10:52 -0800, Scott Likens wrote:
 I've been running Cyrus 2.3.13 successfully on Gentoo (amd64/x86_64)  
 for quite some time without any issues.

Same here. I am on Gentoo x86_64 with the ~amd64 keyword and never had any
problems with it. The mail server isn't that big: I have about 60 mailboxes,
and traffic is around 450 incoming and 350 outgoing messages per hour.

 It's currently linked against bdb 4.6, however I use skiplist for all  
 my databases as I found overall that is much cleaner in the long run.

Yes, that's what has worked for me for quite a long time. I had sendmail,
cyrus-sasl and spamassassin (with its Perl libs) compiled against this
version.

 However, I can honestly say I have never run into your issue with  
 cyrus starting to hang like that.  However, you want to ensure that  
 both cyrus-sasl and imapd are linked to the same version of bdb,  
 otherwise there's issues.

Try switching the deliver db from skiplist to Berkeley format and wait a
while until it starts hanging...

 ... So far the point of this email is pretty pointless, but I wanted  
 to say that switching distributions is not ever an acceptable  
 question/answer.

Totally agree.

 Having more detail from /var/log/messages would be very helpful as  
 cyrus does tend to send debug information to syslog when it's  
 crashing, so we can get more detail of why.

That's the problem: it just hangs. You can see it quite easily by trying
sendmail -bv s...@adresss
which never returns to the prompt, because sendmail waits for smmapd to
return from checking the mailbox.
Or just start an IMAP client: it will connect, but never fetch any mail.

Identifying the problem is not that easy, because syslog doesn't show any DB
corruption or other problems. dmesg isn't reporting anything wrong, and strace
on the Cyrus processes most of the time either produces no output or prints a
lot of select(0...) timeouts, which as I've heard is normal rather than bad.
Even if no strace output at all isn't a good sign, it still doesn't help to
identify the problem.

Through trial and error I found that removing deliver.db and restarting Cyrus
leads to a longer lifetime before one of the Cyrus processes hangs again.

So I moved the Cyrus mail store completely to another server. But after a
few minutes it did the same thing.

I installed a fresh Gentoo system with the older glibc-2.8, but the problem
was the same.

The only thing that helps is to add
duplicate_db: skiplist
to imapd.conf.
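
As a rough illustration of the workaround above, the following Python sketch reads imapd.conf-style "option: value" lines and reports which backend is set for duplicate_db. The function names and the sample excerpt are assumptions made for illustration; only the "duplicate_db: skiplist" line comes from this message.

```python
def parse_imapd_conf(text):
    """Parse 'option: value' lines, skipping blank lines and '#' comments."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            options[key.strip()] = value.strip()
    return options


def duplicate_db_backend(text):
    """Return the configured duplicate_db backend, or None if it is not set."""
    return parse_imapd_conf(text).get("duplicate_db")


# Hypothetical excerpt of an imapd.conf, not taken from the poster's system.
sample = """
# hypothetical excerpt
configdirectory: /var/imap
duplicate_db: skiplist
"""
print(duplicate_db_backend(sample))
```

A result of None would mean the option is not set explicitly, in which case whatever default the build was compiled with applies.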

It ran stably on the new machine with this setting, compiled against
sys-libs/db-4.6.21_p3-r1, sys-libs/glibc-2.9_p20081201
and sys-devel/gcc-4.3.2-r1.

Now I have moved back to the old machine with a reinstalled system:
sys-devel/gcc-4.3.2-r1
sys-libs/glibc-2.8_p20080602-r1
sys-libs/db-4.7.25_p1-r1

and it runs stably too, with skiplist as the deliver.db format.

As soon as I switch back from skiplist I can reproduce the problem.

So I have found a solution, but I really can't say what is wrong. I have had
this configuration running for a few years already and really didn't change
anything radical in Cyrus.

I am happy to be running stably again, but if I can provide any more
info to identify what was wrong, I would like to help.

-- 
Teresa


Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html

Re: Cyrus Deadblocking

2008-12-26 Thread Teresa
Adam Tauno Williams wrote:
 Why?  If so it makes more sense to convert your databases to skiplist
 and see if that helps than to flop library versions.
   
The problem looks to be localized. After switching deliver.db to skiplist
format it seems to run more stably (not sure yet, I have to wait some time).
I will write more about what I found later, once I am 100% sure the
problem is identified.

One thing I noticed over all these days: if I completely wipe deliver.db,
it takes longer for the Cyrus processes to hang again than if I only
restart Cyrus.

 Madly flipping versions seems a poor diagnostic method (if it even
 qualifies as a diagnostic method).
   
In a way, you are right. But as an example, cyrus-sasl crashes if
it is compiled against 4.3.x versions of Berkeley DB,
and works great with 4.5, 4.6 and 4.7.
So sometimes trying a different version gives some results too.

 The best approach is to switch to a distribution
1) not acceptable
2) you don't really believe what you wrote here yourself, do you?

-- 
Teresa



Re: Cyrus Deadblocking

2008-12-26 Thread Scott Likens
Hi Teresa,

I've been running Cyrus 2.3.13 successfully on Gentoo (amd64/x86_64)  
for quite some time without any issues.

It's currently linked against bdb 4.6, however I use skiplist for all  
my databases as I found overall that is much cleaner in the long run.

However, I can honestly say I have never run into your issue with  
cyrus starting to hang like that.  However, you want to ensure that  
both cyrus-sasl and imapd are linked to the same version of bdb,  
otherwise there's issues.

... So far the point of this email is pretty pointless, but I wanted  
to say that switching distributions is not ever an acceptable  
question/answer.

Having more detail from /var/log/messages would be very helpful as  
cyrus does tend to send debug information to syslog when it's  
crashing, so we can get more detail of why.

Scott




Re: Cyrus Deadblocking

2008-12-24 Thread Teresa
On Tue, 23 Dec 2008 11:30:31 +0100, Teresa teresa...@myeburg.net wrote:
 Teresa wrote:
 reconstruct -r -f user
 
 and now I run Cyrus without squatter. And it seems to work. I have no idea
 whether it was squatter or a few broken folders. Running stably for about 6
 hours now.

OK, latest state: after 13 hours of running happily it hung again.
I downgraded the kernel to 2.6.26.8 and it didn't change anything.
The behavior is the same as with 2.6.27.10.

So I think it's because of Berkeley DB and glibc-2.9.

It still hangs randomly. One of the Cyrus processes (ipurge, smmapd, imapd or
pop3d) just hangs; sometimes it takes a few minutes to happen, sometimes a few
hours, or it can even run for a whole week.

What I have done now is update to db-4.7.25; maybe it works more stably with
glibc-2.9, I don't know.

-- 
Teresa



Re: Cyrus Deadblocking

2008-12-24 Thread Adam Tauno Williams
On Wed, 2008-12-24 at 14:27 +0100, Teresa wrote:
 On Tue, 23 Dec 2008 11:30:31 +0100, Teresa teresa...@myeburg.net wrote:
  Teresa wrote:
  reconstruct -r -f user
  and now I run Cyrus without squatter. And it seems to work. I have no idea
  whether it was squatter or a few broken folders. Running stably for about 6
  hours now.
 OK, latest state: after 13 hours of running happily it hung again.
 I downgraded the kernel to 2.6.26.8 and it didn't change anything.
 The behavior is the same as with 2.6.27.10.
 So I think it's because of Berkeley DB and glibc-2.9.

Why?  If so it makes more sense to convert your databases to skiplist
and see if that helps than to flop library versions.

 It still hangs randomly. One of the Cyrus processes (ipurge, smmapd, imapd or
 pop3d) just hangs; sometimes it takes a few minutes to happen, sometimes a few
 hours, or it can even run for a whole week.
 What I have done now is update to db-4.7.25; maybe it works more stably with
 glibc-2.9, I don't know.

Madly flipping versions seems a poor diagnostic method (if it even
qualifies as a diagnostic method).

The best approach is to switch to a distribution where things are tested
and shipped in a known-working binary (w/dependencies) built by people
who actually understand what the various compiler options mean, etc...
Your method of shotgunning various library versions isn't very likely
to lead you to a solution.




Re: Cyrus Deadblocking

2008-12-24 Thread Adam Tauno Williams
On Tue, 2008-12-23 at 02:44 +0100, Teresa wrote:
 Adam Tauno Williams wrote:
  Does dmesg show anything odd?
 Another thing I sometimes get when attaching strace to a hanging Cyrus
 process is a lot of:
 select(0, NULL, NULL, NULL, {0, 25000}) = 0 (Timeout)
 a few per second, and it never ends...

This above should be pretty normal.  Select polls for any I/O, times out
(because there is nothing to do), and then the process re-issues the
select.  Many services/servers use such a method to handle async I/O.

I'd guess the above is a call to:
   int select(int nfds, fd_set *readfds, fd_set *writefds,
              fd_set *exceptfds, struct timeval *timeout);
 - so the last value, the {0, 25000}, is the timeout timeval struct -

   struct timeval
   {
       __time_t tv_sec;        /* Seconds.  */
       __suseconds_t tv_usec;  /* Microseconds.  */
   };

 - so you get one of the select(...) calls roughly every 25,000
microseconds (25 ms), i.e. at most about 40 per second.
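
The timeout arithmetic above can be sketched in Python. This is only an illustration on a Unix-like system, not what Cyrus itself does: the helper names are made up here, and select.select() with empty descriptor lists merely mimics the select(0, NULL, NULL, NULL, {0, 25000}) call seen in the strace output.

```python
import select
import time


def calls_per_second(tv_sec, tv_usec):
    """Upper bound on back-to-back select() timeouts per second."""
    timeout_usec = tv_sec * 1_000_000 + tv_usec
    return 1_000_000 / timeout_usec


def idle_select(tv_sec=0, tv_usec=25000):
    """Wait on no descriptors at all, so the call can only end by timing out."""
    timeout = tv_sec + tv_usec / 1_000_000
    start = time.monotonic()
    ready = select.select([], [], [], timeout)
    return ready, time.monotonic() - start


print(calls_per_second(0, 25000))
ready, elapsed = idle_select()
print(ready, elapsed)
```

Because no file descriptors are passed, the call acts as a short sleep, which is why a steady stream of such timeouts by itself says little about where a process is stuck.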




Re: Cyrus Deadblocking

2008-12-24 Thread Adam Tauno Williams
On Mon, 2008-12-22 at 23:11 +0100, Teresa wrote:
 Adam Tauno Williams wrote:
  Since yesterday my production mail server has been behaving strangely,
  and I haven't been able to find the reason for 2 days already.
  Does dmesg show anything odd?
 OK guys, it happened again just now. Exactly the same behavior as
 described before, but after a few days of running successfully.
 I restarted Cyrus and sendmail and attached strace to lmtpd this
 time. After some time (squatter was still working) it went to 100%
 CPU and stopped answering. It hangs.
  If you attach to a hung process with strace -p {pid} what does it look
  like?
 No idea what it says, but here is my strace output from the lmtpd that hangs
 at the end. Then I sent the kill command, which you can see on the last line:
 http://kvitka.net/log2.strace.txt

Looks like one of the last things it did was put a message into
user.teresa.Junk and then notify idled that the contents of
user.teresa.Junk had changed.

Nothing very suspicious.




Re: Cyrus Deadblocking

2008-12-23 Thread Teresa
Teresa wrote:
 Hi all,

 Since yesterday my production mail server has been behaving strangely, and I
 haven't been able to find the reason for 2 days already.

OK, new report. I ran:

reconstruct -r -f user

and now I run Cyrus without squatter, and it seems to work. I have no idea
whether the cause was squatter or a few broken folders. It has been running
stably for about 6 hours now.

-- 
Teresa




Re: Cyrus Deadblocking

2008-12-22 Thread Teresa
Adam Tauno Williams wrote:
 Since yesterday my production mail server has been behaving strangely,
 and I haven't been able to find the reason for 2 days already.
 

 Does dmesg show anything odd?

OK guys, it happened again just now. Exactly the same behavior as
described before, but after a few days of running successfully.

I restarted Cyrus and sendmail and attached strace to lmtpd this
time. After some time (squatter was still working) it went to 100%
CPU and stopped answering. It hangs.

 If you attach to a hung process with strace -p {pid} what does it look
 like?
No idea what it says, but here is my strace output from the lmtpd that hangs
at the end. Then I sent the kill command, which you can see on the last line:

http://kvitka.net/log2.strace.txt

I also updated to the latest kernel, 2.6.27.10.

cyrus 1509  0.0  0.0  36508   1940 ?  Ss  22:51  0:00 /usr/lib/cyrus/master
cyrus 1557  0.0  0.0  70600    652 ?  S   22:51  0:00 idled
cyrus 1578 23.1 10.0 284276 208016 ?  R   22:51  3:57 squatter -r user
cyrus 1583  0.0  0.1  98224   4012 ?  S   22:51  0:00 imapd -s
cyrus 1584  0.0  0.1  98224   3996 ?  S   22:51  0:00 imapd -s
cyrus 1585  0.0  0.1  98224   3912 ?  S   22:51  0:00 imapd -s
cyrus 1586  0.0  0.1  98224   3932 ?  S   22:51  0:00 imapd -s
cyrus 1587  0.0  0.1  98224   4012 ?  S   22:51  0:00 imapd -s
cyrus 1737  0.0  0.0  74956   2048 ?  S   22:51  0:00 smmapd
cyrus 2061  0.3  0.1  98012   3888 ?  S   23:07  0:00 pop3d -s

Now I have a lot of these processes, and everything seems to work again.
Let's see what happens if I start sendmail...

Any idea what's wrong?

-- 
Teresa



Re: Cyrus Deadblocking

2008-12-22 Thread Teresa
Teresa wrote:
 Let's see what happens if I start sendmail...

It still hangs after running for some time... Now it sometimes even damages
the DB. I delete everything except mailbox.db and it starts again... but not
for long...

-- 
Teresa



Re: Cyrus Deadblocking

2008-12-22 Thread Teresa
Adam Tauno Williams wrote:
 Does dmesg show anything odd?
 


Another thing I sometimes get when attaching strace to a hanging Cyrus
process is a lot of:
select(0, NULL, NULL, NULL, {0, 25000}) = 0 (Timeout)

a few per second, and it never ends...

-- 
Teresa



Re: Cyrus Deadblocking

2008-12-15 Thread Teresa
On Mon, 15 Dec 2008 10:42:08 -0200, Henrique de Moraes Holschuh
h...@debian.org wrote:
On Mon, 15 Dec 2008, Teresa wrote:
Which kernel?  If it is Linux 2.6.27.8 or 2.6.27.9, try downgrading...

Thanks for the response.
I use the 2.6.27.7 vanilla kernel (not gentoo-sources). I hadn't rebooted for
about a month. Yesterday I rebooted as well, as a last hope that it would fix
something (I know that doesn't work when something is broken, and it didn't;
it never does :) ).

Actually the system is running stably again now. I didn't change anything,
didn't recompile or reboot. I just restarted Cyrus and sendmail a few times,
and after one of those restarts it ran stably. I have no idea why. Nothing is
different. There are no system log messages about a broken DB or anything
else.

One strange thing I saw in these 2 days was:
lmtpd[3467] general protection ip:7f2e45ffdb2e sp:7fff4ee81968 error:0 in
libdb-4.6.so[7f2e45f2d000+136000]
in dmesg.
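
To make the dmesg line above easier to read, here is a small Python sketch that splits it into fields and computes how far the faulting instruction pointer lies inside the libdb mapping. The regular expression assumes the layout shown in this message (name[pid], ip, sp, error, then library[base+size]); the function name is made up for illustration, and the offset is only useful for symbol lookup under the further assumption that the bracketed base is the address where the library's code is mapped.

```python
import re

# Assumed layout, taken from the single line quoted in this thread:
# name[pid] general protection ip:<hex> sp:<hex> error:<hex> in <lib>[<base>+<size>]
GP_LINE = re.compile(
    r"(?P<proc>\S+)\[(?P<pid>\d+)\] general protection "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<error>[0-9a-f]+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (library, offset of ip inside the mapping, inside_mapping) or None."""
    # Collapse whitespace so a line wrapped by the mail client still matches.
    match = GP_LINE.search(" ".join(line.split()))
    if match is None:
        return None
    ip = int(match["ip"], 16)
    base = int(match["base"], 16)
    size = int(match["size"], 16)
    offset = ip - base
    return match["lib"], offset, 0 <= offset < size


line = """lmtpd[3467] general protection ip:7f2e45ffdb2e sp:7fff4ee81968 error:0 in
libdb-4.6.so[7f2e45f2d000+136000]"""
lib, offset, inside = fault_offset(line)
print(lib, hex(offset), inside)
```

An offset that falls inside the mapping is at least consistent with the fault happening in Berkeley DB code rather than in Cyrus or glibc, though it does not say which side passed bad data.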

I think this comes from the new glibc. But it isn't breaking functionality for
now. It has been working stably for 4 hours already, and the system load has
gone down to 0.0 again. No deadlocking...

I saw that new ebuilds for Berkeley DB 4.7.25 are in Portage. Has anybody
used this version already? Maybe Cyrus will perform better when compiled
against this lib?

Or does it look more like a kernel problem? How do I check that? In htop the
process that gets 100% CPU load isn't in D state, so it's not a real deadlock;
it just goes into some loop somewhere, I suppose under some condition that
doesn't always happen.
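
The D-state check described above can also be done without htop by reading the state field from /proc/<pid>/stat on Linux. The Python sketch below only parses such a line; the helper names and the sample line are hypothetical and not taken from the affected server.

```python
def process_state(stat_line):
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so split after the last ')' before reading the state.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def describe(state):
    """Very rough reading of the two states discussed in this thread."""
    if state == "D":
        return "uninterruptible sleep (blocked inside the kernel, usually on I/O)"
    if state == "R":
        return "running or runnable (consistent with a busy loop in user space)"
    return "other state"


# Hypothetical, truncated stat line for illustration only.
sample = "3467 (lmtpd) R 1509 1509 1509 0 -1 4202496"
state = process_state(sample)
print(state, describe(state))
```

On a live system the line would come from open(f"/proc/{pid}/stat").read(). A process pinned at 100% CPU in state R points toward a loop in user space rather than a kernel-level block.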

Yesterday I also tried downgrading to cyrus-imapd 2.3.12_p2, but got the same
behavior, so I updated back to 2.3.13.

I will report if I find anything more.
--
Teresa




Re: Cyrus Deadblocking

2008-12-15 Thread Teresa
On Mon, 15 Dec 2008 08:40:31 -0500, Adam Tauno Williams
awill...@whitemice.org wrote:
  Since yesterday my production mail server has been behaving strangely,
  and I haven't been able to find the reason for 2 days already.

 Does dmesg show anything odd?

Not really. It's quiet; only these strange messages have also shown up in
these 2 days, and they always look like this:

lmtpd[3467] general protection ip:7f2e45ffdb2e sp:7fff4ee81968 error:0 in
libdb-4.6.so[7f2e45f2d000+136000]

But it has still been working for 4 hours here, even though this message
appears once in my dmesg now.


 If you attach to a hung process with strace -p {pid} what does it look
 like?

It is running now, and as it's a production server, I will leave it running
as long as it keeps going by itself :)
But next time, and I am fairly sure it will happen again, I will do that
strace.

I have been running this mail server since 2003. Cyrus had some nasty problems
with Berkeley DB a few times in the past (2.2.x versions). But for the last 2
years I never really had a problem with it.

 Did you restart the services after the update?
  I am on a Gentoo box.

Gentoo is OK; I got myself into trouble because I run the unstable ~amd64
keyword (x86_64). I know that, so I have to manage my problems myself. Gentoo
has nothing to do with it.
But you're right, something in the system is not right at the moment.


  If Cyrus goes into the blocking state,

 Sounds to me like Cyrus is not the only thing getting hung, which
 indicates the problem probably lies elsewhere.

Actually only Cyrus processes are in trouble. ipurge does its job, for
example, but never gets back to the prompt.

I have now updated my kernel to 2.6.27.9. It is currently running 2.6.27.7.
If it crashes again, I will reboot into the new kernel and see whether
anything changes.

--
Teresa




Re: Cyrus Deadblocking

2008-12-15 Thread Henrique de Moraes Holschuh
On Mon, 15 Dec 2008, Patrick Boutilier wrote:
 Henrique de Moraes Holschuh wrote:
  On Mon, 15 Dec 2008, Teresa wrote:
  Since yesterday my production mail server has been behaving strangely, and I
  haven't been able to find the reason for 2 days already.
  
  Which kernel?  If it is Linux 2.6.27.8 or 2.6.27.9, try downgrading...
 
 What is wrong with those kernels?

The lack of this:
http://lkml.indiana.edu/hypermail/linux/kernel/0812.1/00998.html

Thread here:
http://lkml.indiana.edu/hypermail/linux/kernel/0812.1/index.html#6

2.6.27.10 will be much better.  I am not touching 2.6.27 at all until it is
out (still running 2.6.26.y here), but probably I won't consider it until it
reaches 2.6.27.12 or thereabouts.

No, I am not sure it would break Cyrus IMAP.  But one doesn't let Cyrus IMAP
anywhere near a kernel that is suspect of less than pristine shared memory
or mmap behaviour, it would be the same as walking around with dead fish in
a basket, near a bunch of starved cats.

-- 
  One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie. -- The Silicon Valley Tarot
  Henrique Holschuh



Cyrus Deadblocking

2008-12-15 Thread Teresa
Hi all,

Since yesterday my production mail server has been behaving strangely, and I
haven't been able to find the reason for 2 days already.

I haven't changed anything lately, but yesterday Cyrus started driving the CPU
load up to 100%, and after some time it stops responding. Mostly it is an
lmtpd process, but it also happens to pop3d or to the imapd process itself.

What helps is a restart. There is nothing in the log that would show the
problem.

All sendmail processes are blocked as well, since they use smmapd for local
delivery.

About 2 weeks ago I updated glibc to version 2.9, but it worked fine for those
two weeks. I am on a Gentoo box.

[ebuild   R   ] sys-libs/db-4.6.21_p3-r1  USE="-bootstrap -doc -java -nocxx -tcl -test" 0 kB
[ebuild   R   ] sys-libs/glibc-2.9_p20081201  USE="gd (multilib) nls -debug -glibc-compat20 -glibc-omitfp (-hardened) -profile (-selinux) -vanilla" 0 kB
[ebuild   R   ] net-mail/cyrus-imapd-2.3.13  USE="idled pam ssl tcpd (-drac) -kerberos -kolab -nntp -replication -snmp" 0 kB

I use squatter, sieve, IMAP and POP3. ipurge is started from cron from time to
time. If Cyrus goes into the blocking state and I start ipurge manually, I get
the messages about how many messages will be deleted, how many were scanned,
etc., but the process itself never returns to the prompt.

I understand that this description doesn't provide much useful information
that would help identify the problem. If I could identify it, I would probably
have fixed it already. But this is my last hope; maybe someone can point out
what is wrong?

-- 
Teresa




Re: Cyrus Deadblocking

2008-12-15 Thread Patrick Boutilier
Henrique de Moraes Holschuh wrote:
 On Mon, 15 Dec 2008, Teresa wrote:
 Since yesterday my production mail server has been behaving strangely, and I
 haven't been able to find the reason for 2 days already.
 
 Which kernel?  If it is Linux 2.6.27.8 or 2.6.27.9, try downgrading...
 

What is wrong with those kernels?



Re: Cyrus Deadblocking

2008-12-15 Thread Henrique de Moraes Holschuh
On Mon, 15 Dec 2008, Teresa wrote:
 Since yesterday my production mail server has been behaving strangely, and I
 haven't been able to find the reason for 2 days already.

Which kernel?  If it is Linux 2.6.27.8 or 2.6.27.9, try downgrading...

-- 
  One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie. -- The Silicon Valley Tarot
  Henrique Holschuh



Re: Cyrus Deadblocking

2008-12-15 Thread Adam Tauno Williams
 Since yesterday my production mail server has been behaving strangely, and I
 haven't been able to find the reason for 2 days already.

Does dmesg show anything odd?

 I haven't changed anything lately, but yesterday Cyrus started driving the CPU
 load up to 100%, and after some time it stops responding. Mostly it is an
 lmtpd process, but it also happens to pop3d or to the imapd process itself.
 What helps is a restart. There is nothing in the log that would show the
 problem.

If you attach to a hung process with strace -p {pid} what does it look
like?

 All sendmail processes are blocked as well, since they use smmapd for local
 delivery.
 About 2 weeks ago I updated glibc to version 2.9, but it worked fine for those
 two weeks.

Did you restart the services after the update?

 I am on a Gentoo box.

Oh.

 [ebuild   R   ] sys-libs/db-4.6.21_p3-r1  USE="-bootstrap -doc -java -nocxx -tcl -test" 0 kB
 [ebuild   R   ] sys-libs/glibc-2.9_p20081201  USE="gd (multilib) nls -debug -glibc-compat20 -glibc-omitfp (-hardened) -profile (-selinux) -vanilla" 0 kB
 [ebuild   R   ] net-mail/cyrus-imapd-2.3.13  USE="idled pam ssl tcpd (-drac) -kerberos -kolab -nntp -replication -snmp" 0 kB

I assume the above is some Gentoo thing.

 I use squatter, sieve, IMAP and POP3. ipurge is started from cron from time
 to time. If Cyrus goes into the blocking state,

Sounds to me like Cyrus is not the only thing getting hung, which
indicates the problem probably lies elsewhere.

 and I start ipurge manually, I get
 the messages about how many messages will be deleted, how many were scanned,
 etc., but the process itself never returns to the prompt.
 I understand that this description doesn't provide much useful information
 that would help identify the problem. If I could identify it, I would probably
 have fixed it already. But this is my last hope; maybe someone can point out
 what is wrong?

