>On 10/02/2013 05:38 PM, Daniel Schütze wrote:
>> I've been noticing an oddity on 3.1.7: on occasion the main
>> dbmail-imapd takes up 100% cpu of a core and the imapd becomes
>> unresponsive to clients (with the clients outlook/thunderbird themselves
>> becoming unresponsive) for periods of anything between a few seconds
>> and a minute.

>The main thread must be too busy to handle connections then, for some
>reason. The 100% CPU clearly indicates a spin-lock.

>> When this happens all other processes on the server (mainly httpd/mysql)
>> are reported idle.  This situation normally resolves itself and when the
>> cpu load drops the child processes become active again, but it is
>> obviously irritating.

>So it recovers. That is 'good', but of course it shouldn't happen to
>begin with.

>> I've noticed our max number of connections for mysql 5.1.x is being hit;
>> originally it was 150 and I've just bumped it to 300 as there are
>> occasional errors in the dbmail.err file stating "Thread is having
>> trouble obtaining a database connection. Try [0]" but this is not always
>> the case when the thread starts to use 100% cpu.  I have, however, not
>> managed to catch the listing of connections at a point when mysql reports
>> it is full (for instance right now it reports a mere 16 dbmail
>> connections).

>150 is absurd, at least from dbmail's perspective. Please remember that
>any dbmail process will never use more database connections than
>specified in dbmail.conf:max_db_connections

I haven't actually specified this in my dbmail.conf so I assume I'm using
the default (10).

I also haven't ever actually seen an unusually high number of connections
(in show processlist), even while the 100% cpu usage is going on.  Mysql
reports that the connection count has on occasion been that high, but I've
yet to see more than about 20 with my own eyes.

>This value also implies the number of worker threads and should only be
>increased if dbmail indicates it can't obtain a connection. In your case
>apparently it sometimes fails to obtain one, but only on the first try;
>it will retry every second until it does, but will only log a message
>every 5 tries. So if you see those warnings, that is a clear indication
>something needs to be done.
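
To make the retry behaviour described above concrete, here is a minimal Python sketch. It is not dbmail's actual code (dbmail is written in C); the names `acquire_with_retry`, `try_connect` and `log` are hypothetical, and only the one-second delay, the log-every-5-tries rule and the message text are taken from this thread.

```python
import time


def acquire_with_retry(try_connect, log, delay=1.0, log_every=5, sleep=time.sleep):
    """Retry until try_connect() returns a connection (hypothetical helper).

    try_connect: callable returning a connection object, or None on failure.
    log:         callable taking one message string.
    Returns (connection, number_of_failed_tries).
    """
    attempt = 0
    while True:
        conn = try_connect()
        if conn is not None:
            return conn, attempt
        # Log only on tries 0, 5, 10, ... so a short stall produces a single
        # "Try [0]" line, as seen in dbmail.err.
        if attempt % log_every == 0:
            log("Thread is having trouble obtaining a database connection. "
                "Try [%d]" % attempt)
        attempt += 1
        sleep(delay)  # wait before the next try
```

Under this reading, a lone "Try [0]" line would mean the wait lasted somewhere under five retries, while a following "Try [5]" would mean the thread had been waiting for roughly five seconds or more.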

>Either mysql is too busy or slow, so queries take too long. Of course
>some queries take longer than others. Especially IMAP SEARCH and IMAP
>SORT are prone to cause that. Also, backup scripts that generate table
>locks will cause mysql to put queries on hold. mysqladmin will show a
>rapid increase in the number of open connections and running queries
>waiting for the lock to release.
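
Since the full connection list is hard to catch by eye, one option is to capture `mysqladmin processlist` output periodically and summarise it afterwards. The sketch below is only an illustration under assumptions: it expects the usual pipe-delimited table with `User` and `State` columns, the function names are made up for this example, and matching "lock" in the state text is a crude heuristic rather than an exact list of MySQL states.

```python
from collections import Counter


def parse_processlist(text):
    """Parse a pipe-delimited processlist table into a list of dicts."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        # Skip the +----+ border lines and anything that is not a table row.
        if not (line.startswith("|") and line.endswith("|")):
            continue
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if header is None:
            header = cells  # first table row holds the column names
            continue
        if len(cells) > len(header):
            # The last column (Info) may itself contain '|': rejoin the extras.
            last = len(header) - 1
            cells = cells[:last] + [" | ".join(cells[last:])]
        rows.append(dict(zip(header, cells)))
    return rows


def summarize(rows, lock_markers=("lock",)):
    """Count connections per user and pick out rows that look lock-related."""
    per_user = Counter(row.get("User", "") for row in rows)
    waiting = [
        row for row in rows
        if any(marker in row.get("State", "").lower() for marker in lock_markers)
    ]
    return per_user, waiting
```

Feeding this a snapshot taken every few seconds (from cron, say) and keeping the ones where the per-user count jumps would show whether the dbmail user really approaches the connection limit, and which queries were waiting at the time.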

Our production server really doesn't do much other than run mysql and
dbmail-imapd; our backups are through replication servers and there are no
other files which need to be backed up.

I may be a little naive on this point, the server certainly doesn't "seem"
slow and is running on SSDs with enough of a pool to keep the indices
(5.6gig) in memory and a bit spare.  We will be upgrading to a much faster
machine with 64gig memory (up from 16gig) but this is because we wish to
retire a standby server rather than because we've noticed anything sluggish
on the production server.

I recently turned on the slow query log on mysql (set to log queries taking
5 seconds or more) and so far have only had 2 slow queries (5 and 8
seconds), both "out of hours" (the first during the monthly zfs scrub!).

>Or your dbmail services are just bad-ass busy. If you need to handle
>many concurrent imap clients a max_db_connections of 10 *may* be the
>bottleneck, but carefully test higher values. Too many busy connections
>to mysql may cause actual throughput in the database to decrease as
>lock-contention increases.

I imagine we are on the very small scale user-wise.  We have 60 accounts and
of those we have 15 or so desktops (mostly using thunderbird) constantly
connected to the server, and at least as many mobile devices again (mainly
BlackBerrys, but android phones and iphones too) checking the same
accounts.  The Thunderbird clients have generally been set to use a small
number of cached connections (although some have been set to 0) and are a
mixture of IDLE switched on and off.


>> Users are reporting they notice it when they are moving between folders
>> but given users do this all the time it can't be causing it in every
>> instance.
>> 
>> This is on our production machine, of course, but is there anything
>> straightforward I can do to try and diagnose what might be going on?  I'm
>> aware of the previous spinlock threads for earlier versions but I thought
>> that had been resolved.

>I thought so too, but having given this further thought, and reading
>more on libevent, I come to strongly suspect that the self-pipe wasn't
>the problem after all, while removing the self-pipe did decrease
>responsiveness. Time to bring it back and make it a compile-time or
>run-time option.
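
For anyone unfamiliar with the self-pipe mentioned above, this is a generic Python sketch of the technique on a Unix-like system. It is not dbmail's implementation (which is C on top of libevent); the class and function names are invented for the example. A worker thread writes one byte to a pipe, and the event loop, which is blocked in select(), wakes up because the read end became readable.

```python
import os
import selectors
import threading


class SelfPipe:
    """Wake a blocking select() loop from another thread via a pipe."""

    def __init__(self):
        self.read_fd, self.write_fd = os.pipe()
        os.set_blocking(self.read_fd, False)
        os.set_blocking(self.write_fd, False)

    def notify(self):
        """Called from a worker thread: make the read end readable."""
        try:
            os.write(self.write_fd, b"x")
        except BlockingIOError:
            pass  # pipe is full, so a wake-up is already pending

    def drain(self):
        """Called from the event loop: empty the pipe, return bytes read."""
        count = 0
        while True:
            try:
                chunk = os.read(self.read_fd, 4096)
            except BlockingIOError:
                break  # nothing left to read
            if not chunk:
                break  # write end closed
            count += len(chunk)
        return count

    def close(self):
        os.close(self.read_fd)
        os.close(self.write_fd)


def wait_for_wakeup(pipe, timeout):
    """Block in select() until notified or timed out; return wake-ups seen."""
    sel = selectors.DefaultSelector()
    sel.register(pipe.read_fd, selectors.EVENT_READ)
    try:
        events = sel.select(timeout)
    finally:
        sel.close()
    return pipe.drain() if events else 0


def notify_from_thread(pipe):
    """Send one wake-up from a separate thread and wait for it to finish."""
    worker = threading.Thread(target=pipe.notify)
    worker.start()
    worker.join()
```

The drain step matters in any design like this: if the loop never empties the pipe, select() keeps reporting it readable and the loop runs flat out. That is a general property of the technique, not a claim about what dbmail is doing here.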

>You might try the latest snapshot which tries to resolve the spin-lock
>based on strace logs provided by Thomas.

>http://git.dbmail.eu/paul/dbmail/snapshot/dbmail-06787d0866b008b9cd0bc99db005abfc7b1cc260.tar.gz

I actually was using the version just before the "IMAP: always disable
read-events during deferred cleanup" change and it hasn't helped with the
memory leak or spinlock.

I've patched in this one additional change and am running it as of now.

>> Also we are clearly seeing the memory leaks in 3.1.7 (700meg memory used
>> after most of a working day) but I assume they are not related to cpu
>> usage or the process locking up?

>Correct. I'm aware of those, but I can't reproduce the leaks myself. I'm
>running all kinds of scenarios against imapd running under valgrind:
>zero leaks. I've also cleaned up a lot using the coverity scanner. It
>found some possible leaks, and they have been cleaned up.

>Most of the coverity work has been done on the master branch though. So
>I may have missed a backport to dbmail_3_1, but the same leaks are
>probably still present in the master branch as well. Need to rethink
>some architectural decisions.

I can only wish you good luck!  Many thanks.

_______________________________________________
DBmail mailing list
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail
