I'm in the final stages of testing a Courier-IMAP installation, and
am troubleshooting a problem that I ran into while doing some load
testing.
Courier-IMAP is running on FreeBSD 4.11, built from ports, with
Postfix as the MTA and MySQL as the user database, using
courier-authdaemond + MySQL to authenticate. I'm stress-testing POP3
connections with Rabid from the Postal test suite, running on a
different FreeBSD machine. Rabid has worked fine as a way of
stress-testing for me in the past, against qpopper. (However I
recognize that might simply mean that Rabid is behaving in some
non-compliant way that qpopper tolerates better.)
I initially increased the settings to MAXDAEMONS=100, MAXPERIP=50, so
in principle I should not have seen problems; I have tried setting
Rabid to varying numbers of threads from 20 to 40, but always fewer
than the 50 that Courier should in theory permit.
If I use Rabid from a different server to start opening up a bunch of
POP3 connections and reading them, and I just let it run, then after "a
while" (see below) all fresh connections to the Courier POP server
start failing.
When I was using 20 threads, 20 POP transactions/minute and
MAXPERIP=50 it would go for 2-3 minutes, then the Rabid log would start
spraying "Server error:" and "Can't establish connection" messages.
With 30 threads, 30 POP transactions/minute, it gets through one minute
and then starts erroring out. With 20 threads, and 100 POP
transactions/minute, it doesn't make it to one minute.
I am pretty sure now that the problem is in the interaction between
Rabid and Courier POP - Rabid seems to think it is closing connections
with "quit\r\n", disconnecting, and opening a fresh connection, but
Courier thinks a large fraction of the earlier "closed" connections are
still connected.
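For comparison, here is a minimal sketch of what a "clean" POP3 close looks like from the client side: send QUIT, wait for the server's +OK reply, then wait for the server to close its end before closing ours. This is not what Rabid actually does (I have not checked its source); the function names are hypothetical, and the sketch assumes a plain, unencrypted POP3 session on an already-connected socket.

```python
import socket


def read_reply(sock_file):
    """Read one CRLF-terminated reply line from the server."""
    line = sock_file.readline()
    if not line:
        raise ConnectionError("server closed the connection unexpectedly")
    return line.rstrip(b"\r\n")


def clean_pop3_session(sock, user, password):
    """Run greeting, USER, PASS, STAT, QUIT on a connected socket.

    Returns (replies, saw_eof). saw_eof is True when the server closed
    its side after answering QUIT, i.e. both ends agree the session is over.
    Hypothetical helper for illustration only.
    """
    f = sock.makefile("rb")
    replies = [read_reply(f)]  # server greeting
    for cmd in ("USER " + user, "PASS " + password, "STAT", "QUIT"):
        sock.sendall(cmd.encode("ascii") + b"\r\n")
        reply = read_reply(f)
        if not reply.startswith(b"+OK"):
            raise RuntimeError("command %r failed: %r" % (cmd.split()[0], reply))
        replies.append(reply)
    # After +OK to QUIT the server should close the TCP connection;
    # block until we see EOF (or the socket timeout fires) before closing.
    saw_eof = f.read(1) == b""
    f.close()
    sock.close()
    return replies, saw_eof
```

To try it against a real server, one would pass in something like socket.create_connection((host, 110), timeout=30), with host, user and password filled in for a test account. If saw_eof comes back True for every session while the server still logs TIMEOUT disconnections, that would point away from the client.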
I got to this conclusion by kicking up MAXPERIP to 200, to rule out
the per-IP connection limit as a factor. When I did that, I saw that Rabid
would see ongoing success at the 20 POP transactions/minute rate; but
at the 5 minute mark, Courier started logging a number of
disconnections with TIMEOUT, when Rabid apparently believed all those
connections were long since closed.
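As a rough sanity check of this theory, the numbers can be worked through with a small calculation. This is only a sketch under loud assumptions: that every finished transaction leaves one connection counted against MAXPERIP until an idle timeout of 5 minutes (inferred from the TIMEOUT log entries, not from the configuration), and that Rabid's genuinely active connections are ignored. The function name and the lingering_fraction parameter are made up for illustration.

```python
def minutes_until_cap(max_per_ip, rate_per_min, lingering_fraction=1.0,
                      timeout_min=5.0):
    """Estimate minutes until lingering connections reach the per-IP cap.

    Assumes each transaction leaves `lingering_fraction` of a connection
    open on the server until `timeout_min` minutes have passed.
    Returns None if the steady-state number of lingering connections
    stays below the cap, so the cap is never reached.
    """
    linger_rate = rate_per_min * lingering_fraction
    steady_state = linger_rate * timeout_min
    if steady_state < max_per_ip:
        return None
    return max_per_ip / linger_rate


# The cases described above (all assuming every connection lingers):
for cap, rate in [(50, 20), (50, 30), (50, 100), (200, 20)]:
    print(cap, rate, minutes_until_cap(cap, rate))
```

Under those assumptions, MAXPERIP=50 would be reached after about 2.5 minutes at 20 transactions/minute, under 2 minutes at 30, and within half a minute at 100, while MAXPERIP=200 at 20 transactions/minute would never be reached because the timeouts start draining connections first. That is at least in line with what I saw, though since only a fraction of connections seem to linger the match may be partly coincidence.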
I haven't got as far as tcpdumping the connections yet; I thought I'd
first check if this is known, and if it's a Rabid bug, then what others
have done in terms of performance and stress-testing? I would also be
happy to take suggestions on a better way to stress-test the POP and
IMAP servers. Rabid is a pretty clumsy tool but it's what I'm aware
of.
-- Clifton
--
Clifton Royston -- [EMAIL PROTECTED]
Tiki Technologies Lead Programmer/Software Architect
"I'm gonna tell my son to grow up pretty as the grass is green
And whip-smart as the English Channel's wide..."
-- 'Whip-Smart', Liz Phair
_______________________________________________
courier-users mailing list